3 Writing PyRanges to disk

PyRanges can be written to several formats: csv, gtf, gff3 and bigwig.

If no path argument is given, the string representation of the data is returned. (It may be very large.) If a path is given, the data is written to that file instead; with chain=True (described below) the method then returns the object itself, so the write methods can easily be inserted into method call chains.

import pyranges as pr
# import pyranges_db as pr_db
gr = pr.data.chipseq()
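gr.to_gtf("chipseq.gtf")  # path given: writes the file chipseq.gtf
print(gr[:10000].to_gtf())  # no path: the GTF text is returned as a string
print(gr[:10000].to_gff3())  # same, in GFF3 format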
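
For orientation, GTF is a nine-column, tab-separated format with 1-based, inclusive coordinates. The sketch below is not how pyranges builds its output; it only shows, using the standard library, what one such line could look like for a 0-based, half-open interval. The helper name `gtf_line`, the `"."` placeholders and the attribute string are illustrative assumptions.

```python
def gtf_line(chrom, start, end, strand, source=".", feature=".", attributes=""):
    """Format one 0-based, half-open interval as a 9-column GTF line (hypothetical helper)."""
    fields = [
        chrom,
        source,
        feature,
        str(start + 1),  # GTF starts are 1-based
        str(end),        # a half-open end equals the 1-based inclusive end
        ".",             # score (placeholder)
        strand,
        ".",             # frame (placeholder)
        attributes,
    ]
    return "\t".join(fields)


line = gtf_line("chr1", 100, 200, "+", feature="peak", attributes='name "example";')
print(line)
```

Comparing such a line with the first lines printed by to_gtf() is a quick way to check which columns of your own data end up in which GTF field.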

The to_csv method takes the arguments header and sep.

gr.to_csv("chipseq.csv", sep="\t", header=True)
print(gr[:10000].to_csv(sep="|", header=True))

All to-methods except to_bigwig take an argument chain, which can be set to True if you want the method to return the PyRanges it wrote. This is useful for storing the intermediate results of long call chains.

pr.data.f1().to_csv("bla", chain=True).merge()...
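
The return convention can be illustrated with a small stand-alone sketch. The class `Intervals` below is a hypothetical stand-in written with the standard library, not the pyranges implementation; it only mimics the behaviour described above (a string when no path is given, the object itself when chain=True). That nothing is returned when a path is given without chain=True is an assumption of this sketch.

```python
import csv
import io
import os
import tempfile


class Intervals:
    """Hypothetical stand-in for a PyRanges-like object; not the pyranges implementation."""

    def __init__(self, rows):
        self.rows = rows  # list of (chromosome, start, end) tuples

    def to_csv(self, path=None, sep=",", header=True, chain=False):
        buffer = io.StringIO()
        writer = csv.writer(buffer, delimiter=sep, lineterminator="\n")
        if header:
            writer.writerow(["Chromosome", "Start", "End"])
        writer.writerows(self.rows)
        text = buffer.getvalue()

        if path is None:
            return text  # no path: hand back the string representation
        with open(path, "w") as handle:
            handle.write(text)
        # Assumption of this sketch: nothing is returned without chain=True
        return self if chain else None

    def count(self):
        return len(self.rows)


iv = Intervals([("chr1", 0, 10), ("chr1", 20, 30)])
as_text = iv.to_csv(sep="|")  # string, because no path was given

out_path = os.path.join(tempfile.mkdtemp(), "intervals.csv")
n_rows = iv.to_csv(out_path, chain=True).count()  # write, then continue the chain
```

Returning the object only on request keeps interactive sessions from echoing a large object after every write, while still allowing a write step in the middle of a chain.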

The pyranges library can also create bigwigs, but this requires the pybigwig library, which is not installed by default. Install it with conda install -c bioconda pybigwig or pip install pybigwig.

The bigwig writer needs to know the chromosome sizes. You can fetch these using the pyranges database functions.

chromsizes = pr.data.chromsizes() # hg19, can also use pr_db.ucsc.chromosome_sizes("hg19")
gr.to_bigwig("chipseq.bw", chromsizes)
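
If the chromosome sizes are available as a two-column, tab-separated file (the common UCSC chrom.sizes layout), they can be read into a plain dictionary. This is a minimal sketch using only the standard library; the function name `read_chromsizes` and the sample file are assumptions for illustration. Whether a plain dict can be passed to to_bigwig in place of the object returned by pr.data.chromsizes() should be checked against the installed pyranges version.

```python
import os
import tempfile


def read_chromsizes(path):
    """Read a two-column chrom.sizes file into {chromosome: length} (hypothetical helper)."""
    sizes = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            chrom, length = line.split("\t")[:2]
            sizes[chrom] = int(length)
    return sizes


# Small sample file written only for this demonstration
sample_path = os.path.join(tempfile.mkdtemp(), "sample.chrom.sizes")
with open(sample_path, "w") as handle:
    handle.write("chr1\t249250621\nchr2\t243199373\n")

sizes = read_chromsizes(sample_path)
print(sizes)
```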

To create a bigwig from an arbitrary value column, use the value_col argument.

If you want to write one bigwig for each strand, you need to do it manually.

gr["+"].to_bigwig("chipseq_plus.bw", chromsizes)
gr["-"].to_bigwig("chipseq_minus.bw", chromsizes)

to_bigwig also takes an argument divide_by, which expects another PyRanges. Using divide_by creates a log2-normalized bigwig.
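
As a reminder of what a log2 ratio expresses, the sketch below computes it for per-bin coverage values with the standard library. It is illustrative arithmetic only: the function name `log2_ratio` and the bin values are made up, and how pyranges treats bins where either track has zero coverage is not covered here (this sketch simply yields `None` for them).

```python
import math


def log2_ratio(signal, background):
    """Return log2(signal / background) per bin, or None where either value is zero."""
    ratios = []
    for s, b in zip(signal, background):
        if s > 0 and b > 0:
            ratios.append(math.log2(s / b))
        else:
            ratios.append(None)  # assumption: undefined ratios are left empty
    return ratios


ratios = log2_ratio([8, 2, 0, 3], [2, 2, 5, 0])
print(ratios)  # [2.0, 0.0, None, None]
```

A value of 2.0 means the signal is four times the background in that bin, 0.0 means the two are equal, and negative values mean the background is higher.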