3 Writing PyRanges to disk
The PyRanges can be written to several formats, namely csv, gtf, gff3 and bigwig.
If no path-argument is given, the string representation of the data is returned. (It may potentially be very large.) If a path is given, the return value is the object itself. This way the write methods can easily be inserted in method call chains.
import pyranges as pr
# import pyranges_db as pr_db
= pr.data.chipseq()
gr "chipseq.gtf")
gr.to_gtf(print(gr[:10000].to_gtf())
print(gr[:10000].to_gff3())
The to_csv method takes the arguments header and sep.
"chipseq.csv", sep="\t", header=True)
gr.to_csv(print(gr[:10000].to_csv(sep="|", header=True))
All to-methods except to_bigwig
takes an argument chain which can be set to
True if you want the method to return the PyRanges it wrote. It is useful for
storing the intermediate results of long call chains.
pr.data().f1().to_csv("bla", chain=True).merge()...
The pyranges library can also create bigwigs, but it needs the library pybigwig
which is not installed by default. Use conda install -c bioconda pybigwig
or
pip install pybigwig
to install it.
The bigwig writer needs to know the chromosome sizes. You can fetch these using the pyranges database functions.
= pr.data.chromsizes() # hg19, can also use pr_db.ucsc.chromosome_sizes("hg19")
chromsizes "chipseq.bw", chromsizes) gr.to_bigwig(
To create a bigwig from an arbitrary value column, use the value_col
argument.
If you want to write one bigwig for each strand, you need to do it manually.
"+"].to_bigwig("chipseq_plus.bw", chromsizes)
gr["-"].to_bigwig("chipseq_minus.bw", chromsizes) gr[
to_bigwig also takes a flag divide_by
which takes another PyRanges. Using
divide_by
creates a log2-normalized bigwig.