10 Turning Ranges into RLEs
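Ranges can be turned into dicts of run-length encodings (RLEs) with the coverage function. In the printed output, each Runs entry is the length of a stretch of consecutive positions and the Values entry below it is the number of intervals covering that stretch: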
import pyranges as pr
gr = pr.load_dataset("aorta")
print(gr)
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome | Start | End | Name | Score | Strand |
## | (category) | (int64) | (int64) | (object) | (int64) | (category) |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1 | 9916 | 10115 | H3K27me3 | 5 | - |
## | chr1 | 9939 | 10138 | H3K27me3 | 7 | + |
## | chr1 | 9951 | 10150 | H3K27me3 | 8 | - |
## | ... | ... | ... | ... | ... | ... |
## | chr1 | 10241 | 10440 | H3K27me3 | 6 | - |
## | chr1 | 10246 | 10445 | H3K27me3 | 4 | + |
## | chr1 | 110246 | 110445 | H3K27me3 | 1 | + |
## +--------------+-----------+-----------+------------+-----------+--------------+
## PyRanges object has 11 sequences from 1 chromosomes.
print(gr.coverage())
## chr1
## +--------+--------+------+------+-----+------+---------+------+-------+-----+---------+-------+
## | Runs | 9916 | 23 | 12 | 2 | 25 | ... | 80 | 114 | 5 | 99801 | 199 |
## |--------+--------+------+------+-----+------+---------+------+-------+-----+---------+-------|
## | Values | 0 | 1 | 2 | 3 | 4 | ... | 3 | 2 | 1 | 0 | 1 |
## +--------+--------+------+------+-----+------+---------+------+-------+-----+---------+-------+
## Rle of length 110445 containing 22 elements
## Unstranded PyRles object with 1 chromosome.
print(gr.coverage(stranded=True))
## chr1 +
## --
## +--------+--------+------+------+-------+------+------+------+-------+---------+-------+
## | Runs | 9939 | 14 | 71 | 114 | 14 | 71 | 23 | 199 | 99801 | 199 |
## |--------+--------+------+------+-------+------+------+------+-------+---------+-------|
## | Values | 0 | 1 | 2 | 3 | 2 | 1 | 0 | 1 | 0 | 1 |
## +--------+--------+------+------+-------+------+------+------+-------+---------+-------+
## Rle of length 110445 containing 10 elements
##
## chr1 -
## --
## +--------+--------+------+------+------+-------+---------+------+------+------+------+-------+
## | Runs | 9916 | 35 | 27 | 23 | 114 | ... | 27 | 23 | 41 | 85 | 114 |
## |--------+--------+------+------+------+-------+---------+------+------+------+------+-------|
## | Values | 0 | 1 | 2 | 3 | 4 | ... | 3 | 2 | 1 | 2 | 1 |
## +--------+--------+------+------+------+-------+---------+------+------+------+------+-------+
## Rle of length 10440 containing 12 elements
## PyRles object with 2 chromosomes/strand pairs.
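To make the Runs and Values rows above concrete, here is a minimal pure-Python sketch of how interval coverage can be turned into a run-length encoding. It is not how pyranges implements coverage; the function name `coverage_rle` is hypothetical, and the input is only the three plus-strand rows visible in the printed table, so the result is a subset of the full `chr1 +` output.

```python
from collections import defaultdict


def coverage_rle(intervals):
    """Return (runs, values) for a list of half-open (start, end) intervals.

    Hypothetical helper for illustration only; not part of pyranges.
    """
    # Record +1 where an interval starts and -1 where it ends.
    deltas = defaultdict(int)
    for start, end in intervals:
        deltas[start] += 1
        deltas[end] -= 1

    runs, values = [], []
    prev, depth = 0, 0
    for pos in sorted(deltas):
        if pos > prev:
            if values and values[-1] == depth:
                # Same depth as the previous run: extend it.
                runs[-1] += pos - prev
            else:
                runs.append(pos - prev)
                values.append(depth)
        prev = pos
        depth += deltas[pos]
    return runs, values


# The three plus-strand rows shown in the printed PyRanges above.
plus_rows = [(9939, 10138), (10246, 10445), (110246, 110445)]
runs, values = coverage_rle(plus_rows)
print(runs)    # [9939, 199, 108, 199, 99801, 199]
print(values)  # [0, 1, 0, 1, 0, 1]
```

The leading run of 9939 zeros and the trailing runs of 99801 and 199 agree with the `chr1 +` output above; the runs in between differ because the rows hidden behind `...` are not included here. The runs sum to 110445, the reported Rle length.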
You can also create coverage for any numeric column in your PyRanges. Each position then holds the sum of that column over the intervals covering it, rather than a count:
print(gr.coverage("Score"))
## chr1
## +--------+--------+------+------+-----+------+---------+------+-------+-----+---------+-------+
## | Runs | 9916 | 23 | 12 | 2 | 25 | ... | 80 | 114 | 5 | 99801 | 199 |
## |--------+--------+------+------+-----+------+---------+------+-------+-----+---------+-------|
## | Values | 0 | 5 | 12 | 20 | 25 | ... | 11 | 10 | 4 | 0 | 1 |
## +--------+--------+------+------+-----+------+---------+------+-------+-----+---------+-------+
## Rle of length 110445 containing 22 elements
## Unstranded PyRles object with 1 chromosome.
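The Score coverage above can be read as a weighted sum: each interval contributes its Score, instead of 1, to every position it covers. The following is a minimal sketch under that reading, not the pyranges implementation. The helper `weighted_coverage_rle` is a hypothetical name, and only the first three rows of the printed table are used, so later runs differ from the full output.

```python
from collections import defaultdict


def weighted_coverage_rle(rows):
    """Return (runs, values) for half-open (start, end, weight) rows.

    Hypothetical helper for illustration only; not part of pyranges.
    """
    # Add the weight where an interval starts, subtract it where it ends.
    deltas = defaultdict(int)
    for start, end, weight in rows:
        deltas[start] += weight
        deltas[end] -= weight

    runs, values = [], []
    prev, total = 0, 0
    for pos in sorted(deltas):
        if pos > prev:
            if values and values[-1] == total:
                # Same summed weight as the previous run: extend it.
                runs[-1] += pos - prev
            else:
                runs.append(pos - prev)
                values.append(total)
        prev = pos
        total += deltas[pos]
    return runs, values


# (Start, End, Score) for the first three rows of the printed PyRanges.
rows = [(9916, 10115, 5), (9939, 10138, 7), (9951, 10150, 8)]
score_runs, score_values = weighted_coverage_rle(rows)
print(score_runs)    # [9916, 23, 12, 164, 23, 12]
print(score_values)  # [0, 5, 12, 20, 15, 8]
```

The first three runs (9916, 23, 12) and the values 0, 5, 12 and 20 match the start of the Score output above: 5, then 5 + 7, then 5 + 7 + 8.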