25 Turning Ranges into RLEs
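A PyRanges can be turned into a dict of run length encodings (RLEs), one per chromosome or chromosome/strand pair, with the to_rle method. By default the values are the coverage, that is, the number of intervals overlapping each position: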
import pyranges as pr
gr = pr.data.aorta()
print(gr)
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome | Start | End | Name | Score | Strand |
## | (category) | (int32) | (int32) | (object) | (int64) | (category) |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1 | 9939 | 10138 | H3K27me3 | 7 | + |
## | chr1 | 9953 | 10152 | H3K27me3 | 5 | + |
## | chr1 | 10024 | 10223 | H3K27me3 | 1 | + |
## | chr1 | 10246 | 10445 | H3K27me3 | 4 | + |
## | ... | ... | ... | ... | ... | ... |
## | chr1 | 9978 | 10177 | H3K27me3 | 7 | - |
## | chr1 | 10001 | 10200 | H3K27me3 | 5 | - |
## | chr1 | 10127 | 10326 | H3K27me3 | 1 | - |
## | chr1 | 10241 | 10440 | H3K27me3 | 6 | - |
## +--------------+-----------+-----------+------------+-----------+--------------+
## Stranded PyRanges object has 11 rows and 6 columns from 1 chromosomes.
## For printing, the PyRanges was sorted on Chromosome and Strand.
print(gr.to_rle())
## chr1 +
## --
## +--------+--------+------+------+-------+-------+---------+-------+
## | Runs | 9939 | 14 | 71 | ... | 199 | 99801 | 199 |
## |--------+--------+------+------+-------+-------+---------+-------|
## | Values | 0.0 | 1.0 | 2.0 | ... | 1.0 | 0.0 | 1.0 |
## +--------+--------+------+------+-------+-------+---------+-------+
## Rle of length 110445 containing 10 elements (avg. length 11044.5)
##
## chr1 -
## --
## +--------+--------+------+------+------+-------+------+------+------+-------+
## | Runs | 9916 | 35 | 27 | 23 | ... | 23 | 41 | 85 | 114 |
## |--------+--------+------+------+------+-------+------+------+------+-------|
## | Values | 0.0 | 1.0 | 2.0 | 3.0 | ... | 2.0 | 1.0 | 2.0 | 1.0 |
## +--------+--------+------+------+------+-------+------+------+------+-------+
## Rle of length 10440 containing 12 elements (avg. length 870.0)
## RleDict object with 2 chromosomes/strand pairs.
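To make the Runs and Values rows concrete, here is a minimal pure-Python sketch of how coverage can be run length encoded. It is not how PyRanges implements to_rle; the function name coverage_rle is hypothetical, and the input is only the three + strand intervals visible at the top of the printed table (the elided rows are left out, so only the leading runs can be compared with the output above).

```python
def coverage_rle(intervals):
    """Return (runs, values) for the coverage of half-open [start, end) intervals.

    Hypothetical helper for illustration only; not part of PyRanges.
    """
    # Record +1 where an interval starts and -1 where it ends.
    deltas = {}
    for start, end in intervals:
        deltas[start] = deltas.get(start, 0) + 1
        deltas[end] = deltas.get(end, 0) - 1

    runs, values = [], []
    position, depth = 0, 0
    for boundary in sorted(deltas):
        if boundary > position:
            if values and values[-1] == depth:
                # Same coverage as the previous run: extend it.
                runs[-1] += boundary - position
            else:
                runs.append(boundary - position)
                values.append(float(depth))
            position = boundary
        depth += deltas[boundary]
    return runs, values


# The three + strand intervals shown at the top of the table above.
intervals = [(9939, 10138), (9953, 10152), (10024, 10223)]
runs, values = coverage_rle(intervals)
print(runs)    # [9939, 14, 71, 114, 14, 71]
print(values)  # [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
```

The first three runs (9939, 14, 71 with values 0.0, 1.0, 2.0) coincide with the start of the chr1 + output above. The runs sum to 10223, the end of the last interval used.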
print(gr.to_rle(strand=True))
## chr1 +
## --
## +--------+--------+------+------+-------+-------+---------+-------+
## | Runs | 9939 | 14 | 71 | ... | 199 | 99801 | 199 |
## |--------+--------+------+------+-------+-------+---------+-------|
## | Values | 0.0 | 1.0 | 2.0 | ... | 1.0 | 0.0 | 1.0 |
## +--------+--------+------+------+-------+-------+---------+-------+
## Rle of length 110445 containing 10 elements (avg. length 11044.5)
##
## chr1 -
## --
## +--------+--------+------+------+------+-------+------+------+------+-------+
## | Runs | 9916 | 35 | 27 | 23 | ... | 23 | 41 | 85 | 114 |
## |--------+--------+------+------+------+-------+------+------+------+-------|
## | Values | 0.0 | 1.0 | 2.0 | 3.0 | ... | 2.0 | 1.0 | 2.0 | 1.0 |
## +--------+--------+------+------+------+-------+------+------+------+-------+
## Rle of length 10440 containing 12 elements (avg. length 870.0)
## RleDict object with 2 chromosomes/strand pairs.
print(gr.to_rle(strand=True, rpm=True))
## chr1 +
## --
## +--------+--------+-------------------+-------+---------+-------------------+
## | Runs | 9939 | 14 | ... | 99801 | 199 |
## |--------+--------+-------------------+-------+---------+-------------------|
## | Values | 0.0 | 90909.09090909091 | ... | 0.0 | 90909.09090909091 |
## +--------+--------+-------------------+-------+---------+-------------------+
## Rle of length 110445 containing 10 elements (avg. length 11044.5)
##
## chr1 -
## --
## +--------+--------+-------+-------------------+
## | Runs | 9916 | ... | 114 |
## |--------+--------+-------+-------------------|
## | Values | 0.0 | ... | 90909.09090909091 |
## +--------+--------+-------+-------------------+
## Rle of length 10440 containing 12 elements (avg. length 870.0)
## RleDict object with 2 chromosomes/strand pairs.
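To get the RPM-normalized (reads per million) coverage, use the rpm argument, as in the last call above. Judging from that output, each coverage value is multiplied by one million divided by the number of intervals: with 11 rows, a coverage of 1 becomes 90909.09.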
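The arithmetic behind the RPM values can be sketched as follows. The scaling rule (one million divided by the total number of intervals) is an assumption inferred from the printed output, not taken from the PyRanges source, and rpm_scale is a hypothetical helper name.

```python
def rpm_scale(values, n_intervals):
    """Scale raw coverage values to 'per million intervals'.

    Assumption: the factor is 1e6 / n_intervals, as the printed output suggests.
    """
    factor = 1_000_000 / n_intervals
    return [value * factor for value in values]


# Raw coverage values 0, 1 and 2 in a PyRanges with 11 intervals.
scaled = rpm_scale([0.0, 1.0, 2.0], n_intervals=11)
print(scaled)  # a coverage of 1 maps to roughly 90909.09
```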
You can also create coverage for any numeric column in your PyRanges by passing its name; each interval then contributes its value in that column instead of 1:
print(gr.to_rle("Score"))
## chr1 +
## --
## +--------+--------+------+------+-------+-------+---------+-------+
## | Runs | 9939 | 14 | 71 | ... | 199 | 99801 | 199 |
## |--------+--------+------+------+-------+-------+---------+-------|
## | Values | 0.0 | 7.0 | 12.0 | ... | 4.0 | 0.0 | 1.0 |
## +--------+--------+------+------+-------+-------+---------+-------+
## Rle of length 110445 containing 10 elements (avg. length 11044.5)
##
## chr1 -
## --
## +--------+--------+------+------+------+-------+------+------+------+-------+
## | Runs | 9916 | 35 | 27 | 23 | ... | 23 | 41 | 85 | 114 |
## |--------+--------+------+------+------+-------+------+------+------+-------|
## | Values | 0.0 | 5.0 | 13.0 | 20.0 | ... | 6.0 | 1.0 | 7.0 | 6.0 |
## +--------+--------+------+------+------+-------+------+------+------+-------+
## Rle of length 10440 containing 12 elements (avg. length 870.0)
## RleDict object with 2 chromosomes/strand pairs.
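The value-weighted coverage above can be illustrated with a small pure-Python sketch in which each interval adds its Score rather than 1. As before, this is not the PyRanges implementation; weighted_coverage_rle is a hypothetical name, and only the three + strand rows visible at the top of the printed table are used.

```python
def weighted_coverage_rle(intervals):
    """Return (runs, values) where each (start, end, weight) interval adds its weight.

    Hypothetical helper for illustration only; not part of PyRanges.
    """
    # Add the weight where an interval starts and subtract it where it ends.
    deltas = {}
    for start, end, weight in intervals:
        deltas[start] = deltas.get(start, 0) + weight
        deltas[end] = deltas.get(end, 0) - weight

    runs, values = [], []
    position, total = 0, 0
    for boundary in sorted(deltas):
        if boundary > position:
            if values and values[-1] == total:
                # Same summed weight as the previous run: extend it.
                runs[-1] += boundary - position
            else:
                runs.append(boundary - position)
                values.append(float(total))
            position = boundary
        total += deltas[boundary]
    return runs, values


# (Start, End, Score) for the three + strand rows shown at the top of the table.
scored = [(9939, 10138, 7), (9953, 10152, 5), (10024, 10223, 1)]
score_runs, score_values = weighted_coverage_rle(scored)
print(score_runs)    # [9939, 14, 71, 114, 14, 71]
print(score_values)  # [0.0, 7.0, 12.0, 13.0, 6.0, 1.0]
```

The leading values 0.0, 7.0 and 12.0 agree with the start of the chr1 + output of gr.to_rle("Score"): where the first two intervals overlap, their scores 7 and 5 add up to 12.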