5 Methods for manipulating single PyRanges

There are several methods for manipulating the contents of a PyRanges.

cluster is a set operation that merges overlapping intervals, returning the union of all the intervals in the PyRanges:

import pyranges as pr

f1 = pr.load_dataset("f1")
print(f1.cluster())
## +--------------+-----------+-----------+
## | Chromosome   |     Start |       End |
## | (category)   |   (int64) |   (int64) |
## |--------------+-----------+-----------|
## | chr1         |         3 |         7 |
## | chr1         |         8 |         9 |
## +--------------+-----------+-----------+
## PyRanges object has 2 sequences from 1 chromosomes.
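To make the union semantics concrete, here is a minimal pure-Python sketch of merging overlapping intervals. It is not how PyRanges implements cluster; the function name `merge_intervals` and the tuple representation are assumptions made for illustration, and the input is assumed to mirror the Start/End values of the f1 dataset. The sketch leaves intervals that merely touch unmerged, which is a choice made here and not a statement about PyRanges.

```python
def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals (illustrative helper)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            # Overlaps the previous merged interval: extend its end if needed.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap: start a new merged interval.
            merged.append((start, end))
    return merged


# Assumed to mirror the f1 intervals on chr1.
f1_like = [(3, 6), (5, 7), (8, 9)]
print(merge_intervals(f1_like))  # [(3, 7), (8, 9)]
```

The first two intervals overlap and collapse into 3-7, while 8-9 stays separate, matching the two rows printed above.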

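tssify finds the starts of the regions, taking the direction of transcription into account, so the start of a region on the reverse strand lies at its End coordinate. The -ify suffix makes clear that the method does not find actual TSSs (transcription start sites), which would require metadata indicating which intervals represent transcripts. The optional slack argument extends the resulting position by that many bases in each direction.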

f1.tssify()
print(f1.tssify(slack=5))
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome   |     Start |       End | Name       |     Score | Strand       |
## | (category)   |   (int64) |   (int64) | (object)   |   (int64) | (category)   |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1         |         0 |         9 | interval1  |         0 | +            |
## | chr1         |         2 |        13 | interval2  |         0 | -            |
## | chr1         |         3 |        14 | interval3  |         0 | +            |
## +--------------+-----------+-----------+------------+-----------+--------------+
## PyRanges object has 3 sequences from 1 chromosomes.
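The strand-aware logic can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the PyRanges implementation: the helper name `tss_positions` is hypothetical, the input is assumed to mirror the f1 dataset, and the choice to place the reverse-strand position at End (rather than End - 1) is inferred from the output printed above.

```python
def tss_positions(intervals, slack=0):
    """Return strand-aware start positions, optionally widened by `slack`.

    `intervals` is a list of (start, end, strand) tuples (illustrative format).
    """
    result = []
    for start, end, strand in intervals:
        # Forward strand: transcription begins at Start.
        # Reverse strand: transcription begins at End (assumption, see lead-in).
        pos = start if strand == "+" else end
        # Widen by `slack` on both sides; starts cannot go below 0.
        result.append((max(0, pos - slack), pos + 1 + slack, strand))
    return result


# Assumed to mirror the f1 intervals on chr1.
f1_like = [(3, 6, "+"), (5, 7, "-"), (8, 9, "+")]
print(tss_positions(f1_like, slack=5))
# [(0, 9, '+'), (2, 13, '-'), (3, 14, '+')]
```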

tesify finds the ends of the regions (taking direction of transcription into account).

f1.tesify()
print(f1.tesify(slack=5))
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome   |     Start |       End | Name       |     Score | Strand       |
## | (category)   |   (int64) |   (int64) | (object)   |   (int64) | (category)   |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1         |         1 |        12 | interval1  |         0 | +            |
## | chr1         |         2 |        13 | interval2  |         0 | -            |
## | chr1         |         4 |        15 | interval3  |         0 | +            |
## +--------------+-----------+-----------+------------+-----------+--------------+
## PyRanges object has 3 sequences from 1 chromosomes.

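slack extends every interval by the given number of bases in both directions: it is subtracted from each Start and added to each End. Starts that would become negative are set to 0.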

print(f1.slack(5))
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome   |     Start |       End | Name       |     Score | Strand       |
## | (category)   |   (int64) |   (int64) | (object)   |   (int64) | (category)   |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1         |         0 |        11 | interval1  |         0 | +            |
## | chr1         |         0 |        12 | interval2  |         0 | -            |
## | chr1         |         3 |        14 | interval3  |         0 | +            |
## +--------------+-----------+-----------+------------+-----------+--------------+
## PyRanges object has 3 sequences from 1 chromosomes.
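The arithmetic behind this output can be sketched in a few lines of plain Python. The helper name `extend_intervals` is hypothetical and the input is assumed to mirror the Start/End values of the f1 dataset; this is a sketch of the behaviour seen in the output, not the PyRanges implementation.

```python
def extend_intervals(intervals, slack):
    """Widen each (start, end) interval by `slack` on both sides.

    Starts are clipped at 0, since coordinates cannot be negative.
    """
    return [(max(0, start - slack), end + slack) for start, end in intervals]


# Assumed to mirror the f1 intervals on chr1.
f1_like = [(3, 6), (5, 7), (8, 9)]
print(extend_intervals(f1_like, 5))  # [(0, 11), (0, 12), (3, 14)]
```

The first two starts would be -2 and 0 before clipping, which is why both appear as 0 in the table above.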