14 Intersecting Ranges
PyRanges objects can be intersected with other PyRanges to find the subset of the genome that is contained in both. The regular intersect method finds the intersection of all combinations of ranges:
import pyranges as pr
gr = pr.data.aorta()
gr2 = pr.data.aorta2()
print(gr.intersect(gr2))
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome | Start | End | Name | Score | Strand |
## | (category) | (int32) | (int32) | (object) | (int64) | (category) |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1 | 9988 | 10138 | H3K27me3 | 7 | + |
## | chr1 | 10073 | 10138 | H3K27me3 | 7 | + |
## | chr1 | 10079 | 10138 | H3K27me3 | 7 | + |
## | chr1 | 10082 | 10138 | H3K27me3 | 7 | + |
## | ... | ... | ... | ... | ... | ... |
## | chr1 | 10241 | 10278 | H3K27me3 | 6 | - |
## | chr1 | 10241 | 10281 | H3K27me3 | 6 | - |
## | chr1 | 10241 | 10348 | H3K27me3 | 6 | - |
## | chr1 | 10280 | 10440 | H3K27me3 | 6 | - |
## +--------------+-----------+-----------+------------+-----------+--------------+
## Stranded PyRanges object has 49 rows and 6 columns from 1 chromosomes.
## For printing, the PyRanges was sorted on Chromosome and Strand.
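To make "all combinations of ranges" concrete, the following is a minimal pure-Python sketch of pairwise intersection. It is not how PyRanges implements intersect; the helper name pairwise_intersect and the toy intervals are hypothetical, and intervals are assumed to be half-open (start, end) tuples on a single chromosome.

```python
def pairwise_intersect(a, b):
    """Return the overlap of every pair (x, y) with x in a and y in b.

    Hypothetical helper for illustration only; intervals are assumed
    to be half-open (start, end) tuples on one chromosome.
    """
    out = []
    for a_start, a_end in a:
        for b_start, b_end in b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only non-empty overlaps
                out.append((start, end))
    return out


a = [(1, 10), (20, 30)]
b = [(5, 8), (7, 25)]
print(pairwise_intersect(a, b))  # [(5, 8), (7, 10), (20, 25)]
```

Note that (1, 10) contributes two rows because it overlaps two intervals in b, which is why the result above can contain many near-identical rows.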
The set_intersect method merges the intervals in each PyRanges before finding the intersection:
print(gr.set_intersect(gr2))
## +--------------+-----------+-----------+
## | Chromosome | Start | End |
## | (category) | (int32) | (int32) |
## |--------------+-----------+-----------|
## | chr1 | 9988 | 10445 |
## +--------------+-----------+-----------+
## Unstranded PyRanges object has 1 rows and 3 columns from 1 chromosomes.
## For printing, the PyRanges was sorted on Chromosome.
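The merge-then-intersect idea can be sketched in plain Python as follows. This is only an illustration of the concept, not the PyRanges implementation; the helpers merge and set_intersect are hypothetical, intervals are assumed to be half-open (start, end) tuples on one chromosome, and treating touching intervals as mergeable is an assumption of this sketch.

```python
def merge(intervals):
    """Collapse overlapping (or touching) intervals into disjoint ones."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def set_intersect(a, b):
    """Intersect two interval sets after merging each of them."""
    a, b = merge(a), merge(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start, end = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


a = [(1, 10), (20, 30)]
b = [(5, 8), (7, 25)]
print(set_intersect(a, b))  # [(5, 10), (20, 25)]
```

Because both inputs are merged first, the result is a set of disjoint regions and per-interval metadata cannot be carried along, which is consistent with the output above having only coordinate columns.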
Both methods also take a strandedness option, which can be "same", "opposite" or False/None:
print(gr.set_intersect(gr2, strandedness="opposite"))
## +--------------+-----------+-----------+--------------+
## | Chromosome | Start | End | Strand |
## | (category) | (int32) | (int32) | (category) |
## |--------------+-----------+-----------+--------------|
## | chr1 | 9988 | 10223 | + |
## | chr1 | 10246 | 10348 | + |
## | chr1 | 10073 | 10272 | - |
## | chr1 | 10280 | 10440 | - |
## +--------------+-----------+-----------+--------------+
## Stranded PyRanges object has 4 rows and 4 columns from 1 chromosomes.
## For printing, the PyRanges was sorted on Chromosome and Strand.
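As a rough illustration of what the strandedness option controls, the sketch below filters interval pairs by strand before intersecting them. The helpers strands_match and stranded_intersect are hypothetical and the data is made up; reporting the strand of the first interval in the result is an assumption of this sketch.

```python
def strands_match(s1, s2, strandedness):
    """Decide whether two strands may be compared under the given mode."""
    if strandedness == "same":
        return s1 == s2
    if strandedness == "opposite":
        return s1 != s2
    return True  # False/None: strand is ignored


def stranded_intersect(a, b, strandedness=None):
    """Pairwise intersection of (start, end, strand) tuples."""
    out = []
    for a_start, a_end, a_strand in a:
        for b_start, b_end, b_strand in b:
            if not strands_match(a_strand, b_strand, strandedness):
                continue
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:
                out.append((start, end, a_strand))
    return out


a = [(1, 10, "+"), (20, 30, "-")]
b = [(5, 8, "+"), (7, 25, "-")]
print(stranded_intersect(a, b, "same"))      # [(5, 8, '+'), (20, 25, '-')]
print(stranded_intersect(a, b, "opposite"))  # [(7, 10, '+')]
print(stranded_intersect(a, b))              # [(5, 8, '+'), (7, 10, '+'), (20, 25, '-')]
```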
The intersect method also takes a how argument, which currently accepts the options "containment", "first" or "last". "containment" returns only the intervals in self that lie completely within an interval in other, while "first" and "last" return the first and last overlap, respectively.
f1 = pr.data.f1()
print(f1)
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome | Start | End | Name | Score | Strand |
## | (category) | (int32) | (int32) | (object) | (int64) | (category) |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1 | 3 | 6 | interval1 | 0 | + |
## | chr1 | 8 | 9 | interval3 | 0 | + |
## | chr1 | 5 | 7 | interval2 | 0 | - |
## +--------------+-----------+-----------+------------+-----------+--------------+
## Stranded PyRanges object has 3 rows and 6 columns from 1 chromosomes.
## For printing, the PyRanges was sorted on Chromosome and Strand.
f2 = pr.data.f2()
print(f2)
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome | Start | End | Name | Score | Strand |
## | (category) | (int32) | (int32) | (object) | (int64) | (category) |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1 | 1 | 2 | a | 0 | + |
## | chr1 | 6 | 7 | b | 0 | - |
## +--------------+-----------+-----------+------------+-----------+--------------+
## Stranded PyRanges object has 2 rows and 6 columns from 1 chromosomes.
## For printing, the PyRanges was sorted on Chromosome and Strand.
result = f2.intersect(f1, how="containment")
print(result)
## +--------------+-----------+-----------+------------+-----------+--------------+
## | Chromosome | Start | End | Name | Score | Strand |
## | (category) | (int32) | (int32) | (object) | (int64) | (category) |
## |--------------+-----------+-----------+------------+-----------+--------------|
## | chr1 | 6 | 7 | b | 0 | - |
## +--------------+-----------+-----------+------------+-----------+--------------+
## Stranded PyRanges object has 1 rows and 6 columns from 1 chromosomes.
## For printing, the PyRanges was sorted on Chromosome and Strand.
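The containment rule can be checked by hand with a small pure-Python sketch. The helper contained_in is hypothetical and is not part of PyRanges; the coordinates are copied from f1 and f2 above, strand is ignored for simplicity, and intervals are assumed to be half-open (start, end) tuples.

```python
def contained_in(self_intervals, other_intervals):
    """Keep intervals from self that lie fully inside some interval in other."""
    return [
        (start, end)
        for start, end in self_intervals
        if any(o_start <= start and end <= o_end
               for o_start, o_end in other_intervals)
    ]


f1_coords = [(3, 6), (8, 9), (5, 7)]  # interval1, interval3, interval2
f2_coords = [(1, 2), (6, 7)]          # a, b

print(contained_in(f2_coords, f1_coords))  # [(6, 7)]
```

Only b at (6, 7) survives because it sits inside interval2 at (5, 7), while a at (1, 2) overlaps nothing in f1.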