portion, a Python library providing data structure and operations for intervals.
The
portionlibrary (formerly distributed as
python-intervals) provides data structure and operations for intervals in Python 3.6+.
Latest release: -
portion: 2.2.0 on 2021-09-14 (documentation, changes). -
python-intervals: 1.10.0 on 2019-09-26 (documentation, changes).
Note that
python-intervalswill no longer receive updates since it has been replaced by
portion.
You can use
pipto install it, as usual:
pip install portion. This will install the latest available version from PyPI. Pre-releases are available from the master branch on GitHub and can be installed with
pip install git+https://github.com/AlexandreDecan/portion. Note that
portionis also available on conda-forge.
You can install
portionand its development environment using
pip install -e .[test]at the root of this repository. This automatically installs pytest (for the test suites) and black (for code formatting).
Assuming this library is imported using
import portion as P, intervals can be easily created using one of the following helpers:
>>> P.open(1, 2) (1,2) >>> P.closed(1, 2) [1,2] >>> P.openclosed(1, 2) (1,2] >>> P.closedopen(1, 2) [1,2) >>> P.singleton(1) [1] >>> P.empty() ()
The bounds of an interval can be any arbitrary values, as long as they are comparable:
>>> P.closed(1.2, 2.4) [1.2,2.4] >>> P.closed('a', 'z') ['a','z'] >>> import datetime >>> P.closed(datetime.date(2011, 3, 15), datetime.date(2013, 10, 10)) [datetime.date(2011, 3, 15),datetime.date(2013, 10, 10)]
Infinite and semi-infinite intervals are supported using
P.infand
-P.infas upper or lower bounds. These two objects support comparison with any other object. When infinities are used as a lower or upper bound, the corresponding boundary is automatically converted to an open one.
>>> P.inf > 'a', P.inf > 0, P.inf > True (True, True, True) >>> P.openclosed(-P.inf, 0) (-inf,0] >>> P.closed(-P.inf, P.inf) # Automatically converted to an open interval (-inf,+inf)
Empty intervals always resolve to
(P.inf, -P.inf), regardless of the provided bounds:
>>> P.empty() == P.open(P.inf, -P.inf) True >>> P.closed(4, 3) == P.open(P.inf, -P.inf) True >>> P.openclosed('a', 'a') == P.open(P.inf, -P.inf) True
Intervals created with this library are
Intervalinstances. An
Intervalinstance is a disjunction of atomic intervals each representing a single interval (e.g.
[1,2]). Intervals can be iterated to access the underlying atomic intervals, sorted by their lower and upper bounds.
>>> list(P.open(10, 11) | P.closed(0, 1) | P.closed(20, 21)) [[0,1], (10,11), [20,21]]
Nested intervals can also be retrieved with a position or a slice:
>>> (P.open(10, 11) | P.closed(0, 1) | P.closed(20, 21))[0] [0,1] >>> (P.open(10, 11) | P.closed(0, 1) | P.closed(20, 21))[-2] (10,11) >>> (P.open(10, 11) | P.closed(0, 1) | P.closed(20, 21))[:2] [0,1] | (10,11)
For convenience, intervals are automatically simplified:
>>> P.closed(0, 2) | P.closed(2, 4) [0,4] >>> P.closed(1, 2) | P.closed(3, 4) | P.closed(2, 3) [1,4] >>> P.empty() | P.closed(0, 1) [0,1] >>> P.closed(1, 2) | P.closed(2, 3) | P.closed(4, 5) [1,3] | [4,5]
Note that simplification of discrete intervals is not supported by
portion(but it can be simulated though, see #24). For example, combining
[0,1]with
[2,3]will not result in
[0,3]even if there is no integer between
1and
2.
An
Intervaldefines the following properties:
i.emptyis
Trueif and only if the interval is empty. ```python >>> P.closed(0, 1).empty False >>> P.closed(0, 0).empty False >>> P.openclosed(0, 0).empty True >>> P.empty().empty True
- `i.atomic` is `True` if and only if the interval is a disjunction of a single (possibly empty) interval. ```python >>> P.closed(0, 2).atomic True >>> (P.closed(0, 1) | P.closed(1, 2)).atomic True >>> (P.closed(0, 1) | P.closed(2, 3)).atomic False
i.enclosurerefers to the smallest atomic interval that includes the current one. ```python >>> (P.closed(0, 1) | P.open(2, 3)).enclosure [0,3)
The left and right boundaries, and the lower and upper bounds of an interval can be respectively accessed with its `left`, `right`, `lower` and `upper` attributes. The `left` and `right` bounds are either `P.CLOSED` or `P.OPEN`. By definition, `P.CLOSED == ~P.OPEN` and vice-versa.```python >> P.CLOSED, P.OPEN CLOSED, OPEN >>> x = P.closedopen(0, 1) >>> x.left, x.lower, x.upper, x.right (CLOSED, 0, 1, OPEN)
If the interval is not atomic, then
leftand
lowerrefer to the lower bound of its enclosure, while
rightand
upperrefer to the upper bound of its enclosure:
>>> x = P.open(0, 1) | P.closed(3, 4) >>> x.left, x.lower, x.upper, x.right (OPEN, 0, 4, CLOSED)
One can easily check for some interval properties based on the bounds of an interval:
>>> x = P.openclosed(-P.inf, 0) >>> # Check that interval is left/right closed >>> x.left == P.CLOSED, x.right == P.CLOSED (False, True) >>> # Check that interval is left/right bounded >>> x.lower == -P.inf, x.upper == P.inf (True, False) >>> # Check for singleton >>> x.lower == x.upper False
Intervalinstances support the following operations:
i.intersection(other)and
i & otherreturn the intersection of two intervals. ```python >>> P.closed(0, 2) & P.closed(1, 3) [1,2] >>> P.closed(0, 4) & P.open(2, 3) (2,3) >>> P.closed(0, 2) & P.closed(2, 3) [2] >>> P.closed(0, 2) & P.closed(3, 4) ()
- `i.union(other)` and `i | other` return the union of two intervals. ```python >>> P.closed(0, 1) | P.closed(1, 2) [0,2] >>> P.closed(0, 1) | P.closed(2, 3) [0,1] | [2,3]
i.complement(other)and
~ireturn the complement of the interval. ```python >>> ~P.closed(0, 1) (-inf,0) | (1,+inf) >>> ~(P.open(-P.inf, 0) | P.open(1, P.inf)) [0,1] >>> ~P.open(-P.inf, P.inf) ()
- `i.difference(other)` and `i - other` return the difference between `i` and `other`. ```python >>> P.closed(0,2) - P.closed(1,2) [0,1) >>> P.closed(0, 4) - P.closed(1, 2) [0,1) | (2,4]
i.contains(other)and
other in ihold if given item is contained in the interval. It supports intervals and arbitrary comparable values. ```python >>> 2 in P.closed(0, 2) True >>> 2 in P.open(0, 2) False >>> P.open(0, 1) in P.closed(0, 2) True
- `i.adjacent(other)` tests if the two intervals are adjacent, i.e., if they do not overlap and their union form a single atomic interval. While this definition corresponds to the usual notion of adjacency for atomic intervals, it has stronger requirements for non-atomic ones since it requires all underlying atomic intervals to be adjacent (i.e. that one interval fills the gaps between the atomic intervals of the other one). ```python >>> P.closed(0, 1).adjacent(P.openclosed(1, 2)) True >>> P.closed(0, 1).adjacent(P.closed(1, 2)) False >>> (P.closed(0, 1) | P.closed(2, 3)).adjacent(P.open(1, 2) | P.open(3, 4)) True >>> (P.closed(0, 1) | P.closed(2, 3)).adjacent(P.open(3, 4)) False >>> P.closed(0, 1).adjacent(P.open(1, 2) | P.open(3, 4)) False
i.overlaps(other)tests if there is an overlap between two intervals. ```python >>> P.closed(1, 2).overlaps(P.closed(2, 3)) True >>> P.closed(1, 2).overlaps(P.open(2, 3)) False
Finally, intervals are hashable as long as their bounds are hashable (and we have defined a hash value for `P.inf` and `-P.inf`).Comparison operators
Equality between intervals can be checked with the classical
==
operator:```python >>> P.closed(0, 2) == P.closed(0, 1) | P.closed(1, 2) True >>> P.closed(0, 2) == P.open(0, 2) False
Moreover, intervals are comparable using
>,
>=,
<or
<=. These comparison operators have a different behaviour than the usual ones. For instance,
a < bholds if
ais entirely on the left of the lower bound of
band
a > bholds if
ais entirely on the right of the upper bound of
b.
>>> P.closed(0, 1) < P.closed(2, 3) True >>> P.closed(0, 1) < P.closed(1, 2) False
Similarly,
a <= bholds if
ais entirely on the left of the upper bound of
b, and
a >= bholds if
ais entirely on the right of the lower bound of
b.
>>> P.closed(0, 1) <= P.closed(2, 3) True >>> P.closed(0, 2) <= P.closed(1, 3) True >>> P.closed(0, 3) <= P.closed(1, 2) False
Intervals can also be compared with single values. If
iis an interval and
xa value, then
x < iholds if
xis on the left of the lower bound of
iand
x <= iholds if
xis on the left of the upper bound of
i.
>>> 5 < P.closed(0, 10) False >>> 5 <= P.closed(0, 10) True >>> P.closed(0, 10) < 5 False >>> P.closed(0, 10) <= 5 True
This behaviour is similar to the one that could be obtained by first converting
xto a singleton interval (except for infinities since they resolve to empty intervals).
Note that all these semantics differ from classical comparison operators. As a consequence, some intervals are never comparable in the classical sense, as illustrated hereafter:
>>> P.closed(0, 4) <= P.closed(1, 2) or P.closed(0, 4) >= P.closed(1, 2) False >>> P.closed(0, 4) < P.closed(1, 2) or P.closed(0, 4) > P.closed(1, 2) False >>> P.empty() < P.empty() True >>> P.empty() < P.closed(0, 1) and P.empty() > P.closed(0, 1) True
Intervals are immutable but provide a
replacemethod to create a new interval based on the current one. This method accepts four optional parameters
left,
lower,
upper, and
right:
>>> i = P.closed(0, 2) >>> i.replace(P.OPEN, -1, 3, P.CLOSED) (-1,3] >>> i.replace(lower=1, right=P.OPEN) [1,2)
Functions can be passed instead of values. If a function is passed, it is called with the current corresponding value.
>>> P.closed(0, 2).replace(upper=lambda x: 2 * x) [0,4]
The provided function won't be called on infinities, unless
ignore_infis set to
False.
>>> i = P.closedopen(0, P.inf) >>> i.replace(upper=lambda x: 10) # No change, infinity is ignored [0,+inf) >>> i.replace(upper=lambda x: 10, ignore_inf=False) # Infinity is not ignored [0,10)
When
replaceis applied on an interval that is not atomic, it is extended and/or restricted such that its enclosure satisfies the new bounds.
>>> i = P.openclosed(0, 1) | P.closed(5, 10) >>> i.replace(P.CLOSED, -1, 8, P.OPEN) [-1,1] | [5,8) >>> i.replace(lower=4) (4,10]
To apply arbitrary transformations on the underlying atomic intervals, intervals expose an
applymethod that acts like
map. This method accepts a function that will be applied on each of the underlying atomic intervals to perform the desired transformation. The provided function is expected to return either an
Interval, or a 4-uple
(left, lower, upper, right).
>>> i = P.closed(2, 3) | P.open(4, 5) >>> # Increment bound values >>> i.apply(lambda x: (x.left, x.lower + 1, x.upper + 1, x.right)) [3,4] | (5,6) >>> # Invert bounds >>> i.apply(lambda x: (~x.left, x.lower, x.upper, ~x.right)) (2,3) | [4,5]
The
applymethod is very powerful when used in combination with
replace. Because the latter allows functions to be passed as parameters and ignores infinities by default, it can be conveniently used to transform (disjunction of) intervals in presence of infinities.
>>> i = P.openclosed(-P.inf, 0) | P.closed(3, 4) | P.closedopen(8, P.inf) >>> # Increment bound values >>> i.apply(lambda x: x.replace(upper=lambda v: v + 1)) (-inf,1] | [3,5] | [8,+inf) >>> # Intervals are still automatically simplified >>> i.apply(lambda x: x.replace(lower=lambda v: v * 2)) (-inf,0] | [16,+inf) >>> # Invert bounds >>> i.apply(lambda x: x.replace(left=lambda v: ~v, right=lambda v: ~v)) (-inf,0) | (3,4) | (8,+inf) >>> # Replace infinities with -10 and 10 >>> conv = lambda v: -10 if v == -P.inf else (10 if v == P.inf else v) >>> i.apply(lambda x: x.replace(lower=conv, upper=conv, ignore_inf=False)) (-10,0] | [3,4] | [8,10)
The
iteratefunction takes an interval, and returns a generator to iterate over the values of an interval. Obviously, as intervals are continuous, it is required to specify the
stepbetween consecutive values. The iteration then starts from the lower bound and ends on the upper one. Only values contained by the interval are returned this way.
>>> list(P.iterate(P.closed(0, 3), step=1)) [0, 1, 2, 3] >>> list(P.iterate(P.closed(0, 3), step=2)) [0, 2] >>> list(P.iterate(P.open(0, 3), step=2)) [2]
When an interval is not atomic,
iterateconsecutively iterates on all underlying atomic intervals, starting from each lower bound and ending on each upper one:
>>> list(P.iterate(P.singleton(0) | P.singleton(3) | P.singleton(5), step=2)) # Won't be [0] [0, 3, 5] >>> list(P.iterate(P.closed(0, 2) | P.closed(4, 6), step=3)) # Won't be [0, 6] [0, 4]
By default, the iteration always starts on the lower bound of each underlying atomic interval. The
baseparameter can be used to change this behaviour, by specifying how the initial value to start the iteration from must be computed. This parameter accepts a callable that is called with the lower bound of each underlying atomic interval, and that returns the initial value to start the iteration from. It can be helpful to deal with (semi-)infinite intervals, or to align the generated values of the iterator:
>>> # Align on integers >>> list(P.iterate(P.closed(0.3, 4.9), step=1, base=int)) [1, 2, 3, 4] >>> # Restrict values of a (semi-)infinite interval >>> list(P.iterate(P.openclosed(-P.inf, 2), step=1, base=lambda x: max(0, x))) [0, 1, 2]
The
baseparameter can be used to change how
iterateapplies on unions of atomic interval, by specifying a function that returns a single value, as illustrated next:
>>> base = lambda x: 0 >>> list(P.iterate(P.singleton(0) | P.singleton(3) | P.singleton(5), step=2, base=base)) [0] >>> list(P.iterate(P.closed(0, 2) | P.closed(4, 6), step=3, base=base)) [0, 6]
Notice that defining
basesuch that it returns a single value can be extremely inefficient in terms of performance when the intervals are "far apart" each other (i.e., when the gaps between atomic intervals are large).
Finally, iteration can be performed in reverse order by specifying
reverse=True.
>>> list(P.iterate(P.closed(0, 3), step=-1, reverse=True)) # Mind step=-1 [3, 2, 1, 0] >>> list(P.iterate(P.closed(0, 3), step=-2, reverse=True)) # Mind step=-2 [3, 1]
Again, this library does not make any assumption about the objects being used in an interval, as long as they are comparable. However, it is not always possible to provide a meaningful value for
step(e.g., what would be the step between two consecutive characters?). In these cases, a callable can be passed instead of a value. This callable will be called with the current value, and is expected to return the next possible value.
>>> list(P.iterate(P.closed('a', 'd'), step=lambda d: chr(ord(d) + 1))) ['a', 'b', 'c', 'd'] >>> # Since we reversed the order, we changed plus to minus in step. >>> list(P.iterate(P.closed('a', 'd'), step=lambda d: chr(ord(d) - 1), reverse=True)) ['d', 'c', 'b', 'a']
The library provides an
IntervalDictclass, a
dict-like data structure to store and query data along with intervals. Any value can be stored in such data structure as long as it supports equality.
>>> d = P.IntervalDict() >>> d[P.closed(0, 3)] = 'banana' >>> d[4] = 'apple' >>> d {[0,3]: 'banana', [4]: 'apple'}
When a value is defined for an interval that overlaps an existing one, it is automatically updated to take the new value into account:
>>> d[P.closed(2, 4)] = 'orange' >>> d {[0,2): 'banana', [2,4]: 'orange'}
An
IntervalDictcan be queried using single values or intervals. If a single value is used as a key, its behaviour corresponds to the one of a classical
dict:
>>> d[2] 'orange' >>> d[5] # Key does not exist Traceback (most recent call last): ... KeyError: 5 >>> d.get(5, default=0) 0
When the key is an interval, a new
IntervalDictcontaining the values for the specified key is returned:
>>> d[~P.empty()] # Get all values, similar to d.copy() {[0,2): 'banana', [2,4]: 'orange'} >>> d[P.closed(1, 3)] {[1,2): 'banana', [2,3]: 'orange'} >>> d[P.closed(-2, 1)] {[0,1]: 'banana'} >>> d[P.closed(-2, -1)] {}
By using
.get, a default value (defaulting to
None) can be specified. This value is used to "fill the gaps" if the queried interval is not completely covered by the
IntervalDict:
>>> d.get(P.closed(-2, 1), default='peach') {[-2,0): 'peach', [0,1]: 'banana'} >>> d.get(P.closed(-2, -1), default='peach') {[-2,-1]: 'peach'} >>> d.get(P.singleton(1), default='peach') # Key is covered, default is not used {[1]: 'banana'}
For convenience, an
IntervalDictprovides a way to look for specific data values. The
.findmethod always returns a (possibly empty)
Intervalinstance for which given value is defined:
>>> d.find('banana') [0,2) >>> d.find('orange') [2,4] >>> d.find('carrot') ()
The active domain of an
IntervalDictcan be retrieved with its
.domainmethod. This method always returns a single
Intervalinstance, where
.keysreturns a sorted view of disjoint intervals.
>>> d.domain() [0,4] >>> list(d.keys()) [[0,2), [2,4]] >>> list(d.values()) ['banana', 'orange'] >>> list(d.items()) [([0,2), 'banana'), ([2,4], 'orange')]
The
.keys,
.valuesand
.itemsmethods return exactly one element for each stored value (i.e., if two intervals share a value, they are merged into a disjunction), as illustrated next. See #44 to know how to obtained a sorted list of atomic intervals instead.
>>> d = P.IntervalDict() >>> d[P.closed(0, 1)] = d[P.closed(2, 3)] = 'peach' >>> list(d.items()) [([0,1] | [2,3], 'peach')]
Two
IntervalDictinstances can be combined together using the
.combinemethod. This method returns a new
IntervalDictwhose keys and values are taken from the two source
IntervalDict. Values corresponding to non-intersecting keys are simply copied, while values corresponding to intersecting keys are combined together using the provided function, as illustrated hereafter:
>>> d1 = P.IntervalDict({P.closed(0, 2): 'banana'}) >>> d2 = P.IntervalDict({P.closed(1, 3): 'orange'}) >>> concat = lambda x, y: x + '/' + y >>> d1.combine(d2, how=concat) {[0,1): 'banana', [1,2]: 'banana/orange', (2,3]: 'orange'}
Resulting keys always correspond to an outer join. Other joins can be easily simulated by querying the resulting
IntervalDictas follows:
>>> d = d1.combine(d2, how=concat) >>> d[d1.domain()] # Left join {[0,1): 'banana', [1,2]: 'banana/orange'} >>> d[d2.domain()] # Right join {[1,2]: 'banana/orange', (2,3]: 'orange'} >>> d[d1.domain() & d2.domain()] # Inner join {[1,2]: 'banana/orange'}
Finally, similarly to a
dict, an
IntervalDictalso supports
len,
inand
del, and defines
.clear,
.copy,
.update,
.pop,
.popitem, and
.setdefault. For convenience, one can export the content of an
IntervalDictto a classical Python
dictusing the
as_dictmethod.
Intervals can be exported to string, either using
repr(as illustrated above) or with the
to_stringfunction.
>>> P.to_string(P.closedopen(0, 1)) '[0,1)'
The way string representations are built can be easily parametrized using the various parameters supported by
to_string:
>>> params = { ... 'disj': ' or ', ... 'sep': ' - ', ... 'left_closed': '', ... 'left_open': '..', ... 'right_open': '..', ... 'pinf': '+oo', ... 'ninf': '-oo', ... 'conv': lambda v: '"{}"'.format(v), ... } >>> x = P.openclosed(0, 1) | P.closed(2, P.inf) >>> P.to_string(x, **params) '.."0" - "1"> orSimilarly, intervals can be created from a string using the
from_stringfunction. A conversion function (convparameter) has to be provided to convert a bound (as string) to a value.>>> P.from_string('[0, 1]', conv=int) == P.closed(0, 1) True >>> P.from_string('[1.2]', conv=float) == P.singleton(1.2) True >>> converter = lambda s: datetime.datetime.strptime(s, '%Y/%m/%d') >>> P.from_string('[2011/03/15, 2013/10/10]', conv=converter) [datetime.datetime(2011, 3, 15, 0, 0),datetime.datetime(2013, 10, 10, 0, 0)]Similarly to
to_string, functionfrom_stringcan be parametrized to deal with more elaborated inputs. Notice that asfrom_stringexpects regular expression patterns, we need to escape some characters.>>> s = '.."0" - "1"> or >> params = { ... 'disj': ' or ', ... 'sep': ' - ', ... 'left_closed': '', ... 'left_open': r'\.\.', # from_string expects regular expression patterns ... 'right_open': r'\.\.', # from_string expects regular expression patterns ... 'pinf': r'\+oo', # from_string expects regular expression patterns ... 'ninf': '-oo', ... 'conv': lambda v: int(v[1:-1]), ... } >>> P.from_string(s, **params) (0,1] | [2,+inf)When a bound contains a comma or has a representation that cannot be automatically parsed with
from_string, theboundparameter can be used to specify the regular expression that should be used to match its representation.>>> s = '[(0, 1), (2, 3)]' # Bounds are expected to be tuples >>> P.from_string(s, conv=eval, bound=r'\(.+?\)') [(0, 1),(2, 3)]Import & export intervals to Python built-in data types
Intervals can also be exported to a list of 4-uples with
to_data, e.g., to support JSON serialization.P.CLOSEDandP.OPENare represented by Boolean valuesTrue(inclusive) andFalse(exclusive).>>> P.to_data(P.openclosed(0, 2)) [(False, 0, 2, True)]The values used to represent positive and negative infinities can be specified with
pinfandninf. They default tofloat('inf')andfloat('-inf')respectively.>>> x = P.openclosed(0, 1) | P.closedopen(2, P.inf) >>> P.to_data(x) [(False, 0, 1, True), (True, 2, inf, False)]The function to convert bounds can be specified with the
convparameter.>>> x = P.closedopen(datetime.date(2011, 3, 15), datetime.date(2013, 10, 10)) >>> P.to_data(x, conv=lambda v: (v.year, v.month, v.day)) [(True, (2011, 3, 15), (2013, 10, 10), False)]Intervals can be imported from such a list of 4-tuples with
from_data. The same set of parameters can be used to specify how bounds and infinities are converted.>>> x = [(True, (2011, 3, 15), (2013, 10, 10), False)] >>> P.from_data(x, conv=lambda v: datetime.date(*v)) [datetime.date(2011, 3, 15),datetime.date(2013, 10, 10))Changelog
This library adheres to a semantic versioning scheme. See CHANGELOG.md for the list of changes.
Contributions
Contributions are very welcome! Feel free to report bugs or suggest new features using GitHub issues and/or pull requests.
License
Distributed under LGPLv3 - GNU Lesser General Public License, version 3.
You can refer to this library using:
@software{portion, author = {Decan, Alexandre}, title = {portion: Python data structure and operations for intervals}, url = {https://github.com/AlexandreDecan/portion}, }