skbio.stats.distance.DissimilarityMatrix.between

DissimilarityMatrix.between(from_, to_, allow_overlap=False)

Obtain the distances between two groups of IDs.

State: Experimental as of 0.5.5.

Parameters
  • from_ (Iterable of str) – The IDs to obtain distances from. Distances for all pairs of IDs in from_ and to_ will be obtained.

  • to_ (Iterable of str) – The IDs to obtain distances to. Distances for all pairs of IDs in from_ and to_ will be obtained.

  • allow_overlap (bool, optional) – If True, allow IDs to appear in both from_ and to_ (which in effect also collects the within-group distances for the shared IDs). Default is False.

Returns

One row per pair, with columns (i, j, value): the source ID (“i”), the target ID (“j”), and the distance between them (“value”).

Return type

pd.DataFrame

Raises

MissingIDError – If one or more of the specified IDs are not in the dissimilarity matrix.
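
The checks implied by the parameters and the Raises entry can be sketched in plain Python. This is an illustration only, not the library's code: check_groups and the stand-in MissingIDError class are hypothetical names, and the exception raised for overlapping IDs (ValueError here) is an assumption, since this section does not say what happens when IDs overlap and allow_overlap is False.

```python
class MissingIDError(Exception):
    """Hypothetical stand-in for the library's MissingIDError."""


def check_groups(ids, from_, to_, allow_overlap=False):
    """Validate two groups of IDs against the IDs of a matrix (sketch)."""
    from_set, to_set = set(from_), set(to_)

    # Every requested ID must exist in the matrix.
    missing = (from_set | to_set) - set(ids)
    if missing:
        raise MissingIDError("IDs not found: %s" % sorted(missing))

    # Shared IDs are only accepted when allow_overlap is True.
    # Assumption: the exception type used here is illustrative.
    overlap = from_set & to_set
    if overlap and not allow_overlap:
        raise ValueError("IDs in both groups: %s" % sorted(overlap))

    return from_set, to_set
```

Under these assumptions, check_groups(['A', 'B', 'C'], ['A'], ['A', 'B']) is rejected, while the same call with allow_overlap=True is accepted.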

Notes

The order of the returned rows is stable: requesting IDs ['a', 'b'] gives the same result as requesting ['b', 'a']. Rows are ordered by each ID's position in the .ids attribute of self, not by the order in which the IDs were passed.
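
The ordering rule in this note can be illustrated with a small stand-alone sketch. It is not the library's implementation: between_sketch is a hypothetical helper that works on a nested list and returns plain tuples rather than a pd.DataFrame, and it assumes each group is treated as a set of IDs.

```python
def between_sketch(matrix, ids, from_, to_):
    """Return (i, j, value) tuples ordered by position in ids (sketch)."""
    # Map each ID to its row/column position in the matrix.
    index = {id_: k for k, id_ in enumerate(ids)}

    # Sort both groups by matrix position, so the order in which the
    # caller listed the IDs has no effect on the result.
    rows = sorted(set(from_), key=index.__getitem__)
    cols = sorted(set(to_), key=index.__getitem__)

    return [(i, j, float(matrix[index[i]][index[j]]))
            for i in rows for j in cols]


matrix = [[0, 1, 2, 3, 4], [1, 0, 1, 2, 3], [2, 1, 0, 1, 2],
          [3, 2, 1, 0, 1], [4, 3, 2, 1, 0]]
ids = ['A', 'B', 'C', 'D', 'E']

# Both calls yield the same rows in the same order.
first = between_sketch(matrix, ids, ['A', 'B'], ['C', 'D', 'E'])
second = between_sketch(matrix, ids, ['B', 'A'], ['E', 'C', 'D'])
```

Sorting by position in ids, rather than alphabetically, keeps the output aligned with the matrix layout even when the IDs are not in lexical order.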

Example

>>> from skbio.stats.distance import DissimilarityMatrix
>>> dm = DissimilarityMatrix([[0, 1, 2, 3, 4], [1, 0, 1, 2, 3],
...                           [2, 1, 0, 1, 2], [3, 2, 1, 0, 1],
...                           [4, 3, 2, 1, 0]],
...                          ['A', 'B', 'C', 'D', 'E'])
>>> dm.between(['A', 'B'], ['C', 'D', 'E'])
   i  j  value
0  A  C    2.0
1  A  D    3.0
2  A  E    4.0
3  B  C    1.0
4  B  D    2.0
5  B  E    3.0