TabularMSA.iter_positions(reverse=False, ignore_metadata=False)
Iterate over positions (columns) in the MSA.
State: Experimental as of 0.4.1.
Parameters
reverse (bool, optional) – If True, iterate over positions in reverse order.
ignore_metadata (bool, optional) – If True, Sequence.metadata and Sequence.positional_metadata will not be included. This can significantly improve performance if metadata is not needed.
Yields
Sequence – Each position, in the order the positions are stored in the MSA.
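To make the idea of a "position" concrete, here is a minimal pure-Python sketch of column-wise iteration over aligned sequences. It is not scikit-bio's implementation: `iter_columns` is a hypothetical helper, the sequences are assumed to be plain strings of equal length, and no metadata is handled.

```python
def iter_columns(rows, reverse=False):
    """Yield each alignment column as a string (hypothetical helper)."""
    rows = list(rows)
    if not rows:
        return
    length = len(rows[0])
    # An alignment is rectangular: every row must have the same length.
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")
    indices = range(length - 1, -1, -1) if reverse else range(length)
    for i in indices:
        # Column i is the i-th character of every row, top to bottom.
        yield "".join(row[i] for row in rows)


columns = list(iter_columns(["ACG", "A-T"]))
reversed_columns = list(iter_columns(["ACG", "A-T"], reverse=True))
```

With the two rows `ACG` and `A-T`, the columns are `AA`, `C-`, and `GT`, which matches the sequence data shown in the Examples section of this page.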
Notes
Each position will be yielded as exactly a Sequence object, regardless of this MSA’s dtype. Sequence is used because a position is an artifact of multiple sequence alignment and is not a real biological sequence.
Each Sequence object will have its corresponding MSA positional metadata stored as metadata unless ignore_metadata is set to True.
Sequences will have their positional metadata concatenated using an outer join unless ignore_metadata is set to True. See Sequence.concat(how='outer') for details.
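The outer join described above can be illustrated with a small pure-Python sketch. It is only a sketch under stated assumptions: `outer_join_position` is a hypothetical helper, each sequence's positional metadata at one position is modelled as a plain dict, and missing values are assumed to be filled with NaN. Consult Sequence.concat for the library's actual behavior.

```python
import math


def outer_join_position(per_sequence_metadata):
    """Combine one position's metadata from every sequence (hypothetical helper).

    per_sequence_metadata: list of dicts, one per sequence in the MSA, mapping
    a positional metadata column name to that sequence's value at the position.
    """
    # Outer join: keep the union of column names, in first-seen order.
    names = []
    for metadata in per_sequence_metadata:
        for name in metadata:
            if name not in names:
                names.append(name)
    # Sequences lacking a column contribute NaN (an assumption of this sketch).
    return {
        name: [metadata.get(name, math.nan) for metadata in per_sequence_metadata]
        for name in names
    }


joined = outer_join_position([
    {"quality": 30},                # first sequence has only 'quality'
    {"quality": 12, "mask": True},  # second sequence also has 'mask'
])
```

Here `joined` keeps both the `quality` and `mask` columns, with one value per sequence. The first sequence has no `mask` entry, so its slot holds the NaN placeholder.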
Examples
Create an MSA with positional metadata:
>>> from skbio import DNA, TabularMSA
>>> sequences = [DNA('ACG'),
...              DNA('A-T')]
>>> msa = TabularMSA(sequences,
...                  positional_metadata={'prob': [3, 1, 2]})
Iterate over positions:
>>> for position in msa.iter_positions():
...     position
...     print()
Sequence
-------------
Metadata:
    'prob': 3
Stats:
    length: 2
-------------
0 AA

Sequence
-------------
Metadata:
    'prob': 1
Stats:
    length: 2
-------------
0 C-

Sequence
-------------
Metadata:
    'prob': 2
Stats:
    length: 2
-------------
0 GT
Note that MSA positional metadata is stored as metadata on each Sequence object.
Iterate over positions in reverse order:
>>> for position in msa.iter_positions(reverse=True):
...     position
...     print('')
Sequence
-------------
Metadata:
    'prob': 2
Stats:
    length: 2
-------------
0 GT

Sequence
-------------
Metadata:
    'prob': 1
Stats:
    length: 2
-------------
0 C-

Sequence
-------------
Metadata:
    'prob': 3
Stats:
    length: 2
-------------
0 AA