class skbio.sequence.GrammaredSequence(sequence, metadata=None, positional_metadata=None, interval_metadata=None, lowercase=False, validate=True)
Store sequence data conforming to a character set.
This is an abstract base class (ABC) and cannot be instantiated directly.
Subclass it to create grammared sequences with custom alphabets.
Raises
ValueError – If sequence characters are not in the character set [1].
References
[1] Cornish-Bowden A. Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984. Nucleic Acids Res. May 10, 1985; 13(9): 3021-3030.
Examples
Note in the example below that properties either need to be static or use skbio’s classproperty decorator.
>>> from skbio.sequence import GrammaredSequence
>>> from skbio.util import classproperty
>>> class CustomSequence(GrammaredSequence):
...     @classproperty
...     def degenerate_map(cls):
...         return {"X": set("AB")}
...
...     @classproperty
...     def definite_chars(cls):
...         return set("ABC")
...
...     @classproperty
...     def default_gap_char(cls):
...         return '-'
...
...     @classproperty
...     def gap_chars(cls):
...         return set('-.')
>>> seq = CustomSequence('ABABACAC')
>>> seq
CustomSequence
--------------------------
Stats:
length: 8
has gaps: False
has degenerates: False
has definites: True
--------------------------
0 ABABACAC
>>> seq = CustomSequence('XXXXXX')
>>> seq
CustomSequence
-------------------------
Stats:
length: 6
has gaps: False
has degenerates: True
has definites: False
-------------------------
0 XXXXXX
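The validation rule behind the ValueError can be illustrated without scikit-bio. The sketch below is a plain-Python stand-in, not the library's implementation. It assumes the valid character set is the union of the definite, degenerate, and gap characters defined in the example above. ALPHABET and validate are hypothetical names used only for this illustration.

```python
# Illustrative only: a plain-Python stand-in, not scikit-bio's implementation.
DEFINITE_CHARS = set("ABC")
DEGENERATE_MAP = {"X": set("AB")}
GAP_CHARS = set("-.")

# Assumption: the valid character set is the union of the three groups.
ALPHABET = DEFINITE_CHARS | set(DEGENERATE_MAP) | GAP_CHARS


def validate(sequence):
    """Return `sequence` unchanged, or raise ValueError on unknown characters."""
    invalid = sorted(set(sequence) - ALPHABET)
    if invalid:
        raise ValueError(f"Invalid characters in sequence: {invalid}")
    return sequence


validate("ABABACAC")  # passes: definite characters only
validate("XX-AB.")    # passes: degenerate and gap characters are valid too

try:
    validate("ABZ")   # "Z" is not in any of the three groups
except ValueError as err:
    message = str(err)
```

Building the alphabet once and checking set membership keeps validation independent of sequence order, which is why a single out-of-alphabet character is enough to reject the whole sequence.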
Attributes
Return valid characters.
Gap character to use when constructing a new gapped sequence.
Return definite characters.
Return degenerate characters.
Return mapping of degenerate to definite characters.
Return characters defined as gaps.
Return non-degenerate characters.
Set of observed characters in the sequence.
Array containing underlying sequence characters.
Built-ins
Returns truth value (truthiness) of sequence.
Determine if a subsequence is contained in this sequence.
Return a shallow copy of this sequence.
Return a deep copy of this sequence.
Determine if this sequence is equal to another.
Slice this sequence.
Iterate over positions in this sequence.
Return the number of characters in this sequence.
Determine if this sequence is not equal to another.
Iterate over positions in this sequence in reverse order.
Return sequence characters as a string.
Methods
Concatenate an iterable of
Count occurrences of a subsequence in this sequence.
Find positions containing definite characters in the sequence.
Return a new sequence with gap characters removed.
Find positions containing degenerate characters in the sequence.
Compute the distance to another sequence.
Yield all possible definite versions of the sequence.
Search the biological sequence for motifs.
Generate slices for patterns matched by a regular expression.
Compute frequencies of characters in the sequence.
Find positions containing gaps in the biological sequence.
Determine if sequence contains one or more definite characters.
Determine if sequence contains one or more degenerate characters.
Determine if the sequence contains one or more gap characters.
Determine if the object has interval metadata.
Determine if the object has metadata.
Determine if sequence contains one or more non-degenerate characters.
Determine if the object has positional metadata.
Find position where subsequence first occurs in the sequence.
Yield contiguous subsequences based on included.
Generate kmers of length k from this sequence.
Return counts of words of length k from this sequence.
Return a case-sensitive string representation of the sequence.
Return count of positions that are the same between two sequences.
Find positions that match with another sequence.
Return count of positions that differ between two sequences.
Find positions that do not match with another sequence.
Find positions containing non-degenerate characters in the sequence.
Create a new
Replace values in this sequence with a different character.
Return regular expression object that accounts for degenerate chars.
Write an instance of
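Two of the behaviors summarized above, yielding all definite versions of a sequence and removing gap characters, follow directly from the degenerate map and the gap character set. The sketch below shows the idea in plain Python using the character sets from the CustomSequence example. It is not scikit-bio's implementation, and expand_degenerates and degap are hypothetical helper names that operate on plain strings rather than sequence objects.

```python
from itertools import product

# Character sets taken from the CustomSequence example; illustrative only.
DEGENERATE_MAP = {"X": set("AB")}
GAP_CHARS = set("-.")


def expand_degenerates(sequence):
    """Yield every definite string that `sequence` could represent."""
    # Each position offers either its single character or, for a degenerate
    # character, the sorted definite characters it maps to.
    choices = [sorted(DEGENERATE_MAP.get(ch, ch)) for ch in sequence]
    for combination in product(*choices):
        yield "".join(combination)


def degap(sequence):
    """Return `sequence` with all gap characters removed."""
    return "".join(ch for ch in sequence if ch not in GAP_CHARS)


expansions = list(expand_degenerates("XCX"))
ungapped = degap("A-B.C")
```

The number of expansions is the product of the option counts at each position, so a sequence with many degenerate characters can expand to a very large number of definite sequences. Yielding them lazily avoids building that whole list unless the caller asks for it.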