DNA.kmer_frequencies(k, overlap=True, relative=False)
Return counts of words of length k from this sequence.
State: Stable as of 0.4.0.
Parameters
k (int) – The word length.
overlap (bool, optional) – Defines whether the kmers should be overlapping or not.
relative (bool, optional) – If True, return the relative frequency of each kmer instead of its count.
Returns
dict – Frequencies of words of length k contained in this sequence.
Raises
ValueError – If k is less than 1.
Examples
>>> from pprint import pprint
>>> from skbio import Sequence
>>> s = Sequence('ACACATTTATTA')
>>> freqs = s.kmer_frequencies(3, overlap=False)
>>> pprint(freqs) # using pprint to display dict in sorted order
{'ACA': 1, 'CAT': 1, 'TTA': 2}
>>> freqs = s.kmer_frequencies(3, relative=True, overlap=False)
>>> pprint(freqs)
{'ACA': 0.25, 'CAT': 0.25, 'TTA': 0.5}
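The effect of the overlap and relative parameters can be illustrated with a small pure-Python sketch. This is not the scikit-bio implementation: kmer_frequencies_sketch is a hypothetical stand-in that works on a plain str, and it assumes that non-overlapping words are read in steps of k from the start of the sequence and that a trailing partial word is dropped. Both assumptions are consistent with the outputs shown above.

```python
from collections import Counter


def kmer_frequencies_sketch(seq, k, overlap=True, relative=False):
    """Hypothetical stand-in for illustration; not the scikit-bio code."""
    if k < 1:
        raise ValueError("k must be at least 1.")
    # Overlapping words advance one position at a time;
    # non-overlapping words advance k positions at a time (assumption).
    step = 1 if overlap else k
    # Only full-length windows are counted, so a trailing partial word is dropped.
    counts = Counter(seq[i:i + k] for i in range(0, len(seq) - k + 1, step))
    if relative:
        total = sum(counts.values())
        return {kmer: n / total for kmer, n in counts.items()}
    return dict(counts)


non_overlapping = kmer_frequencies_sketch('ACACATTTATTA', 3, overlap=False)
overlapping = kmer_frequencies_sketch('ACACATTTATTA', 3)
```

With overlap=False the sketch reproduces the four words counted in the example above. With the default overlap=True the same 12-character sequence yields ten words of length 3, so words such as 'ACA' and 'ATT' are each counted twice.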