libsequence  1.9.5
Alphabets defined in libsequence

Various character sets for different types of analysis.

Classes

struct  Sequence::ambiguousNucleotide
 Tests if a character is in the set A,G,C,T.
 
struct  Sequence::invalidPolyChar
 This functor can be used to determine if a range contains characters that the SNP analysis routines in this library cannot handle gracefully.
 

Typedefs

using Sequence::alphabet_t = std::array< const char, 16 >
 Container type for nucleotide alphabets.
 

Functions

bool Sequence::isDNA (const char &ch)
 Tests if a character is part of Sequence::dna_alphabet.
 

Variables

const alphabet_t Sequence::dna_alphabet
 Alphabet for DNA sequences. Valid DNA characters are upper-case only, and only - is accepted as a gap character.
 
const alphabet_t Sequence::dna_poly_alphabet
 Alphabet for polymorphism (SNP) analysis. 16 characters are used so that we may encode 2 nucleotides in an 8-bit integer.
 
const alphabet_t::size_type Sequence::NOTPOLYCHAR = dna_poly_alphabet.size()
 Any index into dna_poly_alphabet that is >= this value does not represent a valid character for variation analysis.
 
const alphabet_t::size_type Sequence::POLYEOS
 The value that terminates an encoded string of SNP data.
 

Detailed Description

Various character sets for different types of analysis.

Function Documentation

◆ isDNA()

bool Sequence::isDNA ( const char &  ch)

Tests if a character is part of Sequence::dna_alphabet.

Parameters
ch    Character to test
Returns
true if ch is in Sequence::dna_alphabet, false otherwise
Note
case-insensitive via std::toupper

Definition at line 25 of file SeqAlphabets.cc.
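
The documented behavior of isDNA() can be illustrated with a small standalone sketch. It does not include or call libsequence: the namespace alphabet_sketch and the function is_dna are hypothetical stand-ins, written on the assumption that the real function upper-cases its argument with std::toupper and then searches Sequence::dna_alphabet, as the note above suggests.

```cpp
#include <algorithm>
#include <array>
#include <cassert> // assert, handy for quick checks of the sketch
#include <cctype>

// Hypothetical stand-ins for illustration only; not the libsequence definitions.
namespace alphabet_sketch
{
    using alphabet_t = std::array<const char, 16>;

    // Same 16 characters as the documented initial value of dna_alphabet.
    const alphabet_t dna_alphabet{ { 'A', 'C', 'G', 'T', 'R', 'Y', 'S', 'W',
                                     'K', 'M', 'B', 'D', 'H', 'V', 'N', '-' } };

    // Assumed logic: upper-case the input, then look it up in the alphabet.
    bool is_dna(const char ch)
    {
        const char up = static_cast<char>(
            std::toupper(static_cast<unsigned char>(ch)));
        return std::find(dna_alphabet.begin(), dna_alphabet.end(), up)
               != dna_alphabet.end();
    }
}
```

Under these assumptions, is_dna('a') and is_dna('-') are true, while is_dna('U') and is_dna('.') are false, because neither U nor . appears in the alphabet.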

Variable Documentation

◆ dna_alphabet

const alphabet_t Sequence::dna_alphabet
Initial value:
{ {'A','C','G','T',
'R','Y','S','W',
'K','M','B','D',
'H','V','N','-'} }

Alphabet for DNA sequences. Valid DNA characters are upper-case only, and only - is accepted as a gap character.

Note
Follows the IUPAC codes listed at http://www.bioinformatics.org/sms/iupac.html, excluding U and excluding the . character, which is redundant with -

Definition at line 8 of file SeqAlphabets.cc.

◆ dna_poly_alphabet

const alphabet_t Sequence::dna_poly_alphabet
Initial value:
{ {'A','C','G','T',
'0','1','-','N',
'\0',
} }

Alphabet for polymorphism (SNP) analysis. 16 characters are used so that we may encode 2 nucleotides in an 8-bit integer.

Definition at line 13 of file SeqAlphabets.cc.
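
Because a 16-entry alphabet needs only 4 bits per index, two indices fit in one 8-bit integer. The sketch below shows one possible packing. The nibble layout (first character in the high 4 bits, second in the low 4 bits) and the names packing_sketch, poly_index, pack_pair and unpack_pair are assumptions made for illustration; this page does not state how libsequence itself lays out the bits.

```cpp
#include <algorithm>
#include <array>
#include <cassert> // assert, handy for quick checks of the sketch
#include <cstdint>
#include <iterator>
#include <stdexcept>
#include <utility>

// Hypothetical stand-ins for illustration only; not the libsequence definitions.
namespace packing_sketch
{
    using alphabet_t = std::array<const char, 16>;

    // Same initializer as documented; the unlisted elements are '\0'.
    const alphabet_t dna_poly_alphabet{ { 'A', 'C', 'G', 'T',
                                          '0', '1', '-', 'N', '\0' } };

    // Index of ch in the alphabet, or dna_poly_alphabet.size() if absent.
    alphabet_t::size_type poly_index(const char ch)
    {
        return static_cast<alphabet_t::size_type>(std::distance(
            dna_poly_alphabet.begin(),
            std::find(dna_poly_alphabet.begin(), dna_poly_alphabet.end(), ch)));
    }

    // Assumed layout: first character in the high nibble, second in the low.
    std::uint8_t pack_pair(const char first, const char second)
    {
        const auto hi = poly_index(first);
        const auto lo = poly_index(second);
        if (hi >= dna_poly_alphabet.size() || lo >= dna_poly_alphabet.size())
        {
            throw std::invalid_argument("character not in dna_poly_alphabet");
        }
        return static_cast<std::uint8_t>((hi << 4) | lo);
    }

    // Inverse of pack_pair under the same assumed layout.
    std::pair<char, char> unpack_pair(const std::uint8_t packed)
    {
        return { dna_poly_alphabet[packed >> 4],
                 dna_poly_alphabet[packed & 0x0F] };
    }
}
```

With this layout, pack_pair('G', 'N') combines index 2 and index 7 into the single byte 0x27, and unpack_pair recovers the two characters from it.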

◆ POLYEOS

const alphabet_t::size_type Sequence::POLYEOS
Initial value:
= alphabet_t::size_type( std::distance(dna_poly_alphabet.begin(),
std::find(dna_poly_alphabet.begin(),
dna_poly_alphabet.end(),
'\0')
) )

The value that terminates an encoded string of SNP data.

Definition at line 20 of file SeqAlphabets.cc.
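
To make the relationship between POLYEOS and NOTPOLYCHAR concrete, the sketch below recomputes both from a local copy of the alphabet and uses them in a toy encoder. The namespace eos_sketch and the function encode are hypothetical; the one-index-per-character encoding is an assumption chosen for illustration and is not taken from libsequence.

```cpp
#include <algorithm>
#include <array>
#include <cassert> // assert, handy for quick checks of the sketch
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the libsequence definitions.
namespace eos_sketch
{
    using alphabet_t = std::array<const char, 16>;

    // Same initializer as documented; the unlisted elements are '\0'.
    const alphabet_t dna_poly_alphabet{ { 'A', 'C', 'G', 'T',
                                          '0', '1', '-', 'N', '\0' } };

    // One past the last usable index: returned for characters not found.
    const alphabet_t::size_type NOTPOLYCHAR = dna_poly_alphabet.size();

    // Index of the first '\0' in the alphabet.
    const alphabet_t::size_type POLYEOS = alphabet_t::size_type(std::distance(
        dna_poly_alphabet.begin(),
        std::find(dna_poly_alphabet.begin(), dna_poly_alphabet.end(), '\0')));

    // Toy encoder (assumed scheme): one index per character, then POLYEOS.
    // Characters outside the alphabet are encoded as NOTPOLYCHAR.
    std::vector<alphabet_t::size_type> encode(const std::string &s)
    {
        std::vector<alphabet_t::size_type> out;
        for (const char ch : s)
        {
            out.push_back(alphabet_t::size_type(std::distance(
                dna_poly_alphabet.begin(),
                std::find(dna_poly_alphabet.begin(),
                          dna_poly_alphabet.end(), ch))));
        }
        out.push_back(POLYEOS);
        return out;
    }
}
```

With the initializer shown on this page, the first '\0' sits at index 8, so POLYEOS is 8 and NOTPOLYCHAR is 16 in this sketch. Encoding "AT?" then yields the indices 0, 3, 16, 8, where 16 flags the invalid character and 8 terminates the string.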