libsequence
1.9.5
Polymorphism tables for sequence data.
#include <Sequence/PolySites.hpp>
Public Member Functions
template<typename __DataType >
PolySites (const std::vector< __DataType > &alignment, bool strictInfSites=0, bool ignoregaps=1, bool skipMissing=false, bool skipAdjSNP=false, unsigned freqfilter=0)
PolySites (std::vector< double > List, std::vector< std::string > stringList)
PolySites (PolyTable::const_site_iterator beg, PolyTable::const_site_iterator end)
PolySites (PolySites &&)
PolySites (const PolySites &)
PolySites & operator= (PolySites &&)
PolySites & operator= (const PolySites &)
std::istream & read (std::istream &s)
std::ostream & print (std::ostream &stream) const
    output a tab-delimited array of positions and character states
This is one of the more useful classes in namespace Sequence. Its purpose is to take a set of sequence data (a vector<Fasta>, in fact) and turn it into a list of variable positions. It doesn't matter whether or not you have an outgroup in your vector (except that if you want to use it for later analysis, it had better be present).
The default behavior of this class is to work with the std::strings themselves. What you end up with is a vector of variable site positions, stored in PolyData::positions, and a vector of std::strings containing the character states at the variable sites, stored in PolyData::data. Note that if you include an outgroup in your vector<Fasta>, and it contains a different character than the ingroup at some site, then that site is considered variable.
You can also try to turn the data into a "binary" (i.e. 0 and 1) format with a call to Sequence::PolySites::Binary.
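To make the positions/data layout described above concrete, here is a minimal, self-contained sketch that reduces an alignment of plain std::strings to its variable columns. It is not the libsequence implementation: the function name variable_sites is hypothetical, positions are assumed to be 1-based, and gaps and missing data receive no special treatment (the real constructor exposes options such as ignoregaps and skipMissing for that).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of libsequence.
// Scans equal-length aligned sequences and returns (positions, data):
//   positions[k] is the (assumed 1-based) column of the k-th variable site,
//   data[i]      holds the characters of sequence i at those columns only.
std::pair<std::vector<double>, std::vector<std::string>>
variable_sites(const std::vector<std::string> &alignment)
{
    std::vector<double> positions;
    std::vector<std::string> data(alignment.size());
    if (alignment.empty())
        return {positions, data};

    const std::size_t len = alignment[0].size();
    for (const std::string &s : alignment)
        assert(s.size() == len); // precondition: the input is aligned

    for (std::size_t col = 0; col < len; ++col) {
        // A column is variable if any sequence differs from the first one.
        bool variable = false;
        for (std::size_t i = 1; i < alignment.size(); ++i) {
            if (alignment[i][col] != alignment[0][col]) {
                variable = true;
                break;
            }
        }
        if (!variable)
            continue;
        positions.push_back(static_cast<double>(col + 1)); // 1-based: an assumption
        for (std::size_t i = 0; i < alignment.size(); ++i)
            data[i].push_back(alignment[i][col]);
    }
    return {positions, data};
}
```

Because every sequence is compared on equal terms, an outgroup that differs from the ingroup at a column makes that column variable, which matches the behavior noted above.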
EXAMPLE:
Here is a common use of the PolySites class. You have a file, "gene.fasta", containing some number of sequences that represent polymorphism data. The file is assumed to be aligned, but we'll check for that, just in case you forgot to run ClustalW or something.
Removing the terminal gaps guarantees that polymorphic site positions are labelled starting from the first ungapped position. Of course, a lot of the extra syntax in the example can be eliminated by giving the following two using directives:
using namespace Sequence;
using namespace Sequence::Alignment;
For a second example, assume the data are in the file "gene.aln", the results of a ClustalW alignment.
Definition at line 33 of file PolySites.hpp.
Sequence::PolySites::PolySites (std::vector< double > List, std::vector< std::string > stringList)
Use this constructor if you already have a list of positions and characters.
Parameters
    List        a vector of doubles giving the positions of the polymorphic sites
    stringList  a vector of strings containing the polymorphic characters
Definition at line 134 of file PolySites.cc.
std::ostream & Sequence::PolySites::print (std::ostream & stream) const
output a tab-delimited array of positions and character states
Allows objects of type Sequence::PolySites to be written to output streams. The output is a simple, tab-delimited table of variable site positions and characters.
Definition at line 191 of file PolySites.cc.
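As an illustration of what a tab-delimited table of positions and character states might look like, the following self-contained sketch writes a header row of positions followed by one row of characters per sequence. The function name print_table and the exact row layout are assumptions made for this example; it is not the code at the location cited above, and the real output format may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical free function, NOT the libsequence implementation.
// Assumed layout: first line = site positions, then one line per sequence,
// with every field separated by a single tab.
std::ostream &print_table(std::ostream &out,
                          const std::vector<double> &positions,
                          const std::vector<std::string> &data)
{
    // Header row: the positions of the variable sites.
    for (std::size_t j = 0; j < positions.size(); ++j)
        out << (j ? "\t" : "") << positions[j];
    out << '\n';

    // One row per sequence: its character state at each variable site.
    for (const std::string &row : data) {
        assert(row.size() == positions.size()); // one character per site
        for (std::size_t j = 0; j < row.size(); ++j)
            out << (j ? "\t" : "") << row[j];
        out << '\n';
    }
    return out;
}
```

Returning the stream by reference follows the same convention as the print signature above, so calls can be chained or wrapped in an operator<< overload.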