libsequence  1.9.5

Polymorphism tables for sequence data.

#include <Sequence/PolySites.hpp>

Inheritance diagram for Sequence::PolySites: Sequence::PolySites inherits from Sequence::PolyTable.

Public Member Functions

template<typename __DataType >
 PolySites (const std::vector< __DataType > &alignment, bool strictInfSites=0, bool ignoregaps=1, bool skipMissing=false, bool skipAdjSNP=false, unsigned freqfilter=0)
 
 PolySites (std::vector< double > List, std::vector< std::string > stringList)
 
 PolySites (PolyTable::const_site_iterator beg, PolyTable::const_site_iterator end)
 
 PolySites (PolySites &&)
 
 PolySites (const PolySites &)
 
PolySites & operator= (PolySites &&)
 
PolySites & operator= (const PolySites &)
 
std::istream & read (std::istream &s)
 
std::ostream & print (std::ostream &stream) const
 output a tab-delimited array of positions and character states
 

Detailed Description

Polymorphism tables for sequence data.

This is one of the more useful classes in namespace Sequence. Its purpose is to take a set of aligned sequences (a vector<Fasta>, in fact) and turn it into a list of variable positions. It doesn't matter whether or not you have an outgroup in your vector (except that if you want to use it for later analysis, it had better be present).

The default behavior of this class is to work with the std::strings themselves. What you end up with is a vector of variable site positions, stored in PolyData::positions, and a vector of std::strings containing the character states at those sites, stored in PolyData::data. Note that if you include an outgroup in your vector<Fasta>, and it contains a different character than the ingroup at some site, then the site is considered variable.
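To make the idea of "positions plus strings of variable sites" concrete, here is a minimal sketch that uses only the standard library. It is not libsequence's implementation: the struct `VariableSites` and the function `find_variable_sites` are hypothetical names, the input is assumed to be plain strings of equal length, and gap or missing-data handling (the `ignoregaps` and `skipMissing` options) is deliberately left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container, not a libsequence type: the variable columns of an
// alignment and the character states found there.
struct VariableSites
{
    std::vector<double> positions;  // 1-based column numbers of the variable sites
    std::vector<std::string> data;  // one string per sequence, variable columns only
};

// Hypothetical helper: keep every column in which at least one sequence
// differs from the first one. Assumes all sequences have the same length.
VariableSites find_variable_sites(const std::vector<std::string> &alignment)
{
    VariableSites out;
    if (alignment.empty())
        return out;
    out.data.resize(alignment.size());
    const std::size_t length = alignment[0].size();
    for (std::size_t col = 0; col < length; ++col)
    {
        bool variable = false;
        for (std::size_t row = 1; row < alignment.size(); ++row)
        {
            if (alignment[row][col] != alignment[0][col])
            {
                variable = true;
                break;
            }
        }
        if (!variable)
            continue;
        // record the column number, counting from 1
        out.positions.push_back(static_cast<double>(col + 1));
        // append this column's state to every sequence's string
        for (std::size_t row = 0; row < alignment.size(); ++row)
            out.data[row].push_back(alignment[row][col]);
    }
    return out;
}
```

Because an outgroup is just another row in the input, a site at which only the outgroup differs is reported as variable, which matches the behavior described above.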

You can also try to turn the data into a "binary" (i.e. 0 and 1) format by calling Sequence::PolySites::Binary.
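As an illustration of what a 0/1 recoding can look like, the sketch below uses only the standard library. The function `to_binary` is a hypothetical name, and the coding rule is an assumption made for this example: the state carried by the first sequence is coded 0 and any other state 1. The rule that libsequence actually applies is not described on this page, so treat this as one possible encoding rather than a description of Binary.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the libsequence routine: recode each site so that
// the state found in the first sequence becomes '0' and every other state '1'.
std::vector<std::string> to_binary(const std::vector<std::string> &data)
{
    std::vector<std::string> out(data);
    if (data.empty())
        return out;
    // compare against the untouched input, not the copy being rewritten
    const std::string &reference = data[0];
    for (std::string &row : out)
    {
        for (std::size_t i = 0; i < row.size() && i < reference.size(); ++i)
            row[i] = (row[i] == reference[i]) ? '0' : '1';
    }
    return out;
}
```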
EXAMPLE:
Here is a common use of a PolySites class. You have a file, "gene.fasta", containing some number of sequences that represent polymorphism data. The file is assumed to be aligned, but we'll check for that, just in case you forgot to run ClustalW or something.

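#include <cassert>
#include <cstdlib>
#include <iostream>
#include <vector>
#include <Sequence/Fasta.hpp>
#include <Sequence/Alignment.hpp>
#include <Sequence/PolySites.hpp>
#include <Sequence/SeqExceptions.hpp>

int main(int argc, char *argv[]) {
  const char *infile = "gene.fasta";
  std::vector<Sequence::Fasta> data;
  try {
    // read the sequences, then check that they form an alignment
    Sequence::Alignment::GetData (data,infile);
    assert( Sequence::Alignment::IsAlignment (data) );
    // strip terminal gaps so that positions start at the first ungapped site
    if ( Sequence::Alignment::Gapped (data) )
      Sequence::Alignment::RemoveTerminalGaps (data);
  }
  catch (Sequence::SeqException &e)
  {
    std::cerr << "uh-oh! processing file gene.fasta resulted in throwing an exception" << std::endl;
    e.print(std::cerr);
    std::cerr << std::endl;
    std::exit(1);
  }
  // build the polymorphism table from the aligned sequences
  Sequence::PolySites polytable(data);
}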


Removing the terminal gaps guarantees that polymorphic site positions are labelled starting from the first ungapped position. Of course, a lot of the extra syntax in the example can be eliminated by giving the following two using directives:
using namespace Sequence;
using namespace Sequence::Alignment;

For a second example, assume the data are in the file "gene.aln", the results of a ClustalW alignment.

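#include <cassert>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <Sequence/Fasta.hpp>
#include <Sequence/Clustalw.hpp>  // ClustalW<T>; header name assumed
#include <Sequence/PolySites.hpp>
#include <Sequence/SeqExceptions.hpp>

using namespace std;
using namespace Sequence;

int main(int argc, char *argv[]) {
  // an ifstream is needed here: a plain istream cannot open a file
  ifstream in("gene.aln");
  ClustalW<Fasta> aligned_data;
  try {
    in >> aligned_data;
    assert(aligned_data.IsAlignment());
    // strip terminal gaps so that positions start at the first ungapped site
    if(aligned_data.Gapped())
      aligned_data.RemoveTerminalGaps();
  }
  catch (SeqException &e)
  {
    cerr << "uh-oh! processing file gene.aln resulted in throwing an exception" << endl;
    e.print(cerr);
    cerr << endl;
    exit(1);
  }
  // build the polymorphism table from the aligned sequences
  PolySites polytable(aligned_data.GetData());
}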
Examples:
PolyTableIterators.cc, slidingWindow.cc, and slidingWindow2.cc.

Definition at line 33 of file PolySites.hpp.

Constructor & Destructor Documentation

◆ PolySites()

Sequence::PolySites::PolySites ( std::vector< double >  List,
std::vector< std::string >  stringList 
)

Use this constructor if you already have a list of positions and characters.

Parameters
List        a list of doubles representing the positions of polymorphic sites
stringList  a vector of strings representing the polymorphic characters

Definition at line 134 of file PolySites.cc.

Member Function Documentation

◆ print()

std::ostream & Sequence::PolySites::print ( std::ostream &  stream) const

output a tab-delimited array of positions and character states

Allows objects of type Sequence::PolySites to be written to output streams. The output is a simple, tab-delimited table of variable site positions and characters.

Note
Segregating-site positions are output with the count starting from 1, not 0.

Definition at line 191 of file PolySites.cc.

