libsequence  1.9.5
Sequence::PolyTableSlice< T > Class Template Reference

A container class for "sliding windows" along a polymorphism table.

#include <Sequence/PolyTableSlice.hpp>

Public Types

typedef std::pair< PolyTable::const_site_iterator, PolyTable::const_site_iterator > range
 Range of a window = [first,second)
 
typedef std::vector< range >::const_iterator const_iterator
 

Public Member Functions

 PolyTableSlice (const PolyTable::const_site_iterator beg, const PolyTable::const_site_iterator end, const unsigned &window_size_S, const unsigned &window_step_len)
 
 PolyTableSlice (const PolyTable::const_site_iterator beg, const PolyTable::const_site_iterator end, const unsigned nwindows)
 
 PolyTableSlice (const PolyTable::const_site_iterator beg, const PolyTable::const_site_iterator end, const double &window_size, const double &step_len, const double &starting_pos=0., const double &ending_pos=1.0)
 
const_iterator cbegin () const
 
const_iterator cend () const
 
T get_slice (const const_iterator) const
 
std::vector< range >::size_type size () const
 
T operator[] (const unsigned &) const
 

Detailed Description

template<typename T>
class Sequence::PolyTableSlice< T >

A container class for "sliding windows" along a polymorphism table.

This class is a simple container to store "sliding windows" along an object in the inheritance hierarchy of Sequence::PolyTable.

Sliding windows are used in population genetics to look at variation in levels of diversity along a region. This class supports two simple ways to make such windows. The first is to slide a window of some length (in base pairs) along your sequence, recording the SNPs in each window. The number of base pairs that you move the window each time is the "step length." The second is to slide a window containing a constant number of segregating sites along the SNP table. In the latter case, the step length is the number of segregating sites by which to move the beginning of the window each time. Two of the constructors for this class correspond to these two window types.

These two types of window are useful in different contexts, and it's up to the user to decide which one to use. Please note that all this class does is facilitate the generation of the windows. It does not address any of the statistical headaches that arise from sliding window analyses. These issues include multiple test correction, non-independence of overlapping windows, variation in selective constraint along a sequence, and variation in power from window to window with respect to hypothesis testing.

The user should be aware that Kreitman and Hudson (1991) "Inferring the evolutionary histories of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence." Genetics 127: 565 describe a clever variant of the sliding window. They slide along the physical sequence, but keep the number of synonymous/silent sites constant. Their procedure mitigates some of the difficulties mentioned above, but it is not implemented here because it relies on having an annotation for the SNP table available.

The user is also referred to Andolfatto, P., J. D. Wall and M. Kreitman, 1999 "Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster." Genetics 153:1397-1399, which discusses the multiple testing issue.

The following example reads in data from Hudson's program ms. Tajima's D is calculated for non-overlapping windows of size 0.1:

#include <Sequence/SimParams.hpp>
#include <Sequence/SimData.hpp>
#include <Sequence/PolySIM.hpp>
#include <Sequence/PolyTableSlice.hpp>
#include <iostream>
#include <cstdio>

int main(int argc, char **argv)
{
  //read the simulation parameters from the header of the ms output
  Sequence::SimParams p;
  std::cin >> p;
  Sequence::SimData d(p.totsam());
  int i;
  while( (i=d.fromstdin()) && i != EOF) //read simulated data from stdin
    {
      Sequence::PolyTableSlice<Sequence::SimData> windows(d.sbegin(),d.send(),0.1,0.1,0);
      //The object only allows const iteration, and libsequence 1.8.5 changed
      //the API to use the "C++11-ese" syntax of cbegin/cend
      auto itr = windows.cbegin();
      while(itr < windows.cend()) //iterate over windows
        {
          //create a data object for the current window
          Sequence::SimData window = windows.get_slice(itr);
          //calculate and print Tajima's D for the window
          Sequence::PolySIM analyze(&window);
          std::cout << analyze.TajimasD() << '\t';
          ++itr; //advance to the next window
        }
      std::cout << std::endl;
    }
}
Note
The two sliding-window constructors are left ambiguous intentionally! See their documentation below. The ambiguity is there so that the programmer is forced to think about which type of sliding window to use.
Examples:
slidingWindow.cc, and slidingWindow2.cc.

Definition at line 117 of file PolyTableSlice.hpp.

Member Typedef Documentation

◆ const_iterator

template<typename T>
typedef std::vector<range>::const_iterator Sequence::PolyTableSlice< T >::const_iterator

const_iterator type to access windows

Definition at line 209 of file PolyTableSlice.hpp.

Constructor & Destructor Documentation

◆ PolyTableSlice() [1/3]

template<typename T>
Sequence::PolyTableSlice< T >::PolyTableSlice ( const PolyTable::const_site_iterator  beg,
const PolyTable::const_site_iterator  end,
const unsigned &  window_size_S,
const unsigned &  window_step_len 
)
explicit

This constructor calculates sliding windows of a fixed number of segregating sites.

Parameters
beg              An iterator to the first segregating site in the data
end              An iterator to one-past-the-last segregating site in the data
window_size_S    The number of segregating sites in each window
window_step_len  The number of segregating sites by which to "jump" for each new window
Note
In order to use this constructor, you must make sure that the compiler sees unsigned values, otherwise compilation will fail with an ambiguity error:
PolyTableSlice<PolySites> windows(data.sbegin(),data.send(),100u,10u);
Exceptions
std::logic_error  if window_size_S or window_step_len == 0

◆ PolyTableSlice() [2/3]

template<typename T>
Sequence::PolyTableSlice< T >::PolyTableSlice ( const PolyTable::const_site_iterator  beg,
const PolyTable::const_site_iterator  end,
const unsigned  nwindows 
)
explicit

Create a specific number of windows with an equal number of segregating sites per window.

Parameters
beg       An iterator to the first segregating site in the data
end       An iterator to one-past-the-last segregating site in the data
nwindows  The desired number of windows.
Note
The intended use of this constructor is to break an interval up into approximately equal-sized chunks. When end-beg is small relative to nwindows, you will end up with fewer than nwindows "slices". The primary use scenario envisioned for this type of window is downstream parallelization of computation on large PolyTable objects.

◆ PolyTableSlice() [3/3]

template<typename T>
Sequence::PolyTableSlice< T >::PolyTableSlice ( const PolyTable::const_site_iterator  beg,
const PolyTable::const_site_iterator  end,
const double &  window_size,
const double &  step_len,
const double &  starting_pos = 0.,
const double &  ending_pos = 1.0 
)
explicit

Use this constructor to generate a sliding window across the sequence itself.

Parameters
beg           An iterator to the first segregating site in the data
end           An iterator to one-past-the-last segregating site in the data
window_size   The size of the sliding window (in units of physical distance)
step_len      The distance by which the window jumps (in units of physical distance)
starting_pos  The starting position for your data.
ending_pos    The last position for the data.
Note
For most situations involving "ms-like" data, and probably for most genomic data, a starting_pos of 0 is appropriate. However, there are situations where your data may be something like a segment from the middle of a genome, and your SNP positions are annotated with respect to the reference contig. In that case, starting_pos is best set to the appropriate position along the reference; otherwise you will get a lot of empty windows when starting from 0. Likewise, for "normal" ms runs, ending_pos should be 1.0. For other scenarios, such as "real" data, you'll have to set starting_pos and ending_pos to the appropriate values.
In order to use this constructor, you must make sure that the compiler sees doubles, otherwise compilation will fail with an ambiguity error:
PolyTableSlice<SimData> windows(data.sbegin(),data.send(),0.1,0.01);
Exceptions
std::logic_error  if window_size or step_len <= 0.

Member Function Documentation

◆ cbegin()

template<typename T>
const_iterator Sequence::PolyTableSlice< T >::cbegin ( ) const
Returns
A const iterator to the first window
Examples:
slidingWindow.cc.

◆ cend()

template<typename T>
const_iterator Sequence::PolyTableSlice< T >::cend ( ) const
Returns
A const iterator to one past the last window

◆ get_slice()

template<typename T>
T Sequence::PolyTableSlice< T >::get_slice ( const const_iterator  ) const
Parameters
itr  An iterator from the current object
Returns
The window pointed to by the iterator itr.
Exceptions
std::out_of_range  if itr is out of range

◆ operator[]()

template<typename T>
T Sequence::PolyTableSlice< T >::operator[] ( const unsigned &  ) const
Parameters
i  The window to return, 0 <= i < object.size()
Returns
The i-th window
Exceptions
std::out_of_range  if i is out of range

◆ size()

template<typename T>
std::vector<range>::size_type Sequence::PolyTableSlice< T >::size ( ) const
Returns
The number of windows stored

The documentation for this class was generated from the following file: