Example of time series analysis implemented in C++#
This page illustrates a few concepts:
fwdpy11 is extensible using both C++ and Python.
Often, the amount of C++ code to write is about as long as the equivalent Python
How to build plugins/extensions using cmake.
Using C++, we will write a callable that will record the mean trait value
of a population over time. In essence, we will build a small Python
extension module using pybind11 that will be called gvalue_recorder
. The
module will define a single Python function, gvalue_recorder.record_gvalue
.
That function will take a single Python list as an argument, and returns a Python
object known as a capsule, which is a Python object holding something defined in
another language (C++ in thise case). That capsule holds our callable, which follows the requirements described in
Time series analysis.
Let’s just look at the code:
#include <pybind11/pybind11.h>
#include <pybind11/functional.h>
#include <fwdpy11/types/DiploidPopulation.hpp>
#include <fwdpy11/evolvets/SampleRecorder.hpp>
PYBIND11_MODULE(gvalue_recorder, m)
{
m.def("record_gvalue", []() {
return pybind11::cpp_function(
[](const fwdpy11::DiploidPopulation &pop,
pybind11::list l) {
double mean_trait_value = 0.0;
for (auto &md : pop.diploid_metadata)
{
mean_trait_value += md.g;
}
l.append(mean_trait_value / static_cast<double>(pop.N));
});
});
}
The code is very simple:
The Python function is defined using a C++ lambda
The Python function returns a
pybind11::cpp_function
(which is a capsule) that contains another C++ lambda that “captures” the argument passed to the Python function.
We have a function returning a function, in other words.
The following Python script tests that our plugin behaves as expected:
import demes
import fwdpy11
import gvalue_recorder
N = 1000
yaml = """
time_units: generations
demes:
- name: pop
epochs:
- start_size: 1000
"""
graph = demes.loads(yaml)
demography = fwdpy11.ForwardDemesGraph.from_demes(graph, 1)
pdict = {
"demography": demography,
"simlen": 100,
"nregions": [],
"sregions": [fwdpy11.GaussianS(0, 1, 1, 0.25)],
"recregions": [fwdpy11.PoissonInterval(0, 1, 5e-3)],
"rates": (0.0, 5e-3, None),
"gvalue": fwdpy11.Additive(
2.0,
fwdpy11.GaussianStabilizingSelection.single_trait(
optima=[fwdpy11.Optimum(VS=1.0, optimum=1.0, when=0)]
),
),
"prune_selected": False,
}
params = fwdpy11.ModelParams(**pdict)
pop = fwdpy11.DiploidPopulation(N, 1.0)
rng = fwdpy11.GSLrng(42)
gvalues = []
rg = gvalue_recorder.record_gvalue()
def r(pop, _):
rg(pop, gvalues)
fwdpy11.evolvets(rng, pop, params, 100, r)
assert len(gvalues) == pop.generation, "Callback failure"
For the record, the Python equivalent to what we’ve generated in C++ is:
gvalues = []
def r(pop, sampler):
md = np.array(pop.diploid_metadata, copy=False)
gvalues.append(md["g"].mean())
I expect the performance of both methods to be nearly equivalent, largely because traversing the metadata once per generation is extremely fast compared to everything else going on in a simulation.
The plugin is built using cmake. Note the execute_process
steps, which find the fwdpy11 headers for you.
cmake_minimum_required(VERSION 2.8.12)
project(gavlue_recorder)
# As of 0.8.0, fwdpy11
# is compiled with the C++14 language
# standard (-std=c++14)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_CXX_STANDARD 14)
find_package(pybind11)
message(STATUS "Found pybind11: ${pybind11_VERSION}")
if(${pybind11_VERSION} VERSION_LESS '2.4.3')
message(FATAL_ERROR "pybind11 version must be >= '2.4.3'")
endif()
execute_process(COMMAND python3 -m fwdpy11 --fwdpy11_headers OUTPUT_VARIABLE FP11HEADERS)
execute_process(COMMAND python3 -m fwdpy11 --fwdpp_headers OUTPUT_VARIABLE FWDPPHEADERS)
find_package(GSL REQUIRED)
include_directories(BEFORE ${FP11HEADERS} ${FWDPPHEADERS})
message(STATUS "GSL headers in ${GSL_INCLUDE_DIRS}")
include_directories(BEFORE ${GSL_INCLUDE_DIRS})
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall")
pybind11_add_module(gvalue_recorder MODULE gvalue_recorder.cc)
target_link_libraries(gvalue_recorder PRIVATE GSL::gsl GSL::gslcblas)