Importing mutations from tskit#

This vignette shows how to add mutations to a tree sequence generated by msprime and import them into fwdpy11. The key idea is that we add mutations to the tree sequence using the fwdpy11 metadata schema.

import msprime
import numpy as np
import tskit

ts = msprime.sim_ancestry(100,
                          recombination_rate=0,
                          sequence_length=1e6,
                          population_size=100,
                          random_seed=54321)
tables = ts.tables

Add the fwdpy11 mutation metadata schema:

import fwdpy11

tables.mutations.metadata_schema = fwdpy11.tskit_tools.metadata_schema.MutationMetadata

Now, we add a single mutation on the first node we encounter that has at least 10 sample descendants.

We will add a mutation with a dominance of 1.0 and a selection coefficient of 0.01.

tree = ts.first()
for node in tree.nodes():
    if tree.num_samples(node) >= 10:
        # NOTE: mutations times must 
        # fall in the half open interval
        # of [node time, node's parent node time)
        # Here, we crudely make sure this happens
        # while also respecting fwdpy11's requirement
        # that times are integer-values
        ntime = ts.node(node).time
        ptime = ts.node(tree.parent(node)).time
        if ptime - ntime >= 2:
            time = int(ntime + (ptime - ntime)/2.0)
            md = {'s': 0.01,
                  'h': 1.0,
                  'origin': time,
                  'label': np.uint16(0),
                  # NOTE: always write the
                  # next 2 lines as shown here.
                  # The fwdpy11 back end will do
                  # the right thing.
                  # A future release will provide a
                  # nicer API so that you only need
                  # to provide the previous 2 fields.
                  'neutral': 0,
                  'key': np.uint64(0)
                 }
            site = tables.sites.add_row(tables.sequence_length // 2, '0')
            tables.mutations.add_row(site, node,
                                     '1', time=time,
                                     metadata=md)
            break

tables.sort()
ts_with_muts = tables.tree_sequence()

Create a population and lift over the mutations:

pop = fwdpy11.DiploidPopulation.create_from_tskit(ts_with_muts, import_mutations=True)

Print the number of times each mutation appears in the population and its internal data:

for c, m in zip(pop.mcounts, pop.mutations):
    print(c, "->", m)
48 -> Mutation[position:500000.000000, effect size:0.010000, dominance:1.000000, origin time:-180, label:0]