S1 Appendix. - PLOS

Detailed description of the mathematical model Model Assumptions

Consider neutral insertions of retroelements in generation t = s (t = time measured in generations). We take into account three different events: ⋅

ω1 - an orthologous retroelement is present in a genomic locus of A and B but absent in C;

⋅

ω2 - an orthologous retroelement is present in a genomic locus of A and C but absent in B;

⋅

ω3 - an orthologous retroelement is present in a genomic locus of B and C but absent in A.

We denote:

q1 ( s), q 2 ( s), q3 ( s) as the probabilities of these events,

µ1 ( s), µ 2 ( s), µ 3 ( s) as the numbers of these events, ν (s ) as the number of all insertions of retroelements in generation s.

Then, when we assume that all insertions are independent events, we have a scheme of independent trials with four outcomes distributed multinomially: n! q1m1 q 2m2 q3m3 q 4n − m , if n ≥ m, (S1.1) Ρ(µ1 ( s ) = m1 , µ 2 ( s ) = m2 , µ 3 ( s ) = m3 ν ( s ) = n ) = m1!m2 !m3 !(n − m )! 0, if n < m where

q 4 = q 4 ( s ) = 1 − q1 ( s ) − q 2 ( s ) − q3 ( s ), m = m1 + m2 + m3

.

(S1.2)

For a shorter form of the equations, the dependence of probabilities on s is omitted. Hence, according to the total probability formula: n! q1m1 q2m2 q3m3 q4n − m Ρ (ν ( s ) = n) = n ≥ m m1 ! m2 ! m3 !( n − m ) ! (S1.3) 1 n! m1 m2 m3 n−m q1 = q2 q3 ∑ q4 Ρ (ν ( s ) n ) = m1 !m2 !m3 ! n ≥ m ( n − m)! m1 , µ2 ( s ) = m2 , µ3 ( s ) = m3 ) = Ρ ( µ1 ( s ) = ∑

If the probability of new insertions of retroelements in the genome of each particular member of the population α (s ) is small, then we can assume that ν (s ) is a Poisson distributed random variable with a mean n0 = n0(s), proportional to the effective population size N (s ) , that is:

Ρ(ν (s ) = n ) = where

n

n0 − n0 e n!

(S1.4)

n0 = N ( s )α ( s ) .

Thus

(q n ) n! q 4n − m Ρ(ν ( s ) = n ) = ∑ 4 0 n0m e − n0 = n0m e −(1− q4 )n0 ∑ k! n ≥ m ( n − m)! k ≥0 k

1

therefore, denoting:

b j = b j ( s ) = n0 ( s ) q j ( s )

(S1.5)

we will get:

Ρ(µ1 ( s ) = m1 , µ 2 ( s ) = m2 , µ 3 ( s ) = m3 ) =

1 b1m1 b2m2 b3m3 e −(b1 +b2 +b3 ) . m1!m2 !m3 !

It follows that µ1 ( s ), µ 2 ( s ), µ 3 ( s ) represent independent, random, Poisson distributed variables with the parameters b1 ( s ), b2 ( s ), b3 ( s ) , respectively. Now we consider all possible generations with potential retroposon insertions that later become phylogenetically informative. The set of corresponding values of time s we have denoted by S. Then the total numbers of retroposon insertions with properties ω j :

ξ j = ∑ µ j (s)

( j = 1, 2, 3)

s∈S

are independent random, Poisson distributed variables with parameters:

a j = ∑ b j ( s ) = ∑ n0 ( s ) q j ( s ) . s∈S

(S1.6)

s∈S

We consider the random variable ηj as the number of events ωj observed in our experiment. If their total number:

η1 + η 2 + η 3 = n

(S1.7)

is fixed, then, in compliance with the proposed model, the random variables η1, η2, η3 are distributed according a polynomial distribution:

Ρ(η1 = y1 ,η 2 = y2 ,η3 = y3 ) =

n! p1y1 p2y 2 p3y3 , ( y1 + y2 + y3 = n) y1! y2! y3!

(S1.8)

where pj =

aj a1 + a 2 + a3

.

(S1.9)

1. Binary tree Under the term C-tree we consider a scenario where at time t0 a common ancestral population separated into two isolated branches that no longer interbreed (Fig. 2a). The first branch, at some time T1 (t1 = t0 + T1), also separated into two lineages A and B. The second branch forms lineage C. We take one certain marker locus (a locus in genomes containing an insertion of a retroelement). Denoting with Х(t) its frequency in the population at the time t, using the standard Wright-Fisher coalescent model ((Fisher 1922); (Wright 1931)), we can consider Х(t) as a Markov process with the transition function u ( s, p, t , x) , reflecting the conditional probability density of X(t) 2

for condition X(s) = p. Under diffusion approximation, u ( s, p, t , x) obeys the forward Kolmogorov’s equation:

∂u 1 ∂2 [x(1 − x)u ] , = ∂t 4 N (t ) ∂x 2

(S1.10)

where the initial condition u ( s, p, s, x) = δ ( x − p) is a Dirac delta function, and N (t ) >> 1 denotes the effective population size ((Kimura 1955a)). The solution of this equation was first proposed by Kimura and afterwards in a global explanation by Tran et al. ((Tran, Hofrichter, and Jost 2013)) (for the case: N(t) = const), and is represented in the form of a series including Gegenbauer polynomials. We do not use this solution because, for our purposes, it is sufficient to know some moments of distribution of X(t). We denote 1

mk ( s, p, t ) = ∫ x k u ( s, p, t , x)dx

(S1.11)

0

as the k-th (conditional) moment of distribution of X(t) about 0. Following Kimura (Kimura 1955b) we can write

m1 ( s, p, t ) = p

(S1.12)

and m k ( s, p, t ) is a solution of the next differential equation:

d k (k − 1) (mk ( s, p, t ) − mk −1 ( s, p, t ) ) . m k ( s, p, t ) = − 4 N (t ) dt

(S1.13)

Instead of t we introduce the new independent variable t

τ = τ ( s, t ) = ∫ s

dt , as the “drift time”, according to Waxman (Waxman 2011). (S1.14) 2 N (t )

Then we can write equation (S1.13) in the form:

dmk k (k − 1) (mk − mk −1 ) =− 2 dτ with an initial condition of: m k

τ =0

= pk

(S1.15)

.

We also need the second and third moments. Solving the equation (S1.15) for k = 2 and k = 3, and taking into account (S1.12), we obtain: m2 ( s, p, t ) = p − p (1 − p )e −τ ( s ,t ) .

m3 ( s , p , t ) = p −

3 1  p (1 − p )e −τ ( s ,t ) + p (1 − p ) − p e −3τ ( s ,t ) 2 2 

(S1.16) (S1.17)

3

(The same result may be obtained from the work of Kimura (Kimura 1955b) cited above, and also by using the solution of the diffusion equation proposed by Tran et al. ((Tran, Hofrichter, and Jost 2013)) if instead of t, we take the “drift time” τ). As it follows from the Kimura notation, the conditional probability that the retroposon insertion, with a frequency at generation s equal to p, will be fixed in the population, tends to p with t → ∞ (the probability of loss approaches 1 – p). Consider some retroposon insertions into some loci in generation s < t0. We introduce the random

= X0 ( X 0 , X1 ) , where

vector

X= (t0 ), X 1 X (t1 ) .

With

the

fixed

arbitrary

values x0 , x1 (0 ≤ x0,1 ≤ 1) , we can evaluate the conditional probabilities of ω1 , ω2 , ω3 accordingly:

x0 , X 1 = x1 ) = Ρ (ω1 X 0 = (1 − x0 ) x12 , x0 , X 1 == x1 ) Ρ (ω3 X 0 = x0 , X 1 == x1 ) x0 x1 (1 − x1 ) Ρ ( ω2 X 0 = Then, denoting f (x0 , x1 ) as the probability density of the random vector

.

(X 0 , X1 ) ,

(S1.18) the total

probability formula will be: 1 1

Ρ(ω1 ) = ∫ ∫ (1 − x0 )x12 f ( x0 , x1 )dx0 dx1 , 0 0

,

1 1

(S1.19)

Ρ(ω2 ) = Ρ(ω3 ) = ∫ ∫ x0 x1 (1 − x1 ) f ( x0 , x1 )dx0 dx1 0 0

and f ( x0 , x1 ) transfers to:

f (x 0 , x1 ) = u ( s, p, t 0 , x 0 )u (t 0 , x 0 , t1 , x1 ) ,

(S1.20)

where

p=

1 . 2 N (s)

(S1.21)

Thus 1

Ρ(ω1 ) = ∫ (1 − x 0 )m 2 (t 0 , x 0 , t1 )u ( s, p, t 0 , x 0 )dx 0 , 0

(S1.22)

1

Ρ(ω 2 ) = Ρ(ω 3 ) = ∫ x 0 (m1 (t 0 , x 0 , t1 ) − m 2 (t 0 , x 0 , t1 ) )u ( s, p, t 0 , x 0 )dx 0 0

Hence: m1 (t 0 , x0 , t1 ) = x0 m2 (t 0 , x0 , t1 ) = x0 − x0 (1 − x0 )e −τ 1

,

(S1.23)

where

4

t1

τ 1 = τ (t0 , t1 ) = ∫ t0

dt . 2 N (t )

(S1.24)

Then we obtain:

Ρ= (ω1 )

( m1 ( s, p, t0 ) − m2 ( s, p, t0 ) ) (1 − e−τ ) + ( m2 (s, p, t0 ) − m3 (s, p, t0 ) ) e−τ , Ρ ( ω2 ) = Ρ (ω3 ) = ( m2 ( s, p, t0 ) − m3 ( s, p, t0 ) ) e−τ . 1

1

(S1.25)

1

Now, according to (S1.16-S1.17) with p defined by (S1.21) and neglecting the terms of order р2 and higher (assuming that N (t ) >> 1 ) we can write:

 1 −τ1  −τ ( s ,t0 ) 1 −τ1 −3τ ( s ,t0 ) q1 ( s ) = Ρ (ω1 ) = − e pe , 1 − e  pe 2  2  p −τ ( s ,t0 ) −3τ ( s ,t0 ) q2 ( s ) = q3 (t ) = Ρ ( ω2 ) = Ρ (ω3 ) = e −τ1 e −e . 2

(

(S1.26)

)

Recall that s ≤ t 0 . Suppose now that t0 < s ≤ t1 (a retroposon insertion occurs on branch 1). This marker will not appear in lineage С, hence q 2 ( s ) = q3 ( s ) = 0 . Let us evaluate q1 ( s ) = Ρ(ω1 ) .

(

)

2 Noting that Ρ ω1 X 1 = x1 = x1 , according to the total probability formula we get: 1

q1 ( s ) = Ρ(ω1 ) = ∫ x12 u ( s, p, t1 , x1 )dx1 = m2 ( s, p, t1 ) .

(S1.27)

0

Hence we finally can write:

q1 ( s )= p − pe −τ ( s ,t1 ) ,

(S1.28)

q= q= 0. 2 (s) 3 (s) (here t 0 < s ≤ t1 ). To find the parameters, a1 , a2 , a3 , according to (S1.9):

a j = ∑ α ( s) N ( s)q j ( s) .

(S1.29)

s∈S

Assuming that the corresponding functions, as s increases by 1 (the transition to the next generation), are slowly changing, we replace the summation by an integration over the appropriate intervals. Then:

t

t

t

0 1 1 −τ1 0  1 −τ1  −τ ( s ,t0 ) −3τ ( s ,t0 ) − − + 2a1 = 1 ( ) ( ) e α s e ds e α s e ds α ( s ) 1 − e −τ ( s ,t1 ) ds,  ∫ ∫ ∫ 2  2  −∞ −∞ t0

t

(

(

)

, (S1.30)

)

1 −τ1 0 2= a2 2= a3 e ∫ α ( s ) e −τ ( s ,t0 ) − e −3τ ( s ,t0 ) ds. 2 −∞ which, introducing the notation:

5

Z1 =

t0

∫ α (s)e

−τ ( s ,t0 )

ds,

−∞

Z2 =

t0

∫ α (s)e

−3τ ( s ,t0 )

,

ds,

(S1.31)

−∞

= Z3

∫ α (s) (1 − e t1

−τ ( s ,t1 )

) ds,

t0

can be written:

1 −τ1  1 −τ1  a1 = 1 − e  Z1 − e Z 2 + Z 3 , 2  2  1 −τ1 a= a= e ( Z1 − Z 2 ) , 2 3 2

(S1.32)

and now, similarly to Kimura (Kimura 1955b; Kimura 1955a), we assume for the all intervals α(t) a constant effective population size. Then, with s ≤ t 0 :

Z1 2= N 0α 0 n0 = t0 − s and τ ( s, t 0 ) = 2 1 ,, 2N 0 Z2 = N 0α 0 n0 = 3 3

(S1.33)

where n0 = 2 N 0α 0 is the average number of insertions per generation at the ancestral branch and N0 is the effective population size. Next, with t 0 < s ≤ t1 :

τ ( s , t1 ) =

t1 − s , 2 N1

T τ 1 = τ (t 0 , t1 ) = 1 2 N1

(

(

)) (

)

and Z 3 = α1 T1 − 2 N1e 1 − e −τ 1 = n1 τ 1 − 1 + e −τ 1 ,

(S1.34)

where n1 = 2 N1α1 is the average number of insertions per generation at the branch 0-1 and N1 the effective population size of this branch. Thus, introducing the function:

Φ (τ ) = τ − 1 + e −τ ,

(S1.35)

  2 a1 = 1 − e −τ1 n0 + n1Φ (τ 1 ),   3 1 a 2 = a3 = e −τ 1 n0 . 3

(S1.36)

according to (S1.32), we obtain:

Now, following (S1.9):

6

2 −τ1 n1 + Φ (τ 1 ) 3e n0 , n 1 + 1 Φ (τ 1 ) n0

1− p1 =

(S1.37)

−τ1

p= p= 2 3

1 e , n 3 1+ 1 Φ τ ( 1) n0

or, denoting e −τ Ψ (τ ) = , n 1 + 0 τ + e −τ − 1 n1

(

)

(S1.38)

we have

2 p1 =1 − Ψ (τ 1 ) , 3 1 p2= p3= Ψ (τ 1 ) . 3

Hence p1 >

(S1.39)

1 1 and p2 = p3 < where Ψ (τ ) < 1 . 3 3

7

2. Ancestral hybridization Let us now consider a model that includes ancestral hybridization (Fig. 2e). As in the previous case we assume that at time t=t0 the common ancestral population (branch 0) separated into two isolated branches. Later, after T1 and T2 generations, subpopulations of each of the two branches separated from their parent branches and reproduced with one another by fusion, forming a new branch B. The original two separating branches represent lineages A and C. The proportions of the two subpopulations in the newly joined population are denoted by γ1 and γ2 (γ1 + γ2 =1). Then:

N13 = γ 1 N 3 , N 23 = γ 2 N 3 , N13 + N 23 = N 3

.

(S1.40)

Consider a retroposon insertion at a specific locus at s < t0. We introduce the random

= X 0 X= (t0 ), X 1 X 01= (t1 ), X 2 X 02 (t2 ) . vector ( X 0 , X 1 , X 2 ) , where

Fixing

the

arbitrary

values x0 , x1 , x2 (0 ≤ x0,1, 2 ≤ 1) , we can write the conditional probabilities of the events ω1 , ω2 , ω3 respectively: Ρ(ω1 X 1 = x1 , X 2 = x 2 ) = x1 (γ 1 x1 + γ 2 x 2 )(1 − x 2 ),

Ρ(ω 2 X 1 = x1 , X 2 = x 2 ) = x1 (γ 1 (1 − x1 ) + γ 2 (1 − x 2 ) )x 2 . Ρ(ω 3 X 1 = x1 , X 2 = x 2 ) = (1 − x1 )(γ 1 x1 + γ 2 x 2 )x 2

(S1.41)

Then, denoting f ( x0 , x1 , x2 ) as the probability density of the random vector ( X 0 , X 1 , X 2 ) , according to the total probability formula we get: Ρ= (ω1 )

1 1 1

∫ ∫ ∫ x (γ x + γ x )(1 − x ) f ( x , x , x )dx dx dx , 1

1 1

2 2

2

0

1

2

0

1

2

0 0 0

= Ρ ( ω2 )

1 1 1

∫ ∫ ∫ x (γ (1 − x ) + γ 1

1

1

2

(1 − x2 ) ) x2 f ( x0 , x1 , x2 )dx0 dx1dx2 ,

(S1.42)

0 0 0

Ρ (ω3 ) =

1 1 1

∫ ∫ ∫ (1 − x ) (γ x + γ x ) x 1

1 1

2 2

2

f ( x0 , x1 , x2 )dx0 dx1dx2 .

0 0 0

f ( x0 , x1 , x2 ) transfers to:

f ( x0 , x1 , x2 ) = u0 ( s, p, t0 , x0 )u1 (t0 , x0 , t1 , x1 )u2 (t0 , x0 , t2 , x2 ),

(S1.43)

where u 0 ( s, p, t 0 , x0 ), u1 (t 0 , x0 , t1 , x1 ), u 2 (t 0 , x0 , t 2 , x 2 ) are transitional functions for the respective branches.

8

Note that using the relations (S1.12) and (S1.16) we can write: 1

x , t , x )dx ∫ x u (t , = j

j

0

0

j

j

j

( j) m= x0 , 1 (t0 , x0 , t j )

0

,

1

∫ x u (t , x , t 2 j

0

j

0

j

, x j )dx j =m2( j ) (t0 , x0 , t j ) =x0 − x0 (1 − x0 )e

−τ j

(S1.44)

,

0

where tj

τj =∫ t0

dt , 2 N j (t )

(S1.45)

j ∈ {1, 2}. Hence: Ρ (ω = 1)

∫ (γ ( x 1

1

0

(

)

)

− x0 (1 − x0 )e −τ1 (1 − x0 ) + γ 2 x02 (1 − x0 )e −τ 2 u ( s, p, t0 , x0 )dx = 0

0

)

= γ 1 (m1 − m2 )(1 − e −τ1 ) + (m2 − m3 )e −τ1 + γ 2 (m2 − m3 )e −τ 2 , Ρ (ω2= )

1

∫ (γ x (1 − x )e 2 1 0

0

(

= γ 1e

−τ1

Ρ (ω= 3)

−τ1

0

(S1.46)

)

+ γ 2 x02 (1 − x0 )e −τ 2 u ( s, p, t0 , x0 )dx= 0

)

+ γ 2 e −τ 2 (m2 − m3 ),

∫ (γ x (1 − x )e 1

2 1 0

0

0

−τ1

(

)

)

+ γ 2 x0 − x0 (1 − x0 )e −τ 2 (1 − x0 ) u ( s, p, t0 , x0 )dx 0 =

(

)

= γ 1 (m2 − m3 )e −τ1 + γ 2 (m1 − m2 )(1 − e −τ 2 ) + (m2 − m3 )e −τ 2 ,

wherein, using (S1.16)-( S1.17) and neglecting the terms of order р2 and higher:

m1 = m1 (t , p, t0 ) = p m2 = m2 ( s, p, t 0 ) = p − pe −τ ( s ,t0 ) .

m3 = m3 ( s , p , t 0 ) = p −

(S1.47)

3 −τ ( s ,t0 ) 1 −3τ ( s ,t0 ) + pe pe 2 2

Thus:

 1  1 p  Ρ (ω1 ) = q1 ( s ) = γ 1  1 − e −τ1  pe −τ ( s ,t0 ) − e −τ1 pe −3τ ( s ,t0 )  + γ 2 e −τ 2 e −τ ( s ,t0 ) − e −3τ ( s ,t0 ) , 2 2   2  p q2 ( s ) = Ρ ( ω2 ) = γ 1e−τ1 + γ 2 e −τ 2 e−τ ( s ,t0 ) − e−3τ ( s ,t0 ) , (S1.48) 2  1  p 1  q3 ( s ) = Ρ (ω3 ) = γ 1 e −τ ( s ,t0 ) − e −3τ ( s ,t0 ) e −τ1 + γ 2  1 − e −τ 2  pe−τ ( s ,t0 ) − e −τ 2 pe −3τ ( s ,t0 )  , 2 2   2 

(

)(

(

(

)

)

)

(here s ≤ t 0 ).

9

Note that when γ 1 = 1, γ 2 = 0 (C-tree (see equation 6 in the Manuscript), Fig 2a) this result coincides with (S1.29), and for γ 1 = 0, γ 2 = 1 (A-tree (see equation 8 in the Manuscript), Fig 2b) we obtain similar formulas, where

q1 ( s )

is replaced by q 3 ( s) , and τ1 is replaced by τ2.

Suppose now that a retroposon insertion occurs on the branch 0-1 at s ∈ (t 0 , t1 ) . Then, it will not appear in the lineage С, and hence q 2 ( s ) = q3 ( s ) = 0 . We evaluate q1 ( s ) = Ρ(ω1 ) . Noticing that

Ρ(ω1 X 1 = x1 ) = γ 1 x12 , the corresponding result obtained by multiplying the right side of equation (S1.28) by γ1. Thus:

(

)

q1 ( s ) = γ 1 p − pe −τ ( s ,t1 ) , q 2 ( s) = q3 ( s) = 0

(S1.49)

Here t 0 < s ≤ t1 ,

p=

1 2 N1 (s)

(S1.59)

and t1

dt . 2 N 1 (t )

τ ( s , t1 ) = ∫ s

(S1.51)

Processing similarly with the retroposon inserted on branch 0-2, we obtain:

(

)

q3 ( s ) = γ 2 p − pe −τ ( s ,t2 ) , q1 ( s ) = q 2 ( s ) = 0

(S1.52)

Here t 0 < s ≤ t 2

p=

1 2 N 2 (s)

(S1.53)

and t1

τ ( s, t 2 ) = ∫ s

dt . 2 N 2 (t )

(S1.54)

Next proceeding as in (S1.29 - S1.36), we can write:

  2 n  a1 =  1 − e −τ1 n0 + n1Φ (τ 1 ) γ 1 + 0 e −τ 2 γ 2 3    3 n a2 = 0 e −τ1 γ 1 + e −τ 2 γ 2 . 3   2 n  a3 = 0 e −τ1 γ 1 +  1 − e −τ 2 n0 + n2 Φ (τ 2 ) γ 2 3    3

(

)

(S1.55)

10

Now, according to (S1.9):

 2 −τ1 n1  1 −τ 2 1 − e + Φ (τ 1 )  γ 1 + e γ 2 3 3 n0  , p1 =  n1 n2 1 + Φ (τ 1 ) γ 1 + Φ (τ 2 ) γ 2 n0 n0 −τ

p2 =

−τ

1 2 1 e γ1 + e γ 2 , 3 1 + n1 Φ τ γ + n2 Φ τ γ ( 1) 1 ( 2) 2 n0 n0

(S1.56)

 2  1 −τ1 n e γ 1 + 1 − e −τ 2 + 2 Φ (τ 2 )  γ 2 3 n0  3  . p3 = n1 n2 1 + Φ (τ 1 ) γ 1 + Φ (τ 2 ) γ 2 n0 n0 When either γ1 or γ2 are equal to 0, we obtain an A-tree ((see equation 8 in the Manuscript), Fig 2b) or a C-tree ((see equation 6 in the Manuscript), Fig 2a), respectively. Note that:

(( ) = ((1 − e )n

)

a1 − a2 = 1 − e −τ 1 n0 + n1Φ (τ 1 ) γ 1 a3 − a2

−τ 2

)

0 + n2 Φ (τ 2 ) γ 2

.

(S1.57)

If γ1,2 are not equal to 0, a1 > a2 and a3 > a2 (accordingly: p1 > p2 and p3 > p2 ). In the case of Cfusion (splits from A and B fuse), р1 will exchange places with р2, and in the case of A-fusion (splits from B and C fuse), р3 will exchange places with р2. Consider the case:

(1 − e )n −τ 1

0

(

)

+ n1Φ (τ 1 ) = 1 − e −τ 2 n0 + n2Φ (τ 2 )

(S1.58)

(this holds in particular if n1 = n2 = n0 , τ 1 = τ 2 ). Then: 1 −τ  2 −τ  1 − e + Φ (τ )  γ 1 + e γ 2 3 3  , p1 =  1 + Φ (τ ) −τ

p2 =

1 e 3 1 + Φ (τ )

(S1.59)

1 −τ  2  e γ 1 + 1 − e −τ + Φ (τ )  γ 2 3 3   p3 = 1 + Φ (τ )

11

These equations can also be written as:

(1 − 2 p2 )γ 1 + p2γ 2 , p1 = −τ

1 e , p2 = 3 1 + Φ (τ )

(S1.60)

p3= p2γ 1 + (1 − 2 p2 )γ 2 .

3. One-directional search Now we consider the case when only two events, ω1 and ω2, can be observed. Thus there are only two random variables: η1 and η2. If their total number

η1 + η 2 = n

(S1.61)

is fixed, then we have a binomial distribution:

Ρ(η1 = y1 ,η 2 = y 2 ) =

n! p1y1 p 2y2 , ( y1 + y 2 = n) , y1! y 2 !

(S1.62)

where

p1 =

a2 a1 , p2 = a1 + a 2 a1 + a 2

( p1 + p 2 = 1).

In the case of a C-tree, according to (S1.36, 6) a1 > a 2 , hence p1 >

(S1.63)

1 . 2

Similarly for a B-tree (see equation 7 in the Manuscript, Fig. 2b) a1 < a 2 , therefore p1 < The case of the A-tree (see equation 8 in Manuscript) leads to p1 =

1 . 2

1 , but it is necessary to note 2

that the same situation occurs when we have an ABC-tree (see equation10 in the Manuscript, Fig. 2d) (polytomy). In the case of B-fusion, in accordance with the remark following (S1.57), we also have p1 >

1 1 and in the case of C-fusion p1 < . However, for A-fusion, the relationship between p1 2 2

and p2 may be arbitrary.

Supplementary References Fisher, R. A. 1922. On the dominance ratio. P Roy Soc Edinb 42:321-241. Kimura, M. 1955a. Stochastic processes and distribution of gene frequencies under natural selection. Cold Spring Harb Symp Quant Biol 20:33-53. Kimura, M. 1955b. Solution of a Process of Random Genetic Drift with a Continuous Model. Proc Natl Acad Sci U S A 41:144-150. Tran, T. D., J. Hofrichter, and J. Jost. 2013. An introduction to the mathematical structure of the Wright-Fisher model of population genetics. Theory Biosci 132:73-82. Waxman, D. 2011. A unified treatment of the probability of fixation when population size and the strength of selection change over time. Genetics 188:907-913. Wright, S. 1931. Evolution in Mendelian Populations. Genetics 16:97-159. 12