Text S1 - PLOS

Text S1: Deriving the Multiallelic Hot-or-Not Model The greatest difficulty in applying the Wright-Dirichlet distribution to data analysis is computing the normalization constant. Various methods are available, including importance sampling and Taylor series expansion [1]. In a multiallelic PIMS model of directional selection with no dominance, the Wright-Dirichlet distribution simplifies to K

p ( f ) ∝ ∏ eγ i fi fiθi −1 ,

(A1)

i =1

where γi is the population-scaled selection coefficient for allele i. The normalizing constant for this distribution is related to the characteristic function for a Dirichlet distribution with parameter θ. In addition to the methods mentioned above, it can be computed numerically using contour integration in the complex plane [2, 3] K

∫ ∏e

Φ K i =1

γ i fi

fi

θi −1

∏ df =

K i =1

Γ (θ i )

2π I

K

t ∫ e ∏ (t − γ i ) L

−θi

dt ,

(A2)

i =1

where I represents the imaginary unit and ΦK is a (K – 1)-simplex. The contour L is a loop beginning and ending at −∞ , and encircling in the positive direction (anticlockwise when the x and y axes correspond to the real and imaginary lines respectively) all the finite singularities of the integrand (i.e. when t = γi). We have Mathematica code that performs this integration (available on request) but found it to be numerically unstable. Under the hot-or-not model, however, we can use Equation A2 to show that

eγ F ∏ i =1 fiθi −1 K

p(f ) =

Text S1

Β( θ ) 1 F1 ( Θ H ,Θ, γ )

,

(A3)

1

where γ is the population-scaled selective difference between fitness classes, F is the total frequency of alleles in the favored (hot) fitness class, ΘH is the total mutation rate to alleles of that class and Θ is the total mutation rate for all alleles. Β( θ ) is the beta function with vector argument, so that Β( θ ) = ∏ i =1 Γ (θ i ) Γ K

(∑ θ ) , K

(A4)

i =1 i

and 1 F1 ( a,b, c ) is the confluent hypergeometric function. Let H be the set of favored alleles, and N the set of disfavored alleles. The numbers of alleles in the two classes are KH and KN. Define h = { fi / F : i ∈H } , g = { fi / (1 − F ) : i ∈N } , θ( H ) = {θ i : i ∈H } and θ( N ) = {θ i : i ∈N } . By making the

change of variables ( f ) → ( F, g, h ) , Equation A3 can be factorized so that

⎡ eγ F F Θ H −1 (1 − F )Θ N −1 ⎤ p ( F, g, h ) = ⎢ ⎥ ⎢⎣ Β( Θ H ,Θ N ) 1 F1 ( Θ H ,Θ, γ ) ⎥⎦ ⎡ 1 ×⎢ ⎢ Β θ( N ) ⎣

KN

( )∏ i =1

(N )

θi i

g

−1

⎤⎡ 1 ⎥⎢ ⎥ ⎢ Β θ( H ) ⎦⎣

KH

( )∏ i =1

(H )

θi i

h

−1

⎤ ⎥, ⎥ ⎦

(A5)

where Θ N = Θ − ΘH . This demonstrates that F follows a biallelic Wright-Dirichlet distribution with parameters (ΘH, ΘN, γ), and g and h follow independent Dirichlet distributions with parameters θ( N ) and θ( H ) respectively. The Dirichlet distribution arises from the neutral case, and this factorization suggests that in a hot-or-not model, evolution within class H or N can be characterized as neutral. This in turn supports our assumption (see main text) that, in the low-mutation limit, the probability of fixation of

Text S1

2

allele A equals the fixation probability for the whole class multiplied by the neutral fixation probability for allele A within its class. To condition the stationary distribution on identity of the ancestral allele, suppose allele A is ancestral. In a recurrent selection model, the ancestral allele is disfavored. Thus from Equations A5 and the form of Equation 11 where A is disfavored,

1− e ( ))F (1 − F ) g ∏ ( p ( F, g, h | A ) = ( ) )ϕ (1 − ϕ ) dϕ ∫ ζ ∏ ∫ (1 − e − γ 1− F

Θ N −1

Θ H −1

A

1

− ϕ 1− F

Θ H −1

Θ N −1

KN

A

(1 − e =

)F

i =1

ΦK N

0

KN i =1

(N )

(N )

giθi

(N )

ζ iθi

−1

∏

−1

dζ

Θ H −1

(1 − F )Θ

N

−1

KN

KH

(H )

( ) ( )

(H )

hiθi

∫∏

ΦK H

gA ∏ i =1 giθi −1 ∏ i =1 hiθi θ Β( Θ H ,Θ N ) ⎡⎣1 − 1 F1 ( Θ N ,Θ, −γ ) ⎤⎦ A Β θ( N ) Β θ( H ) ΘN − γ (1− F )

KH i =1

−1

KH i =1

(H )

ηiθi

−1

dη

−1

.

(A6) Reversing the change-of-variables,

(

)

− γ (1− F ) f θi −1 ∏ fA Θ N 1 − e i =1 i p ( f | A) = . 1 − F θ A Β( θ ) ⎡⎣1 − 1 F1 ( Θ N ,Θ, −γ ) ⎤⎦ K

(A7)

Assuming that alleles are sampled at random with replacement from the population, we can utilize the multinomial distribution to obtain the conditional likelihood for a sample of size n comprising xi copies of allele i, given that the ancestral allele is A. p (x | A) =

⎛ n ⎞ K xi fi p ( f | A ) d f ∫ ⎜⎝ x⎟⎠ ∏ i =1 ΦK

⎛ n ⎞ ( x + θ A ) Θ N Β( x + θ ) ⎡⎣1 − 1 F1 ( X N + Θ N , n + Θ, −γ ) ⎤⎦ =⎜ ⎟ A , ⎝ x ⎠ θ A ( X N + Θ N ) Β( θ ) ⎡⎣1 − 1 F1 ( Θ N ,Θ, −γ ) ⎤⎦

Text S1

(A8)

3

K ⎛ n⎞ 1 where XN is the number of alleles sampled in class N and ⎜ ⎟ = n!∏ . Note that ⎝ x⎠ i =1 xi !

Equations A7 and A8 are equivalent to Equations 12 and 13 in the main text, using the identities FA = 1 − F , X A = X N , and Θ A = Θ N .

References 1. Donnelly P, Nordborg M, Joyce P (2001) Likelihoods and simulation methods for a class of nonneutral population genetics models. Genetics 159: 853-867. 2. Erdéyli A (1939) Integration of certain systems of linear partial differential equations of hypergeometric type. Proc Roy Soc Edin A 59: 224-241. 3. Phillips PCB (1988) The characteristic function of the Dirichlet and multivariate F distributions. Cowles Foundation Discussion Paper No. 865.

Text S1

4