Microbial Genomics

2 downloads 0 Views 1MB Size Report
Helen Bernard. Simon le ..... Deng, X., Desai, P.T., den Bakker, H.C., Mikoleit, M., Tolar, B., Trees, E., Hendriksen, R.S., Frye, J.G.,. 347 .... Lane, C.R., LeBaigue, S., Esan, O.B., Awofisyo, A.A., Adams, N.L., Fisher, I.S.T., Grant, K.A., Peters,. 392.
Microbial Genomics Phylogenetic structure of European Salmonella Enteritidis outbreak correlates with national and international egg distribution network --Manuscript Draft-Manuscript Number:

MGEN-D-16-00014R3

Full Title:

Phylogenetic structure of European Salmonella Enteritidis outbreak correlates with national and international egg distribution network

Article Type:

Short Paper

Section/Category:

Microbial evolution and epidemiology: Communicable disease genomics

Corresponding Author:

Tim Dallman Public Health England UNITED KINGDOM

First Author:

Tim Dallman

Order of Authors:

Tim Dallman Thomas Inns Thibaut Jombart Philip Ashton Nicolas Loman Carol Chatt Ute Messelhaeusser Wolfgang Rabsch Sandra Simon Sergejs Nikisins Helen Bernard Simon le Hello Nathalie Jourdan da-Silva Christian Kornschober Joel Mossong Peter Hawkey Elizabeth de Pinna Kathie Grant Paul Cleary

Abstract:

Outbreaks of Salmonella Enteritidis have long been associated with contaminated poultry and eggs. In the Summer of 2014 a large multi-national outbreak of Salmonella Enteritidis phage type 14b occurred with over 350 cases reported in the United Kingdom, Germany, Austria, France and Luxembourg. Egg supply network investigation and microbiological sampling identified the source to be a Bavarian egg producer. As part of the international investigation into the outbreak, over 400 isolates were sequenced including isolates from cases, implicated UK premises, and eggs from the suspected source producer. We were able to show a clear statistical correlation between the topology of the UK egg distribution network and the phylogenetic network of outbreak isolates. This correlation can most plausibly be explained by different parts of the egg distribution network being supplied by eggs solely from independent premises of the Bavarian egg producer (Company X). Microbiological sampling from the source premises, traceback information and information on the interventions carried out at the egg production premises all supported this conclusion. The level of Downloaded from www.microbiologyresearch.org by 104.249.141.12 Powered by Editorial Manager® and IP: ProduXion Manager® from Aries Systems Corporation On: Fri, 27 May 2016 13:52:51

insight into the outbreak epidemiology provided by WGS would not have been possible using traditional microbial typing methods.

Downloaded from www.microbiologyresearch.org by 104.249.141.12 Powered by Editorial Manager® and IP: ProduXion Manager® from Aries Systems Corporation On: Fri, 27 May 2016 13:52:51

Manuscript Including References (Word document)

Click here to download Manuscript Including References (Word document) MGen_dallman_short_revision.docx

Short Paper template

1 2 3

Phylogenetic structure of European Salmonella Enteritidis outbreak correlates with national and international egg distribution network

4 5 6

ABSTRACT

7 8

Outbreaks of Salmonella Enteritidis have long been associated with contaminated poultry and eggs.

9

In the summer of 2014 a large multi-national outbreak of Salmonella Enteritidis phage type 14b

10

occurred with over 350 cases reported in the United Kingdom, Germany, Austria, France and

11

Luxembourg. Egg supply network investigation and microbiological sampling identified the source to

12

be a Bavarian egg producer. As part of the international investigation into the outbreak, over 400

13

isolates were sequenced including isolates from cases, implicated UK premises, and eggs from the

14

suspected source producer. We were able to show a clear statistical correlation between the

15

topology of the UK egg distribution network and the phylogenetic network of outbreak isolates. This

16

correlation can most plausibly be explained by different parts of the egg distribution network being

17

supplied by eggs solely from independent premises of the Bavarian egg producer (Company X).

18

Microbiological sampling from the source premises, traceback information and information on the

19

interventions carried out at the egg production premises all supported this conclusion. The level of

20

insight into the outbreak epidemiology provided by Whole Genome Sequencing (WGS) would not

21

have been possible using traditional microbial typing methods.

22 23 24

DATA SUMMARY

25 26

FASTQ sequences were deposited in the NCBI Short Read Archive under the BioProject PRJNA248792

27

(http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA248792)

28

Supplementary material is available at the following git repository

Downloaded from www.microbiologyresearch.org by IP: 104.249.141.12 On: Fri, 27 May 2016 13:52:51

29 30

https://github.com/timdallman/sent_14b.git

31 32 33

I/We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. 

34



35 36 37

IMPACT STATEMENT

38 39

In this article we show how the phylogenetic relationships between isolates in a foodborne outbreak

40

can be informative in revealing underlying epidemiological trends.

41 42

We were able to show a clear statistical correlation between the topology of the UK cases egg

43

distribution network and the phylogenetic network of outbreak isolates. This indicated that the

44

phylogeny clustered into distinct clades related to the source of eggs.

45 46

This study shows the benefit of whole genome sequencing of pathogens in revealing the true

47

epidemiology behind an outbreak allowing inference about source diversity and food-chain

48

contamination.

49 50 51

INTRODUCTION

52 53

Salmonella enterica serovar Enteritidis outbreaks in humans are often linked to contaminated

54

foodstuffs produced by the poultry industry (Harker et al., 2014; Lane et al., 2014). The incidence of

55

Salmonella Enteritidis in the United Kingdom and in Europe has decreased significantly following the

56

implementation of a vaccination program and other control measures in chicken flocks(Poirier et al.,

57

2008), but outbreaks associated with contaminated eggs continue to occur(Hugas and Beloeil, 2014).

58

Downloaded from www.microbiologyresearch.org by IP: 104.249.141.12 On: Fri, 27 May 2016 13:52:51

59

In 2014 a large multi-national outbreak of Salmonella Enteritidis phage type 14b was linked to

60

consumption of eggs (Inns et al., 2015). Over 350 cases were reported in the United Kingdom,

61

Germany, Austria, France and Luxembourg. Egg supply network investigation and microbiological

62

sampling identified the source to be a Bavarian egg producer.

63 64

Whole genome sequencing (WGS) is increasingly used for surveillance of food-borne

65

pathogens(Dallman et al., 2015b; Quick et al., 2015)(Jenkins et al., 2015) and prospective typing of

66

Salmonella Enteritidis isolates can identify possible outbreaks in real-time(den Bakker et al., 2014;

67

Deng et al., 2014). S. Enteritidis outbreaks can be investigated using WGS as part of the case

68

definition, as strains of a given outbreak are typically monophyletic (i.e. representing a single

69

evolutionary pathway), with limited diversity between outbreak isolates(Deng et al., 2014)(Wuyts et

70

al., 2015)(Taylor et al., 2015). WGS has however revealed significant polyclonal contamination

71

within chicken production farms (Allard et al., 2013).

72 73

Phylogenetic methods that explore the relationships between microbial genomes have been used to

74

study the emergence(Dallman et al., 2015a; Holt et al., 2012), geographical diffusion(Baker et al.,

75

2015; He et al., 2013) and transmission of infections(Bryant et al., 2013; Eyre et al., 2013).

76

Phylogenetic topologies can also be informative in terms of source attribution in outbreak

77

investigations(Harris et al., 2013). In this study we explore the relationships between the observed

78

phylogeny of 400 clinical, environmental and food isolates obtained in this outbreak and the

79

distribution network of the implicated foodstuff, as ascertained by national and international

80

traceback investigations.

81 82 83

METHODS

84 85

Strains

86 87

Since April 2014 all presumptive isolates of Salmonella in England and Wales received by the

88

Gastrointestinal Bacteria Reference Unit at Public Health England (PHE) have undergone WGS. As of

89

1st August 2015, 3,844 sequences of Salmonella enterica serovar Enteritidis had been analysed for

Downloaded from www.microbiologyresearch.org by IP: 104.249.141.12 On: Fri, 27 May 2016 13:52:51

90

routine surveillance purposes. Forty-four isolates from Germany, France, Austria and Luxembourg

91

were sequenced as part of this investigation with the inclusion criteria as follows; clinical isolates

92

belonging to phage type 14b, clinical isolates with matching multi-locus variable number tandem

93

repeat analysis (MLVA) profile 2-12-7-3-2, and implicated food or environmental samples. Of all the

94

isolates sequenced from the UK and mainland Europe, 401 strains of Salmonella Enteritidis matched

95

the outbreak SNP address 1.2.3.38.38.38 (Ashton et al., 2015), these included all the isolates

96

previously described by Inns et al. (2015). The strains are described in supplementary table 1.

97 98

Food chain investigations

99 100

Rapid Alert System for Food and Feed (RASFF) notifications were issued on 09 July 2014 (France), 31

101

July 2014 (Austria) and 1 August 2014 (France), which linked S. Enteritidis outbreaks in France and

102

Austria to chicken eggs from Company X in Germany. Company X had four separate premises, three

103

in Germany and one in Czech Republic; all are operationally independent. All four sites used young

104

chickens (pullets) from two locations: one in Germany and one in the Czech Republic. Food supply

105

network investigations involved obtaining information on the supply of eggs from Company X to UK

106

distributors and tracing onward supply to other UK companies. In addition, supply network

107

investigations were conducted in England to trace supplies of chicken and chicken eggs consumed by

108

cases to their source as described in Inns et al. (2015). In total, 198 of the 287 (69%) confirmed UK

109

cases could be plausibly linked to eggs supplied by one company, Company X, with no traceback

110

information available for the other cases.

111 112

Sequencing

113 114

Sequencing was performed by the PHE Genome Sequencing Unit using Nextera library preparation

115

on the Illumina HiSeq 2500 run in fast mode according to the manufacturers’ instructions, which

116

yielded 2 X 100 base pair paired end reads. At the Robert Koch Institute, University of Birmingham

117

and Institut Pasteur, libraries from genomic DNA were created using the Nextera library preparation

118

kit and subsequently run on the Illumina MiSeq sequencer using Illumina’s v3 Reagent Kit to produce

119

2 x 300 base pair paired end reads.

120

Downloaded from www.microbiologyresearch.org by IP: 104.249.141.12 On: Fri, 27 May 2016 13:52:51

121

High quality Illumina reads were mapped to the Salmonella enterica Enteritidis reference genome

122

(GenBank:AM933172) using BWA-MEM(Li and Durbin, 2010). Single nucleotide polymorphisms

123

(SNPs) were then identified using GATK2(McKenna et al., 2010) in unified genotyper mode. Core

124

genome positions that had a high quality SNP (>90% consensus, minimum depth 10x, GQ >= 30) in at

125

least one strain were extracted and RaxML v8.17 (Stamatakis, 2014) used to derive the maximum

126

likelihood phylogeny of the isolates under the GTRCAT model of evolution. Single linkage SNP

127

clustering was performed as previously described (Ashton et al., 2015). FASTQ reads from all

128

sequences in this study can be found at the PHE Pathogens BioProject at the National Center for

129

Biotechnology Information (Accession PRJNA248792).

130 131

Timed phylogenies

132 133

Timed phylogenies were constructed using BEAST-MCMC v1.80(Drummond et al., 2012) after first

134

removing regions of the genome predicted to have undergone recombination using Gubbins

135

v1.3(Croucher et al., 2015). Alternative clock models and population priors were computed and

136

assessed based on Bayes Factor (BF) tests using Tracer v1.6. The highest supported model was a

137

relaxed lognormal clock rate under a constant population size. All models were run with a chain

138

length of 1 billion. A maximum clade credibility tree was constructed using TreeAnnotator

139

v1.75(Drummond et al., 2012).

140 141

Network Comparison

142 143

Distances on the distribution network were measured as the number of intermediate nodes on the

144

shortest path between a pair of cases. For instance, the distance between two cases infected in the

145

same restaurant was one. The genetic distance between isolates of two different cases was

146

measured as the patristic distances between the corresponding tips of the phylogeny. A Monte-

147

Carlo Mantel test (Mantel, 1967) was used to investigate the degree of correlation between the

148

supply network and genetic distance matrices, using 167 cases that were both sequenced and

149

documented on the distribution network, with 9,999 random permutations of the data. Because the

150

relationship may be driven by cases infected by the same source, we also tested correlations

151

between distances based on pairs of cases with different sources of infection only.

Downloaded from www.microbiologyresearch.org by IP: 104.249.141.12 On: Fri, 27 May 2016 13:52:51

152 153

In addition, we accounted for the potential bias stemming from the existence of distinct genetic

154

clades in the sampled isolates using a partial Mantel test (Legendre and Legendre, 2012). In this

155

analysis, a linear regression is used to predict patristic distances as a function of a binary clade

156

membership distance (0: same clade; 1: different clades) and distances on the food distribution

157

network. Effects of each covariate, and their interaction, was tested using the classical ANOVA

158

framework (Legendre and Legendre, 2012).

159 160

All analyses were carried out in the R software(Team, 2014), using the packages igraph(Csardi and

161

Nepusz, 2006) for graph distances, adephylo (Jombart et al., 2010) for phylogenetic distances

162

computations and ade4(Dray et al., 2007, p. 4) for the Mantel test.

163 164 165

RESULTS

166 167

The phylogeny of 401 isolates implicated in the outbreak resolved into three clades all supported

168

with bootstraps > 95 (Figure1a). The isolates formed a single 5 SNP single linkage cluster with a

169

maximum distance between any two genomes of 23 SNPs. Within England, the outbreak consisted

170

of five point source, geographically distinct incidents and 101 sporadic cases. Isolates from the point

171

source outbreaks resolved into distinct sub-clades in the outbreak phylogeny with a maximum SNP

172

distance of 2 (mode of 0). Supply network information plausibly linked 198 cases in England to eggs

173

supplied by one company, Company X (Figure1b). Multiple trace-back pathways to the implicated

174

source of infection were identified (Inns et al., 2015).

175 176

According to the Monte-Carlo Mantel test, genetic distances between isolates significantly increased

177

with the distance (number of intermediate nodes) on the distribution network (r=0.62,p=0.0001).

178

This relationship remained significant when considering only pairs of cases from different direct

179

exposures (r=0.46; t-test: p=2.2x10-16) and was robust to non-linearities between distances

180

(Spearman rho=0.60, p=2.2x10-16). This relationship also remained significant when accounting for

181

the existence of distinct genetic clades (ANOVA: F=18,023; p