Microbial Genomics

2 downloads 0 Views 8MB Size Report
Genomic characterization of Mycobacterium tuberculosis lineage 7 and a proposed ..... africanum lineage 6, M. mungi, M. orygis, M. bovis, etc.) due to two ...
Microbial Genomics Genomic characterization of Mycobacterium tuberculosis lineage 7 and a proposed name: "Aethiops vetus" --Manuscript Draft-Manuscript Number:

MGEN-D-16-00001R1

Full Title:

Genomic characterization of Mycobacterium tuberculosis lineage 7 and a proposed name: "Aethiops vetus"

Article Type:

Research Paper

Section/Category:

Microbial evolution and epidemiology: Phylogeography

Corresponding Author:

Hanna Nebenzahl-Guimaraes, MPH Rijksinstituut voor Volksgezondheid en Milieu NETHERLANDS

First Author:

Hanna Nebenzahl-Guimaraes, MPH

Order of Authors:

Hanna Nebenzahl-Guimaraes, MPH Solomon A Yimer Carol Holm-Hansen Jessica de Beer Roland Brosch Dick van Soolingen

Abstract:

Introduction: Lineage 7 of the Mycobacterium tuberculosis complex has recently been identified among strains originating from Ethiopia. Using different DNA typing techniques, this study provides additional information on the genetic heterogeneity of five lineage 7 strains collected in the Amhara Region of Ethiopia. Case Presentation: The results confirm the phylogenetic positioning of these strains between the ancient Lineage 1 and TbD1-deleted, modern Lineages 2, 3 and 4 of Mycobacterium tuberculosis. Four newly identified large sequence polymorphisms characteristic of the Amhara Region lineage 7 strains are described. While lineage 7 strains have been previously identified in the Woldiya area, we show that lineage 7 strains circulate in other parts of the Amhara Region and also among foreign-born individuals from Eritrea and Somalia in The Netherlands. Conclusion: For ease of documenting future identification of these strains in other geographical locations and recognizing the place of origin, we propose to assign lineage 7 strains the lineage name: "Aethiops vetus".

Downloaded from www.microbiologyresearch.org by 204.44.112.220 Powered by Editorial Manager® and IP: ProduXion Manager® from Aries Systems Corporation On: Mon, 16 May 2016 06:39:41

Manuscript Including References (Word document)

1 2 3

Click here to download Manuscript Including References (Word document) Microbial_Genomics_Revised_Manuscript.docx

Genomic characterization of Mycobacterium tuberculosis lineage 7 and a proposed name: “Aethiops vetus”

4 5 6 7 8

ABSTRACT

9

Lineage 7 of the Mycobacterium tuberculosis complex has recently been

10

identified among strains originating from Ethiopia. Using different DNA typing

11

techniques, this study provides additional information on the genetic

12

heterogeneity of five lineage 7 strains collected in the Amhara Region of

13

Ethiopia. It also confirms the phylogenetic positioning of these strains between

14

the ancient Lineage 1 and TbD1-deleted, modern Lineages 2, 3 and 4 of

15

Mycobacterium

16

polymorphisms characteristic of the Amhara Region lineage 7 strains are

17

described. While lineage 7 strains have been previously identified in the Woldiya

18

area, we show that lineage 7 strains circulate in other parts of the Amhara

19

Region and also among foreign-born individuals from Eritrea and Somalia in The

20

Netherlands. For ease of documenting future identification of these strains in

21

other geographical locations and recognizing the place of origin, we propose to

22

assign lineage 7 strains the lineage name: “Aethiops vetus”.

tuberculosis.

Four

newly

identified

large

sequence

23 24 25

DATA SUMMARY

26 27 28

Reads and assembled scaffolds have been deposited in the European Nucleotide Archive (study accession number: PRJEB8432). (url http://www.ebi.ac.uk/ena/data/view/PRJEB8432)

29 1 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

30 31 32

I/We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. ☒

33 34 35

IMPACT STATEMENT

36 37 38 39 40 41 42 43 44 45 46 47 48 49

This article communicates novel insights gained from genomic data on the new lineage 7 lineage of Mycobacterium tuberculosis (Mtb), namely to describe its positioning relative to other Mtb strains and specific variations (SNPs, insertions and deletions) that distinguish it from other lineages and (sub) species in the complex. This novel information will contribute towards the burgeoning awareness of this lineage that is of considerable evolutionary interest due to its intermediary positioning between ancient and modern lineages. Further investigation of differences in virulence and immunogenicity among Mtb major lineages, including lineage 7, is warranted to elucidate the pathogenesis and clinical consequences of TB disease. For ease of documenting the global epidemiology and anticipating future trends, we propose to name the lineage Aethiops vetus, which reflects the ancient origin of these strains in the Horn of Africa.

50 51 52

INTRODUCTION

53 54

A new lineage within the Mycobacterium tuberculosis complex (MTBC) has

55

recently been identified among strains originating from Ethiopia (Firdessa et al.

56

2013; Blouin et al. 2012). This lineage, named lineage 7, is of considerable

57

interest regarding evolutionary research as it represents a phylogenetic branch

58

intermediate between the ancient and modern lineages of Mycobacterium

59

tuberculosis (Mtb). Further characterization of this lineage may provide insight

60

as to where, when and how Mtb evolved, as well as what influence human hosts

61

had on the development of these bacteria. This is important for a better

62

understanding of Mtb adaptation to the human host, and the highly developed

63

ability of the bacterium to evade the immune system and spread extensively in 2 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

64

the human population. Characterization is thus essential to elucidate

65

pathogenicity, which underlies vaccine development, to document the global

66

epidemiology of tuberculosis (TB), and anticipate future trends in the spread of

67

the disease.

68 69

In this study we explore the positioning of this new lineage relative to other Mtb

70

strains through different DNA typing techniques, and identify the variations,

71

large sequence polymorphisms (LSPs) and smaller deletions, which distinguish

72

it from other groupings in the complex. In addition, we describe lineage 7 strains

73

detected in The Netherlands where the majority of the patients are foreign-born.

74

Lastly, to document the place where lineage 7 strains have been identified and

75

possibly originate, we suggest naming these strains ‘Aethiops vetus’.

76 77 78

METHODS

79 80

Six MTBC isolates were cultivated from sputum samples collected from newly

81

diagnosed pulmonary TB (PTB) patients attending selected health centers and

82

hospitals in various districts of Amhara Region, Ethiopia. The isolates were part

83

of 237 Mtb isolates collected between 2008 and 2010 for the TB Rapid Test

84

project at the Norwegian Institute of Public Health (NIPH). Bacteriological and

85

molecular characterization of the strains was performed in Ethiopia, Norway and

86

The Netherlands. Prior to sputum collection, socio-demographic and clinical

87

data were obtained using a structured questionnaire. HIV testing was also

88

performed on blood samples obtained from study participants following the

89

routine procedure in Ethiopia.

90 91

Sputum smear microscopy, culture and drug susceptibility testing

92

Three consecutive sputum samples, spot-morning-spot (Rao 2009), were

93

provided by study participants. Smear microscopy using Ziehl-Neelsen staining 3 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

94

was performed to identify acid fast bacilli. Pooled sputum samples were

95

transported to the Armauer Hansen Research Institute (AHRI) in Addis Abeba,

96

Ethiopia for culture. Sputum culture was performed using the conventional

97

Löwenstein–Jensen (LJ) culture media method (Christie & Callihan 1995). Drug

98

susceptibility testing (DST) was conducted at the Ethiopian Health and Nutrition

99

Research Institute (ENHRI), the national TB reference laboratory in Addis

100

Abeba, and at AHRI. DST was performed by the proportional absolute

101

concentration technique (Van Klingeren et al. 2007) and the MGIT

102

Mycobacterium

103

manufacturer’s instructions (Siddiqi, Rüsch-Gerdes 2006).

Growth

Indicator

Tube

(MGIT)

method

following

the

104 105

DNA extraction and spoligotyping

106

Deoxyribonucleic acid (DNA) was extracted from Mtb isolates according to

107

standard procedure (Zwadyk et al. 1994). Species identification by polymerase

108

chain reaction (PCR) targeting the RD9 (region of difference 9) was performed

109

as previously described (Parsons et al. 2002). Genomic regions of difference

110

(RDs) denote large deletions spanning several genes and are used for the

111

differentiation of members of the MTBC (Warren et al. 2006).

112

DNA of the Mtb isolates was subjected to spoligotyping at AHRI according to

113

Kamerbeek et al. 1997, and the results and DNA aliquots were sent to the

114

Tuberculosis Reference Laboratory at the National Institute for Public Health

115

and the Environment (RIVM) in The Netherlands for quality control. The results

116

indicated that the spoligotyping performed at AHRI was in agreement with the

117

findings at RIVM (Yimer et al. 2013). The International Spoligotype Database

118

(SpolDB4, 2011) was used to assign the spoligotyping patterns to major genetic

119

lineages and sub-lineages (Brudey et al. 2006). New spoligo patterns that were

120

not described by SpolDB4 were assigned to Mtb families by SpolDB3 and a

121

randomly initialized model (Run SPOTCLUST, 2011).

122 123

MIRU-VNTR typing

4 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

124

Mycobacterial interspersed repetitive units (MIRU) variable number of tandem

125

repeats (VNTR) typing was performed at NIPH in Oslo, Norway to improve the

126

discrimination of Mtb isolates and to compare the results with the extended

127

collection of VNTR profiles of strains collected within The Netherlands. VNTR is

128

based on the identification of 24 of the most polymorphic microsatellite

129

sequences (55-72 nucleotides in length) in the Mtb genome (Supply et al. 2006).

130

The findings are presented as a set of digits that indicate the number of tandem

131

repeat sequences at each of the 24 loci (Alonso-Rodríguez et al. 2008).

132 133

Sequencing, assembly and variant calling

134

Among the six isolates transferred to RIVM for whole genome sequencing

135

(WGS), one (Solo 80) could not be sub cultured despite repeated attempts. DNA

136

was thus extracted from five isolates using standard methods. WGS was

137

successfully performed by BaseClear BV (Leiden, The Netherlands) on an

138

Illumina HiSeq 2500 instrument using reads of 50 bp in length in the paired-end

139

modus. The average genome coverage was approximately 60-fold. The FASTQ

140

sequence reads were generated using the Illumina Casava pipeline version

141

1.8.3. Initial quality assessment was based on data passing the Illumina Chastity

142

filtering. Subsequently, reads containing adapters and/or PhiX control signals

143

were removed using an in-house filtering protocol. The second quality

144

assessment was based on the remaining reads using the FASTQC quality

145

control tool version 0.10.0. The quality of the FASTQ sequences was enhanced

146

by trimming off low-quality bases using the “Trim sequences” option of the CLC

147

Genomics Workbench version 6.5. The quality-filtered FASTQ sequences were

148

assembled using Ray, and the resulting contigs were mapped to both H37Rv

149

(GenBank accession NC_000962.3) and M. africanum (GenBank accession

150

FR878060.1). In parallel, single nucleotide polymorphisms (SNPs) and INDELS

151

(insertions or deletions) were called using Breseq software (version 0.23) with a

152

minimum depth of 15x (Barrick et al. 2009). SNPs with low-quality evidence (i.e.

153

possible mixed read alignment) or within 5 base pairs (bp) of an INDEL

154

(insertion or deletion) were discarded. Reads and assembled scaffolds are

155

deposited in the European Nucleotide Archive (study accession number: 5 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

156

PRJEB8432).

157 158

Phylogenetic tree construction and comparison of sequences

159

A parsimony-based phylogeny (Phylip, version 3.695) of the MTBC was

160

constructed using 78,613 concatenated variable positions (SNPs) that were

161

found in at least one of 100 previously characterized RIVM strains of different

162

lineages and the five strains from Ethiopia in relation to the outgroup strain

163

Mycobaterium canettii CIPT 140070010 (strain STB-K - GenBank accession

164

NC_019951.1). This strain was previously determined to be the most

165

genomically distant relative to the core members of the MTBC (Supply et al.,

166

2013; Blouin et al., 2014; Boritsch et al., 2016). Two additional lineage 7 strains

167

(Percy 256 and Percy 556) from an independent study were added to this tree

168

(Blouin et al. 2012). The Artemis Comparison Tool (ACT) (Carver et al, 2008)

169

was used to perform pairwise comparison between the different Ethiopian

170

sequences, Mtb H37Rv and M. africanum.

171 172 173

RESULTS

174 175

Socio-demographic information

176

The five sequenced isolates were obtained from two female and three male

177

study participants ranging in age from 15 to 50 years (mean age 32 years). All

178

patients came from rural areas from 5 different districts of the Amhara Region

179

(Fig. 1).

180 181

Lineage 7 strains detected in The Netherlands

182

Isolates from three patients, two males and one female, originating from

183

Somalia, Ethiopia and Eritrea, were previously collected in the Netherlands and

184

identified as belonging to the new lineage 7. The three lineage 7 isolates were 6 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

185

exceptional among the 3,867 unique MIRU profiles stored in The Netherlands

186

between 1993 and 2011. During this period, approximately 72% of the patients

187

were of foreign origin; 49% from Africa, 32% from Asia, and 11% from North or

188

South America. Two of the lineage 7 strains were isolated in 2008, one in 2009,

189

and all were sensitive to the 5 first-line antibiotics. Bacilli were isolated from

190

positive sputum samples and the patients had only pulmonary manifestation.

191

Contact investigation was performed for one case and one positive contact was

192

identified (confirmed by thorax X-ray). All three were treated with the standard

193

chemotherapeutic regimen and had a successful outcome.

194 195

VNTR characterization of lineage 7 strains

196

The 24-locus VNTR typing patterns (the complete variation of patterns) of the

197

eight lineage 7 strains (five isolated in Ethiopia and three in the Netherlands) are

198

presented in Table 1. The similarity in VNTR patterns between the three lineage

199

7 strains found in The Netherlands with two of the present lineage 7 strains from

200

Ethiopia ranged from 79.2% to 95.8% (average pairwise similarity of 86.1%).

201 202

SNP characterization of five Ethiopian strains

203

The five Ethiopian strains shared an average of 940 SNPs (range 888-986) and

204

differed by an average of 183 SNPs (range 113-265) by pairwise comparisons.

205

In the SNP-based phylogenetic tree, the five Ethiopian isolates clustered into

206

one unique branch located between the ancient lineage 1 (East African Indian

207

strains) and modern lineages 2, 3, and 4 (Beijing, Central Asian Strain (CAS)

208

and Euro-American) (Fig. 2).

209 210

LSP characterization of five Ethiopian strains by presence/absence of RDs and

211

newly identified deletions

212

7 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

213

All five Ethiopian strains contain all the genomic regions that are absent from M.

214

bovis BCG Pasteur (RD1 through RD9), with the exception of the RD3 region

215

(corresponding to the prophage region PhiRv1) and RD11 (corresponding to the

216

prophage region PhiRv2) (Fig. 3). The RD6 region was not tested as it contains

217

Insertion Sequence (IS) elements. While absent in M. tuberculosis H37Rv, the

218

TbD1 region, RvD1-RvD5 and RD239 (or LSP 239) as described by Brosch et

219

al., 2002 and Gagneux et al. 2006 were conserved in the five Ethiopian strains.

220

Figure 3 visually summarizes these observations and shows the positioning of

221

the five Ethiopian strains within the MTBC.

222 223

Four additional specific deletions (indicated as Δ1–Δ4 in Fig. 3) were found

224

across all five strains: a 10bp deletion in the mmpl9 gene, a 27bp deletion in the

225

lppH gene (suppl. Fig. 1A & B), a 1.3kb fragment spanning genes lppO/sseB

226

(suppl. Fig. 2), and a fragment of 3.3kb spanning the genes rv3467/rmlB2/mhpE

227

(suppl. Fig. 3). All four deletions were also confirmed in the lineage 7 Percy256

228

strain (Blouin et al. 2012). Finally, in the region encompassing RD7 in Mtb

229

H37Rv, the five Ethiopian strains present an extra 4.4kb fragment that is absent

230

from Mtb lineage 4 strains as well from all RD7-deleted MTBC members (M.

231

africanum lineage 6, M. mungi, M. orygis, M. bovis, etc.) due to two independent

232

deletion events (suppl. Fig. 4). However, this region seems to be present in all

233

other tubercle bacilli (suppl. Fig. 4), including the M. canettii strains, in which the

234

genes

235

MCAN_19951, MCAN_19961 (Supply et al. 2013). Similarly, another genomic

236

region of approximately 5kb in the proximity of the pknH gene was present in the

237

five Ethiopian strains and in M. canettii. This region is absent from Mtb H37Rv

238

(suppl. Fig. 5) and many other Mtb strains of the modern lineages. An overview

239

of what is known to date about the function of the genes/regions where these

240

specific deletions take place is presented in suppl. Table 1. All additional smaller

241

INDELs found amongst the five Ethiopian strains, mostly of one base pair, are

242

listed in suppl. Table 2.

correspond

to

the

orthologs

of

MCAN_19931,

MCAN_19941,

243 244 8 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

245

DISCUSSION

246 247

Mycobacterium tuberculosis, a member of the MTBC, has been a significant

248

cause of morbidity and mortality among humans for thousands of years.

249

Phylogenetic studies have shown that the MTBC comprises at least seven

250

distinct lineages distributed in different geographical locations around the world.

251

Current evidence indicates that the Horn of Africa is the most likely origin of

252

MTBC from which it has spread to other parts of the world (Comas et al. 2013).

253 254

Ethiopia is a country in the Horn of Africa with an estimated TB incidence of 224

255

cases/100,000 population (Global Tuberculosis Report, 2014). Molecular typing

256

of isolates from various regions indicates that all major MTBC lineages are

257

represented among TB patients in Ethiopia. A recent study conducted in the

258

Amhara Region of Ethiopia identified a high proportion of “Unknown” spoligotype

259

patterns (SIT910 and SIT1729) in addition to strains not previously described in

260

the database (Yimer et al. 2013). These strains have since been included in the

261

recently described Mtb lineage 7.

262 263

The current study focused on the further characterization of five lineage 7

264

strains isolated in different parts of the Amhara Region in Ethiopia, the second

265

most populous region in the country with an estimated 20 million inhabitants.

266

One of the strains matched the SIT910 pattern of lineage 7 Percy556 from

267

Djibouti (Blouin et al. 2012), whilst another matched SIT1729. The results

268

indicate that lineage 7 strains form a separate group as demonstrated by VNTR

269

and SNP typing (Van Klingeren et al. 2007). Both the analysis of RDs and SNPs

270

in our study confirms the positioning of lineage 7 strains between ancient and

271

modern lineages, as proposed by the SNP tree in Firdessa et al. 2013 and

272

Comas et al. 2015. The inclusion of the phylogenetically most distant M. canettii

273

strain STB-K (CIPT 1400700100) (Supply et al., 2013; Blouin et al., 2014;

274

Boritsch et al., 2016) as the outgroup has reinforced the clusters observed

275

among the different Mtb. 9 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

276 277

We found four newly identified deletions characteristic of all five lineage 7

278

strains from the Amhara Region and the Percy256 strain. Our study also

279

provides additional information on the genetic heterogeneity of Mtb lineage 7

280

strains. The strains in our dataset were found to be more heterogeneous

281

(average of 183 SNPs ranging from 113 to 265) than the two Percy (256 and

282

556) strains described in Blouin et al. 2012 that differed by only 8 SNPs. While

283

Firdessa et al. 2013 identified lineage 7 strains in the Woldiya area, we showed

284

that lineage 7 strains also circulate in other parts of the Amhara Region and

285

among foreign-born patients from the Horn of Africa residing in The

286

Netherlands.

287 288

There is a growing interest in conducting phylogeographic studies on MTBC

289

isolates and it is likely that lineage 7 strains may be identified in other parts of

290

the world. Mtb lineage 7 strains have previously been reported in Ethiopia

291

(Firdessa et al. 2013; Yimer et al. 2013, Yimer et al. 2015) and among Ethiopian

292

immigrants in Djibouti (Blouin et al. 2012). However, a specific designation that

293

can distinguish these particular strains from other sub lineages has not been

294

proposed. It might therefore be opportune to designate this lineage “Aethiops

295

vetus”. This name reflects the ancient origin of the new strains identified to date

296

in Ethiopia and among Ethiopian immigrants abroad.

297 298

This geographical restriction is also in accordance with the low representation of

299

lineage 7 strains in The Netherlands. Between 1993 and 2011 there have been

300

2,455 TB cases among foreign-born patients originating from the Horn of Africa

301

in The Netherlands; approximately a third of the strains belong to the East

302

African Indian (EAI) lineage and the other two thirds are either CAS or Euro-

303

American.

304 305

Previous studies showed that the genomic diversity of MTBC has a significant

306

impact on virulence and disease presentation among TB patients (Manca et al. 10 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

307

1999; López-Granados et al. 2003; Hernandez Pando et al. 2010; Sreevatsan et

308

al. 1997). Further investigation of differences in virulence and immunogenicity

309

among MTBC major lineages, including lineage 7 from Ethiopia and the Horn of

310

Africa, is warranted to elucidate the pathogenesis, epidemiology and clinical

311

consequences on TB disease. These studies will provide a better understanding

312

of MTBC evolution and an indication of MTBC evolution under the current

313

treatment regimens.

314 315

Conclusion

316 317

The study confirms through different DNA typing techniques and whole genome

318

sequencing the phylogenetic positioning of Mtb lineage 7 strains between the

319

ancient lineage 1 and the TbD1-deleted, modern lineage 2, 3 and 4 strains. In

320

addition, we show that this new lineage is not widespread in The Netherlands

321

where the majority of TB patients are foreign-born. This may reflect that lineage

322

7 strains are not highly transmissible as compared to modern Mtb strains. To

323

recognize the place of origin we propose to designate the lineage 7 strains as

324

Aethiops vetus. Further studies are warranted to investigate the relevance of

325

this special Mtb strain lineage with respect to clinical disease and transmission

326

as compared to other members of MTBC in Ethiopia and the Horn of Africa.

327 328 329

ACKNOWLEDGEMENTS

330 331

This work was supported by the Research Council of Norway, GLOBVAC

332

programme (grant number192468/S50). RB acknowledges support from the

333

Fondation pour la Recherche Médicale (FRM) DEQ20130326471.

334 335

ABBREVIATIONS

336 11 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

337

MTBC: Mycobacterium tuberculosis complex

338

Mtb: Mycobacterium tuberculosis

339

TB: tuberculosis

340

LSPs: large sequence polymorphisms

341

DST: Drug susceptibility testing

342

MGIT: Mycobacterium Growth Indicator Tube

343

LJ: Löwenstein–Jensen

344 345

REFERENCES

346 347 348 349

Alonso-Rodríguez, N., Martinez-Lirola, M., Herranz, M., Sanchez-Benitez, M., Bouza, E. & de Viedma, DG. (2008). Evaluation of the new advanced 15-loci MIRU-VNTR genotyping tool in Mycobacterium tuberculosis molecular epidemiology studies. BMC microbiology 8, 34.

350 351 352

Barrick, J.E, Su Yu, DS., Yoon, SH., Jeong, H., Oh, TK., Schneider, D., Lenski, RE., Kim, JF. (2009). Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461(7268), 1243–1247.

353 354 355 356

Blouin, Y., Cazajous, G., Dehan, C., Soler, C., Vong, R., Hassan, MO., Hauck Y., Boulais, C., Andriamanantena, D., & other authors (2014). Progenitor “Mycobacterium canettii” clone responsible for lymph node tuberculosis epidemic, Djibouti. Emerging Infectious Diseases 20(1), 21–28.

357 358 359

Blouin, Y., Hauck, Y., Soler, C., Fabre, M., Vong, R., Dehan, C., Cazajous, C., Massoure PL., Kraemer, P., & other authors (2012). Significance of the Identification in the Horn of Africa of an Exceptionally Deep Branching Mycobacterium tuberculosis Clade. PLoS ONE, 7(12):e52841.

360 361 362 363

Brudey, K., Driscoll JR., Rigouts, L., Prodinger, WM., Gori, A., Al-Hajoj, SA., Allix, C., Aristimuno, L., Arora, J., & other authors (2006). Mycobacterium tuberculosis complex genetic diversity: mining the fourth international spoligotyping database (SpolDB4) for classification, population genetics and epidemiology. BMC microbiology, 6, 23.

364 365

Christie, J.D. & Callihan, D.R. (1995). The laboratory diagnosis of mycobacterial diseases. Challenges and common sense. Clinics in laboratory medicine, 15(2), 279–306.

366 367 368

Comas, I.., Coscolla, M., Luo, T., Borrell, S., Holt, KE., Kato-Maeda, M., Parkhill, J., Malla, B., Berg, S, & author authors (2013). Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans. Nature genetics, 45(10), 1176–82.

369 370 371

Firdessa, R., Berg, S., Hailu, E., Schelling, E., Gumi, B., Erenso, G., Gadisa, E., Kiros, T., Habtamu, M., & other authors (2013). Mycobacterial lineages causing pulmonary and extrapulmonary tuberculosis, Ethiopia. Emerging Infectious Diseases 19(3), 460–463.

372 373

Hernandez Pando, R., Aguilar, D., Cohen, I., Guerrero, M., Ribon, W., Acosta, P., Orozco, H., Marquina, B., Salinas, C., & other authors (2010). Specific bacterial genotypes of

12 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

374 375

Mycobacterium tuberculosis cause extensive dissemination and brain infection in an experimental model. Tuberculosis, 90(4), 268–277.

376 377 378 379

Kamerbeek, J., Schouls, L., Kolk, A., van Agterveld, M., van Soolingen, D., Kuijper, S., Bunschoten, A., Molhuizen, H., Shaw, R., & other authors (1997). Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. Journal of Clinical Microbiology, 35(4), 907–914.

380 381 382 383

Van Klingeren, B., Dessens-Kroon, M., van der Laan, T., Kremer, K., van Soolingen, D. (2007). Drug susceptibility testing of Mycobacterium tuberculosis complex by use of a high-throughput, reproducible, absolute concentration method. Journal of Clinical Microbiology, 45(8), 2662– 2668.

384 385 386

López-Granados, E., Cambronero, R., Ferreira, A., Fontan, G., Garcia-Rodriguez, MC. (2003). Three novel mutations reflect the variety of defects causing phenotypically diverse X-linked hyper-IgM syndrome. Clinical and Experimental Immunology, 133(1), 123–131.

387 388 389 390

Manca, C., Tsenova, L., Barry, CE., Bergtold, A., Freeman, S., Haslett, PA., Musser, JM., Freedman, VH., Kaplan, G. (1999). Mycobacterium tuberculosis CDC1551 induces a more vigorous host response in vivo and in vitro, but is not more virulent than other clinical isolates. Journal of immunology (Baltimore, Md. : 1950), 162(11), 6740–6746.

391 392 393

Mariam, S.H. (2009). Interaction between lactic acid bacteria and Mycobacterium bovis in ethiopian fermented milk: Insight into the fate of M. bovis. Applied and Environmental Microbiology, 75(6), 1790–1792.

394 395 396 397

Parsons, L.M., Brosch, R., Cole, ST., Somoskovi, A., Loder, A., Bretzel, G., van Soolingen, D., Hale, YM., Salfinger, M. (2002). Rapid and simple approach for identification of Mycobacterium tuberculosis complex isolates by PCR-based genomic deletion analysis. Journal of Clinical Microbiology, 40(7), 2339–2345.

398 399 400

Rao, S. (2009). Sputum smear microscopy in DOTS: Are three samples necessary? An analysis and its implications in tuberculosis control. Lung India : official organ of Indian Chest Society, 26(1),3– 4.

401 402 403 404

Gagneux, S., DeRiemer, K., van T., Kato-Maeda, M., de Jong, BC., Narayanan, S., Nicol, M., Niemann, S., Kremer, K., & other authors (2006). Variable host – pathogen compatibility in Mycobacterium tuberculosis. Proceedings of the National Academy of Sciences 103(8), 2869– 2873.

405 406

Siddiqi, SH., Rüsch-Gerdes, S., 2006. MGIT Procedure Manual. Geneva, Switzerland: Foundation for Innovative New Diagnostics. July, 2006.

407 408 409 410

Sreevatsan, S., Pan, X., Stockbauer, KE., Connell, KD., Kreiswirth, BN., Whittam, TS., Musser, JM. (1997). Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proceedings of the National Academy of Sciences of the United States of America, 94(18), 9869–9874.

411 412 413 414

Supply, P., Marceau, M., Mangenot, S., Roche, D., Rouanet, C., Khanna, V., Majlessi, L., Criscuolo, A., Tap, J. & other authors (2013). Genomic analysis of smooth tubercle bacilli provides insights into ancestry and pathoadaptation of Mycobacterium tuberculosis. Nature genetics, 45(2), 172-9.

415 416

Supply, P., Allix, C., Lesjean, S., Cardoso-Oelemann, M., Rusch-Gerdes, S., Willery, E., Savine, E., de Haas, P., van Deutekom, H. & other authors (2006). Proposal for standardization of

13 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

417 418

optimized mycobacterial interspersed repetitive unit-variable-number tandem repeat typing of Mycobacterium tuberculosis. Journal of Clinical Microbiology, 44(12), 4498–4510.

419 420 421 422

Warren, RM., Gey Van Pittius, NC., Barnard, M., Hesseling, A., Engelke, E., De Kock, M., Gutierrez, MC., Victor, TC., Hoal, EG. & other authors (2006). Differentiation of Mycobacterium tuberculosis complex by PCR amplification of genomic regions of difference. International Journal of Tuberculosis and Lung Disease, 10(7): 818-822.

423 424 425 426

Yimer, S.A., Hailu, E., Derese, Y., Bjune, GA., Holm-Hansen, C. (2013). Spoligotyping of Mycobacterium tuberculosis isolates among pulmonary tuberculosis patients in Amhara Region, Ethiopia. APMIS : acta pathologica, microbiologica, et immunologica Scandinavica, 121(9), 878–85.

427 428 429

Zwadyk, P., Down, JA, Myers, N., Dey, MS. (1994). Rendering of mycobacteria safe for molecular diagnostic studies and development of a lysis method for strand displacement amplification and PCR. Journal of Clinical Microbiology, 32(9), 2140–2146.

430 431 432 433 434 435 436

DATA BIBLIOGRAPHY

437

Nebenzahl-Guimaraes, H. European Nucleotide Archive. PRJEB8432 (ERP009510). (2015).

438 439 440 441

FIGURES AND TABLES

442 443

Figure Legends

444 445

Fig. 1 Map of Ethiopia and the Amhara Region locating specific sites where the six

446

isolates were collected.

447

Fig. 2 Parsimony-based phylogeny of Mtb using 78,613 concatenated variable

448

positions (SNPs) found in at least one of 100 previously characterized RIVM 14 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

449

strains of different lineages called against M. canettii CIPT 140070010, the

450

genomically most distant strain among the M. canettii taxon (Supply et al., 2013;

451

Blouin et al., 2014; Boritsch et al., 2016),

452 453

Fig. 3 Phylogenetic lineage classification based on the presence or absence of Regions

454

of Difference (RD) and smaller deletions.

455 456

Table 1 24-locus MIRU-VNTR typing patterns of eight lineage 7 strains.

457 458

Supplementary Material Legends

459 460

Suppl. Fig. 1:

461

Aethiops vetus strains. Panel A shows the BLAST comparison of the nucleotide

462

sequence of gene mmpl9 from Aethiops vetus strain 262 used as query

463

sequence in comparison to the orthologous sequence of M. tuberculosis H37Rv

464

(sbjct). Panel B shows pairwise linear genomic comparisons of M. canettii,

465

Aethiops vetus, M. tuberculosis H37Rv, and M. bovis AF2122/97. Red lines

466

indicate collinear blocks of DNA-DNA similarity, whereas the white triangle

467

indicates a specific 27bp deletion in the orthologous lppH region of M.

468

tuberculosis lineage 7 Aethiops vetus.

Identification of small genomic deletions in the genomes of

469 470

Suppl. Fig. 2:

471

concerning the orthologues of genes lppO and sseB. Panel A shows the linear

472

genomic comparisons of M. canettii, M. tuberculosis strains Aethiops vetus,

473

H37Rv, and M. bovis AF2122/97. Red lines indicate collinear blocks of DNA-DNA

474

similarity, whereas the white triangles indicate the specific region absent from

475

Aethiops vetus/lineage 7 strains. Panel B shows the BLASTN results obtained

476

with the Aethiops vetus junction region of strain 262, used as query sequence,

Identification of a specific 1.3 kb deletion in Aethiops vetus,

15 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

477

which confirms the specific 1.3 kb deletion relative to M. tuberculosis H37Rv

478

(sbjct).

479 480

Suppl. Fig. 3:

481

Aethiops vetus strains concerning the orthologues of genes rv3467/rmlB2/mhpE.

482

Panel A shows the linear genomic comparisons of M. canettii, M. tuberculosis

483

Aethiops vetus, H37Rv, and M. bovis AF2122/97. Red lines indicate collinear

484

blocks of DNA-DNA similarity, whereas the white triangles indicate the specific

485

region absent from Aethiops vetus/lineage 7 strains. Panel B shows the BLASTN

486

results obtained with the Aethiops vetus junction region of strain 262, used as

487

query sequence, which confirms the specific 3,331 bp deletion relative to M.

488

tuberculosis H37Rv (sbjct).

Identification of a specific 3.3 kb deletion in M. tuberculosis

489 490

Suppl. Fig. 4:

491

Aethiops vetus, H37Rv, and M. bovis AF2122/97 sequences, covering the region

492

of difference RD7. Red lines indicate collinear blocks of DNA-DNA similarity,

493

whereas the white triangles indicate regions absent from selected strain lineages.

494

The figure shows that in M. canettii and Aethiops vetus the genomic region

495

encompassing orthologues of genes MCAN_19931-19961 are missing from M.

496

tuberculosis H37Rv and other lineage 4 strains. An overlapping portion of this

497

genomic region, located close to the terminus of replication region, was also

498

subject of an independent genomic deletion (RD7) in the progenitor of M.

499

africanum lineage 6 strains and animal strain lineages, including M. bovis.

Linear genomic comparison of M. canettii, M. tuberculosis

500 501

Suppl. Fig. 5:

502

Aethiops vetus, H37Rv, and M. bovis AF2122/97 sequences, covering the region

503

encoding for serine threonine kinase PknH. Red lines indicate collinear blocks of

504

DNA-DNA similarity, whereas the white triangles indicate regions absent from M.

505

tuberculosis H37Rv due to recombination of two pknH copies and deletion of the

506

adjacent gene (orthologue of MCAN_12811). Interestingly, the full pknH region

507

was also described as being present in M. africanum, which argues that the

Linear genomic comparison of M. canettii, M. tuberculosis

16 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

508

recombined region in M. bovis is due to an independent recombination event in

509

the M. africanum-M. bovis lineage, which is also suggested by the sequence

510

differences in the pknH gene between M. tuberculosis H37Rv and M. bovis, (as

511

indicated by a white block of distant similarity).

512 513

Suppl. Table 1: Additional smaller INDELs found amongst the 5 M. tuberculosis

514

lineage 7/Aethiops vetus strains (against reference strain H37Rv).

515 516

Suppl. Table 2: Overview of relevant functional studies on genes/regions where

517

LSPs/deletions specific to lineage 7 Aethiops vetus occur.

17 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

Research paper template

518

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

Figure 1

Click here to download Figure Figure_1.tif

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

Figure 2

Click here to download Figure withPercys_Mar28 (1).png

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

Figure 3

Click here to download Figure Figure_3.tif

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

Table 1

Click here to download Table Table 1 copy.xls

Sequence IDIsolated in 68 The 838 Netherlands 54 229 230 232 Ethiopia 233 234

VNTR locus 580 2996 802 960 1644 3192 424 577 2165 2401 3690 4156 2163b 1955 4052 154 4 5 3 7 4 7 6 4 3 2 1 3 4 4 6 1 4 6 3 7 4 5 4 4 3 2 1 3 4 4 9 1 4 6 3 7 4 5 4 4 3 2 1 3 4 4 8 1 4 6 3 7 4 6 2 4 3 2 1 3 4 4 8 1 4 6 3 7 4 5 4 4 3 2 1 3 4 4 8 1 8 6 3 7 4 5 4 4 3 2 1 3 4 4 9 1 4 6 2 7 4 5 4 4 3 2 1 3 4 4 2 1 4 6 3 7 4 5 4 4 3 2 1 3 4 3 8 1

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

2531 4348 2059 2687 3007 2347 2461 3171 4 6 2 2 3 4 3 4 4 6 2 2 3 4 3 4 4 6 2 2 3 4 3 4 4 6 2 2 3 3 3 4 4 6 2 2 3 3 3 4 4 6 2 2 3 4 3 4 4 6 2 2 1 3 3 4 4 7 2 2 3 3 3 4

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

Supplementary Material Files

Click here to download Supplementary Material Files Supplementary Figures and Tables - Mar22 (1).pdf

M.t uber cul osi sAet hi opsvet us

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

M.t uber cul osi sAet hi opsvet us

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

M.t uber cul osi sAet hi opsvet us

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

M.t uber cul osi sAet hi opsvet us

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

M.t uber cul osi sAet hi opsvet us

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

Supplementary Table 1

Gene name

Rv number

Functional category

Function

mmpL9

Rv2339

Cell wall and cell processes

Unknown. Thought to be involved in fatty acid transport.

lppH lppO sseB Rv3467

Rv3576 Rv2290 Rv2291 Rv3467

Cell wall and cell processes Cell wall and cell processes Intermediary metabolism and respiration insertion seqs and phages

Unknown. Unknown. Unknown. Unknown.

galE1 (rmlB2) Rv3634c

Intermediary metabolism and respiration

Involved in galactofuranosyl biosynthesis.

mhpE

Rv3469c

Intermediary metabolism and respiration

Involved in aromatic hydrocarbons degradation

pknH

Rv1266c

Regulatory proteins

Involved in signal transduction (via phosphorylation).

Relevant findings Decrease in survival observed in mmpL9 mutant in hydrogen peroxide (Mestre O et al., 2013). MmpL-mediated lipid secretion contributes to innate ability of pathogen to survive intracellularly and contributes directly to host-pathogen dialogue that determines the ultimate outcome of infection (Domenech P., 2004).

Has been implicated in pathogenesis (Graham, JE et al., 1999 & McKinney, JD et al., 2000). rmlB2 genes observed to be upregulated in macrophages (Fontan, P., 2008), thus a putative bacterial pathogenic factor relevant for the intracellular survival of MTB. Its disruption in H37Rv confers a higher bacillary load (hypervirulence) during the chronic phase of infection in BALB/c mice (Papavinasasasundaram, KG et al., 2005). Its differential expression in response to stress conditions indicate its ability to regulate cellular events promoting bacterial adaptation to environmental change (Sharma, K., et al., 2004).

Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41

Supplementary Table 2 position 30,980 125,830 131,174 139,556 194,305 230,576 234,496 289,125 293,628 424,320 424,790 467,497 467,508 592,759 641,719 688,792 750,089 874,835 976,897 1,010,204 1,086,490 1,090,188 1,131,229 1,165,521 1,168,715 1,273,250 1,365,837 1,474,959 1,488,557 1,541,066 1,696,499 1,753,519 1,878,817 1,894,301 2,001,789 2,090,401 2,109,523 2,133,474 2,137,522 2,143,523 2,161,343 2,207,591 2,218,577 2,342,650 2,358,029 2,421,975 2,523,205 2,536,625 2,564,368 2,606,797 2,632,341 2,704,885 2,960,944 3,131,472 3,194,241 3,293,700 3,473,996 3,539,139 3,590,686 3,610,391 3,723,901 3,794,867 3,801,551 3,819,797 4,170,964 4,197,138 4,277,517 4,306,873 4,337,820 4,341,396 4,358,979 4,360,680 4,383,144 34,511 77,818 105,175 137,851 154,179 173,080 208,317 289,070 323,029 364,499 373,283 374,762 384,541 428,084 479,982 597,146 843,931 854,253 925,343 952,811 1,218,649 1,340,653 1,385,074 1,415,281 1,502,753 1,546,822 1,561,664 1,655,680 1,677,116 1,714,132 1,972,840 1,978,239 1,992,324 2,058,119 2,094,912 2,127,928 2,183,462 2,189,574 2,192,436 2,214,655 2,239,633 2,339,611 2,357,269 2,368,565 2,387,186 2,448,884 2,453,896 2,525,723 2,534,563 2,684,690 2,724,202 2,798,719 2,803,090 2,805,386 2,881,598 2,957,569 3,020,370 3,190,146 3,331,362 3,381,237 3,415,181 3,426,069 3,580,637 3,627,387 3,683,269 3,734,179 3,742,992 3,827,445 3,844,757 3,862,473 3,874,723 3,890,100 4,035,996 4,095,002 4,095,297 4,169,667 4,189,684 4,198,612 4,307,622 4,338,596 4,359,229 4,386,640 4,400,661 3,925,504 2,133,348 2,881,582 3,111,343 1,026,917 2,469,918 3,120,177 4,375,354 1,002,282 1,103,437 2,566,766 2,782,084 3,907,771 4,359,186 737,747 943,981 1,543,532 1,691,287 1,820,293 3,378,653 3,907,467 4,031,612 4,032,430 4,081,108 4,387,690 357,261 3,966,640 791,029 1,338,795 1,541,079 37,927 691,888 2,960,896 3,119,884 3,687,568 3,734,936 3,905,040

mutation +CGAT +A +G +C +GG +T +GT +T +C +C +TA +G +G +A +C +G +C +CG +G +G +C +G +G +A +T +A +GG +CA +C +C +CGGT +C +G +10 bp +GG +12 bp +G +15 bp +10 bp +T +T +C +C +11 bp +G +GGAA +CGC +G +C +C +A +21 bp +C +9 bp +G +G +A +G +C +C +T +CA +T +CA +A +T +G +A +G +11 bp +G +A +CGGGG Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ53 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ15 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ9 bp Δ1 bp Δ3 bp Δ13 bp Δ9 bp Δ1 bp Δ1 bp Δ9 bp Δ3 bp Δ1 bp Δ1 bp Δ2 bp Δ1 bp Δ1 bp Δ6 bp Δ3 bp Δ1 bp Δ3 bp Δ1 bp Δ9 bp Δ1 bp Δ6 bp Δ3 bp Δ1 bp Δ5 bp Δ2 bp Δ1 bp Δ1 bp Δ2 bp Δ1 bp Δ20 bp Δ1 bp Δ18 bp Δ1 bp Δ1 bp Δ1 bp Δ9 bp Δ1 bp Δ2 bp Δ1 bp Δ14 bp Δ6 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ3 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ12 bp Δ1 bp Δ9 bp Δ1 bp Δ1 bp Δ9 bp Δ1 bp Δ5 bp +G +T +C Δ1 bp Δ1 bp Δ144 bp Δ1 bp +CC +A +G +27 bp +T +54 bp Δ5 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ7 bp Δ268 bp Δ3 bp Δ1 bp Δ12 bp +C +G Δ1 bp Δ3 bp Δ1 bp Δ1 bp Δ1 bp Δ18 bp Δ147 bp Δ1 bp Δ1 bp Δ3 bp

annotation coding (1259/1347 nt) coding (4712/4899 nt) intergenic (-70/-208) coding (1044/1161 nt) coding (634/795 nt) intergenic (+114/-323) coding (2266/2289 nt) coding (22/234 nt) intergenic (+135/+170) coding (375/426 nt) coding (9890/9903 nt) coding (505/543 nt) coding (494/543 nt) coding (1106/1131 nt) coding (1093/1716 nt) coding (761/828 nt) coding (90/1506 nt) coding (603/711 nt) coding (1307/1332 nt) coding (69/1599 nt) coding (856/1590 nt) intergenic (-13/-185) coding (102/294 nt) intergenic (-22/+260) coding (514/525 nt) coding (828/912 nt) intergenic (+29/-38) noncoding (1302/3138 nt) coding (1409/1812 nt) coding (47/786 nt) intergenic (-56/+228) coding (10/528 nt) coding (3514/6381 nt) coding (1042/1119 nt) coding (1176/1857 nt) coding (318/1038 nt) intergenic (+53/-21) coding (219/462 nt) coding (558/561 nt) coding (1003/1155 nt) coding (881/1104 nt) intergenic (-789/-109) coding (526/666 nt) coding (843/1137 nt) intergenic (-1352/-360) coding (304/636 nt) intergenic (+30/+36) coding (1726/1779 nt) coding (665/741 nt) coding (1525/1614 nt) intergenic (-266/+582) coding (189/822 nt) coding (1498/2337 nt) coding (302/2430 nt) coding (1308/1383 nt) coding (2654/4851 nt) intergenic (-92/-11) coding (635/1347 nt) intergenic (+69/+6) coding (799/816 nt) coding (246/387 nt) coding (1/1611 nt) intergenic (+88/-102) coding (1756/2361 nt) coding (751/930 nt) intergenic (+31/-98) coding (947/1515 nt) coding (7/792 nt) coding (1044/1173 nt) coding (1127/1209 nt) coding (804/2190 nt) coding (1246/1383 nt) coding (497/633 nt) coding (217/2316 nt) coding (1079/1278 nt) coding (41/411 nt) coding (533/591 nt) between 53 bp Mycobacterial Interspersed Repetitive Unit, Class II coding (870/933 nt) coding (866/969 nt) intergenic (+28/-34) coding (245/510 nt) intergenic (+30/-106) coding (2429/2892 nt) coding (936-950/2892 nt) coding (7/615 nt) coding (6596/9903 nt) coding (194/372 nt) coding (388/444 nt) coding (484/1173 nt) intergenic (-27/+14) coding (393/414 nt) intergenic (+100/-14) coding (2181-2189/2562 nt) intergenic (+129/-6) coding (1602-1604/1689 nt) coding (548-560/1881 nt) coding (113-121/441 nt) coding (811/981 nt) coding (201/309 nt) coding (1034-1042/1113 nt) coding (176-178/435 nt) intergenic (+79/+40) coding (703/1431 nt) coding (1328-1329/1599 nt) coding (254/2745 nt) coding (592/666 nt) coding (272-277/1458 nt) coding (2025-2027/2064 nt) coding (91/498 nt) coding (1079-1081/2520 nt) coding (343/516 nt) coding (533-541/1134 nt) coding (630/954 nt) coding (903-908/2166 nt) intergenic (-592/-1118) intergenic (+123/+418) intergenic (+15/+12) coding (725-726/1803 nt) coding (78/1938 nt) coding (322/420 nt) coding (989-990/1083 nt) coding (12/1692 nt) intergenic (-19/+9) coding (2162/3414 nt) coding (3130-3147/4983 nt) coding (851/4983 nt) coding (190/294 nt) intergenic (+137/-3) coding (80-88/258 nt) coding (7/570 nt) coding (250-251/255 nt) intergenic (-244/+138) intergenic (-223/+241) coding (486-491/1524 nt) coding (112/159 nt) intergenic (-38/+32) coding (2695/2913 nt) coding (2757/7572 nt) intergenic (-218/+206) coding (1281-1283/1737 nt) coding (1214/1233 nt) intergenic (-83/+151) coding (320/333 nt) coding (634/786 nt) coding (140/828 nt) coding (299/378 nt) coding (4/378 nt) coding (201/243 nt) coding (538-549/948 nt) intergenic (-15/-262) coding (756-764/792 nt) intergenic (-75/-253) coding (554/2190 nt) coding (167-175/450 nt) coding (476/669 nt) coding (615-619/1509 nt) coding (345/462 nt) coding (174/294 nt) coding (481/1044 nt) intergenic (-101/-187) coding (546/1077 nt) between 36 bp direct repeat coding (330/1200 nt) coding (134/1608 nt) coding (896/2568 nt) intergenic (+216/-6) coding (179/4875 nt) coding (105/324 nt) coding (597/2190 nt) coding (399-403/882 nt) coding (214/1515 nt) coding (174/1470 nt) coding (438/1029 nt) coding (1401/1731 nt) intergenic (-410/+58) intergenic (-289/-194) coding (1280-1547/1755 nt) coding (727-729/1755 nt) intergenic (-343/-52) coding (195-206/531 nt) coding (804/1203 nt) coding (399/1155 nt) coding (45/1050 nt) intergenic (-282/-206) coding (60/786 nt) coding (669/1689 nt) coding (1388/1527 nt) coding (1529-1546/2337 nt) between 36 bp direct repeat coding (10/666 nt) coding (2000/7572 nt) coding (525-527/945 nt)

gene description Rv0026 → hypothetical protein ctpI ← cation-transporter ATPase I Rv0108c ← / → PE_PGRS1 hypothetical protein/PE-PGRS family protein hddA → D-alpha-D-heptose-7-phosphate kinase Rv0165c ← GntR family transcriptional regulator Rv0194 → / → Rv0195 drugs-transport transmembrane ATP-binding protein ABC transporter/two component transcriptional regulatory protein Rv0197 → oxidoreductase Rv0239 → hypothetical protein fadA2 → / ← fadE5 acetyl-CoA acetyltransferase/acyl-CoA dehydrogenase FADE5 PPE7 ← PPE family protein PPE8 ← PPE family protein PPE9 ← PPE family protein PPE9 ← PPE family protein galE2 → UDP-glucose 4-epimerase fadD8 ← acyl-CoA synthetase mce2B → MCE-family protein MCE2B Rv0654 → dioxygenase ptrBa → oligopeptidase B PPE13 ← PPE family protein Rv0907 → hypothetical protein accD2 ← acetyl-/propionyl-CoA carboxylase subunit beta Rv0976c ← / → PE_PGRS16 hypothetical protein/PE-PGRS family protein Rv1012 → hypothetical protein Rv1042c ← / ← Rv1043c IS like-2 transposase/hypothetical protein Rv1046c ← hypothetical protein mmpL13a → transmembrane transport protein MmpL13A Rv1222 → / → htrA hypothetical protein/serine protease HtrA rrl → ribosomal RNA 23S PE_PGRS24 ← PE-PGRS family protein lprF → lipoprotein LprF Rv1506c ← / ← Rv1507c hypothetical protein/hypothetical protein fadD11.1 → fatty-acid-CoA ligase polyketide synthase pks7 pks7 → Rv1668c ← macrolide-transport ATP-binding protein ABC transporter PE_PGRS31 → PE-PGRS family protein hypothetical protein Rv1841c ← Rv1861 → / → adhA transmembrane protein/alcohol dehydrogenase AdhA Rv1883c ← hypothetical protein Rv1888c ← transmembrane protein Rv1895 → dehydrogenase isocitrate lyase aceAa → mce3R ← / → yrbE3A TetR family transcriptional regulator/integral membrane protein YrbE3A Rv1975 → hypothetical protein hypothetical protein Rv2084 → Rv2097c ← / → Rv2100 hypothetical protein/hypothetical protein Rv2160A ← hypothetical protein Rv2248 → / ← glpD1 hypothetical protein/glycerol-3-phosphate dehydrogenase Rv2264c ← hypothetical protein hypothetical protein Rv2293c ← Rv2333c ← integral membrane transport protein membrane-associated phospholipase C/PPE family protein plcA ← / ← PPE38 Rv2407 → ribonuclease Z PE_PGRS46 ← PE-PGRS family protein hypothetical protein Rv2823c ← transposase Rv2885c ← pks1 ← polyketide synthase PKS1 prfB ← / → fprA peptide chain release factor 2/NADPH:adrenodoxin oxidoreductase FPRA (NADPH-ferredoxin reductase) flavin-containing monoamine oxidase aofH → hypothetical protein/SOJ/PARA-like protein Rv3212 → / ← Rv3213c Rv3234c ← hypothetical protein Rv3337 → hypothetical protein 1-deoxy-D-xylulose-5-phosphate synthase dxs2 ← transposase/PE-PGRS family protein Rv3387 → / → PE_PGRS52 Rv3401 → hypothetical protein Rv3725 → oxidoreductase hypothetical protein/hypothetical protein Rv3747 → / → Rv3748 PE-PGRS family protein PE_PGRS62 → Rv3833 → AraC family transcriptional regulator Rv3860 → hypothetical protein hypothetical protein Rv3864 → hypothetical protein Rv3879c ← Rv3881c ← hypothetical protein Rv3897c ← hypothetical protein 8-amino-7-oxononanoate synthase BioF2 bioF2 → serine hydroxymethyltransferase glyA ← Rv0095c ← hypothetical protein gmhA → phosphoheptose isomerase trehalose synthase TRES/hypothetical protein treS → / → Rv0127 hypothetical protein Rv0146 → Rv0176 → mce associated transmembrane protein Rv0238 → / → Rv0239 TetR family transcriptional regulator/hypothetical protein hypothetical protein Rv0268c ← hypothetical protein/TetR/ACRR family transcriptional regulator Rv0301 → / → Rv0302 PPE6 ← PPE family protein PPE6 ← PPE family protein muconolactone isomerase Rv0316 → PPE family protein PPE8 ← Rv0401 → transmembrane protein mmpS2 → membrane protein acyl-CoA dehydrogenase FADE9 fadE9 ← hypothetical protein/hypothetical protein Rv0759c ← / ← Rv0760c PE_PGRS12 → PE-PGRS family protein far → / → Rv0856 fatty-acid-CoA racemase/hypothetical protein PE-PGRS family protein PE_PGRS22 → PPE family protein/Esat-6 like protein esxK (Esat-6 like protein 3) PPE18 → / → esxK PE-PGRS family protein PE_PGRS23 ← Ser/Thr protein kinase H pknH ← hypothetical protein Rv1334 → glycolipid sulfotransferase Rv1373 → PE15 → PE family protein PE_PGRS29 ← PE-PGRS family protein hypothetical protein Rv1487 → acyl-CoA synthetase/transmembrane transport protein MmpL12 fadD25 → / ← mmpL12 anchored-membrane serine/threonine-protein kinase PKNF (protein kinase F) (STPK F) pknF → acyl-CoA synthetase fadD1 ← PE-PGRS family protein wag22 ← hypothetical protein Rv1815 → 6-phosphogluconate dehydrogenase gnd1 ← integral membrane protein Rv1877 → thiol peroxidase tpx → oxygenase Rv1937 → oxidoreductase Rv1939 → MCE-family lipoprotein LprM lprM → hypothetical protein Rv1996 → hypothetical protein Rv2082 → hypothetical protein/hypothetical protein Rv2097c ← / → Rv2100 PPE family protein/proteasome (alpha subunit) PrcA PPE36 → / ← prcA hypothetical protein/PE-PGRS family protein Rv2125 → / ← PE_PGRS37 long-chain-fatty-acid-CoA ligase fadD15 (fatty-acid-CoA synthetase) (fatty-acid-CoA synthase) fadD15 → hypothetical protein Rv2191 → flavoprotein Rv2250A → hypothetical protein Rv2262c ← ferredoxin-dependent nitrite reductase NIRA nirA → hypothetical protein/gamma-glutamyl phosphate reductase Rv2426c ← / ← proA LuxR family transcriptional regulator Rv2488c ← PE-PGRS family protein PE_PGRS43 ← PE-PGRS family protein PE_PGRS43 ← hypothetical protein Rv2561 → hypothetical protein/hypothetical protein Rv2630 → / → Rv2631 hypothetical protein Rv2706c ← hypothetical protein Rv2879c ← hypothetical protein Rv2975c ← PE family protein/transposase PE29 ← / ← Rv3023c glutaredoxin electron transport protein NrdH/hypothetical protein nrdH ← / ← Rv3054c ATP-dependent DNA ligase ligB → hypothetical protein Rv4008 → two component sensory transduction transcriptional regulatory protein MTRA/thymidylate kinase mtrA ← / ← tmk arylsulfatase AtsB atsB ← PPE family protein PPE54 ← PE-PGRS family protein/hypothetical protein PE_PGRS50 ← / ← Rv3346c cholesterol oxidase precursor choD ← transposase Rv3428c ← 50S ribosomal protein L13/ESAT-6 like protein ESXT rplM ← / ← esxT transmembrane protein Rv3453 → peroxidase BpoA bpoA ← hypothetical protein Rv3594 → hypothetical protein Rv3655c ← hypothetical protein Rv3655c ← cutinase precursor cut5a → PPE family protein PPE66 ← excisionase/integrase Rv3750c ← / → Rv3751 AraC family transcriptional regulator Rv3833 → transcriptional regulatory protein WHIB-like WHIB6/hypothetical protein whiB6 ← / → Rv3863 hypothetical protein Rv3879c ← hypothetical protein Rv3901c ← RNA polymerase sigma factor SigM sigM → acyl-CoA synthetase fadD17 → hypothetical protein Rv1883c ← hypothetical protein Rv2561 → hypothetical protein Rv2802c ← transposase/resolvase Rv0920c ← / → Rv0921 hypothetical protein Rv2205c ← hypothetical protein/transposase Rv2813 → / ← Rv2814c PPE family protein PPE69 ← oxidoreductase Rv0897c ← adhesion component transport transmembrane protein ABC transporter Rv0987 → aminotransferase/hypothetical protein Rv2294 → / → Rv2295 NAD-dependent glutamate dehydrogenase gdh ← hypothetical protein Rv3488 → hypothetical protein Rv3879c ← methoxy mycolic acid synthase mmaA3 ← oxidase Rv0846c ← hypothetical protein Rv1371 → glycosyltransferase Rv1500 → cytochrome' transport transmembrane ATP-binding protein ABC transporter CydC cydC ← PPE family protein/secreted ESAT-6 like protein ESXR (TB10.3) (ESAT-6 like protein 9) PPE46 ← / ← esxR esterase/lipase LipF/hypothetical protein lipF ← / → Rv3488 PE-PGRS family protein PE_PGRS58 ← PE-PGRS family protein PE_PGRS58 ← hypothetical protein/hypothetical protein Rv3642c ← / → Rv3643 hypothetical protein Rv3902c ← hypothetical protein Rv0293c ← hypothetical protein Rv3529c ← hypothetical protein Rv0690c ← hypothetical protein/PE family protein Rv1194c ← / → PE13 lipoprotein LprF lprF → fatty-acid-CoA ligase fadD34 → MCE-family protein MCE2D mce2D → PE-PGRS family protein PE_PGRS46 ← hypothetical protein/transposase Rv2813 → / ← Rv2814c phosphate transporter PhoU phoY1 ← PPE family protein PPE54 ← by short-chain dehydrogenase Rv3485c ← Downloaded from www.microbiologyresearch.org IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41