Genomic characterization of Mycobacterium tuberculosis lineage 7 and a proposed ..... africanum lineage 6, M. mungi, M. orygis, M. bovis, etc.) due to two ...
Microbial Genomics Genomic characterization of Mycobacterium tuberculosis lineage 7 and a proposed name: "Aethiops vetus" --Manuscript Draft-Manuscript Number:
MGEN-D-16-00001R1
Full Title:
Genomic characterization of Mycobacterium tuberculosis lineage 7 and a proposed name: "Aethiops vetus"
Article Type:
Research Paper
Section/Category:
Microbial evolution and epidemiology: Phylogeography
Corresponding Author:
Hanna Nebenzahl-Guimaraes, MPH Rijksinstituut voor Volksgezondheid en Milieu NETHERLANDS
First Author:
Hanna Nebenzahl-Guimaraes, MPH
Order of Authors:
Hanna Nebenzahl-Guimaraes, MPH Solomon A Yimer Carol Holm-Hansen Jessica de Beer Roland Brosch Dick van Soolingen
Abstract:
Introduction: Lineage 7 of the Mycobacterium tuberculosis complex has recently been identified among strains originating from Ethiopia. Using different DNA typing techniques, this study provides additional information on the genetic heterogeneity of five lineage 7 strains collected in the Amhara Region of Ethiopia. Case Presentation: The results confirm the phylogenetic positioning of these strains between the ancient Lineage 1 and TbD1-deleted, modern Lineages 2, 3 and 4 of Mycobacterium tuberculosis. Four newly identified large sequence polymorphisms characteristic of the Amhara Region lineage 7 strains are described. While lineage 7 strains have been previously identified in the Woldiya area, we show that lineage 7 strains circulate in other parts of the Amhara Region and also among foreign-born individuals from Eritrea and Somalia in The Netherlands. Conclusion: For ease of documenting future identification of these strains in other geographical locations and recognizing the place of origin, we propose to assign lineage 7 strains the lineage name: "Aethiops vetus".
Downloaded from www.microbiologyresearch.org by 204.44.112.220 Powered by Editorial Manager® and IP: ProduXion Manager® from Aries Systems Corporation On: Mon, 16 May 2016 06:39:41
Manuscript Including References (Word document)
1 2 3
Click here to download Manuscript Including References (Word document) Microbial_Genomics_Revised_Manuscript.docx
Genomic characterization of Mycobacterium tuberculosis lineage 7 and a proposed name: “Aethiops vetus”
4 5 6 7 8
ABSTRACT
9
Lineage 7 of the Mycobacterium tuberculosis complex has recently been
10
identified among strains originating from Ethiopia. Using different DNA typing
11
techniques, this study provides additional information on the genetic
12
heterogeneity of five lineage 7 strains collected in the Amhara Region of
13
Ethiopia. It also confirms the phylogenetic positioning of these strains between
14
the ancient Lineage 1 and TbD1-deleted, modern Lineages 2, 3 and 4 of
15
Mycobacterium
16
polymorphisms characteristic of the Amhara Region lineage 7 strains are
17
described. While lineage 7 strains have been previously identified in the Woldiya
18
area, we show that lineage 7 strains circulate in other parts of the Amhara
19
Region and also among foreign-born individuals from Eritrea and Somalia in The
20
Netherlands. For ease of documenting future identification of these strains in
21
other geographical locations and recognizing the place of origin, we propose to
22
assign lineage 7 strains the lineage name: “Aethiops vetus”.
tuberculosis.
Four
newly
identified
large
sequence
23 24 25
DATA SUMMARY
26 27 28
Reads and assembled scaffolds have been deposited in the European Nucleotide Archive (study accession number: PRJEB8432). (url http://www.ebi.ac.uk/ena/data/view/PRJEB8432)
29 1 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
30 31 32
I/We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. ☒
33 34 35
IMPACT STATEMENT
36 37 38 39 40 41 42 43 44 45 46 47 48 49
This article communicates novel insights gained from genomic data on the new lineage 7 lineage of Mycobacterium tuberculosis (Mtb), namely to describe its positioning relative to other Mtb strains and specific variations (SNPs, insertions and deletions) that distinguish it from other lineages and (sub) species in the complex. This novel information will contribute towards the burgeoning awareness of this lineage that is of considerable evolutionary interest due to its intermediary positioning between ancient and modern lineages. Further investigation of differences in virulence and immunogenicity among Mtb major lineages, including lineage 7, is warranted to elucidate the pathogenesis and clinical consequences of TB disease. For ease of documenting the global epidemiology and anticipating future trends, we propose to name the lineage Aethiops vetus, which reflects the ancient origin of these strains in the Horn of Africa.
50 51 52
INTRODUCTION
53 54
A new lineage within the Mycobacterium tuberculosis complex (MTBC) has
55
recently been identified among strains originating from Ethiopia (Firdessa et al.
56
2013; Blouin et al. 2012). This lineage, named lineage 7, is of considerable
57
interest regarding evolutionary research as it represents a phylogenetic branch
58
intermediate between the ancient and modern lineages of Mycobacterium
59
tuberculosis (Mtb). Further characterization of this lineage may provide insight
60
as to where, when and how Mtb evolved, as well as what influence human hosts
61
had on the development of these bacteria. This is important for a better
62
understanding of Mtb adaptation to the human host, and the highly developed
63
ability of the bacterium to evade the immune system and spread extensively in 2 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
64
the human population. Characterization is thus essential to elucidate
65
pathogenicity, which underlies vaccine development, to document the global
66
epidemiology of tuberculosis (TB), and anticipate future trends in the spread of
67
the disease.
68 69
In this study we explore the positioning of this new lineage relative to other Mtb
70
strains through different DNA typing techniques, and identify the variations,
71
large sequence polymorphisms (LSPs) and smaller deletions, which distinguish
72
it from other groupings in the complex. In addition, we describe lineage 7 strains
73
detected in The Netherlands where the majority of the patients are foreign-born.
74
Lastly, to document the place where lineage 7 strains have been identified and
75
possibly originate, we suggest naming these strains ‘Aethiops vetus’.
76 77 78
METHODS
79 80
Six MTBC isolates were cultivated from sputum samples collected from newly
81
diagnosed pulmonary TB (PTB) patients attending selected health centers and
82
hospitals in various districts of Amhara Region, Ethiopia. The isolates were part
83
of 237 Mtb isolates collected between 2008 and 2010 for the TB Rapid Test
84
project at the Norwegian Institute of Public Health (NIPH). Bacteriological and
85
molecular characterization of the strains was performed in Ethiopia, Norway and
86
The Netherlands. Prior to sputum collection, socio-demographic and clinical
87
data were obtained using a structured questionnaire. HIV testing was also
88
performed on blood samples obtained from study participants following the
89
routine procedure in Ethiopia.
90 91
Sputum smear microscopy, culture and drug susceptibility testing
92
Three consecutive sputum samples, spot-morning-spot (Rao 2009), were
93
provided by study participants. Smear microscopy using Ziehl-Neelsen staining 3 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
94
was performed to identify acid fast bacilli. Pooled sputum samples were
95
transported to the Armauer Hansen Research Institute (AHRI) in Addis Abeba,
96
Ethiopia for culture. Sputum culture was performed using the conventional
97
Löwenstein–Jensen (LJ) culture media method (Christie & Callihan 1995). Drug
98
susceptibility testing (DST) was conducted at the Ethiopian Health and Nutrition
99
Research Institute (ENHRI), the national TB reference laboratory in Addis
100
Abeba, and at AHRI. DST was performed by the proportional absolute
101
concentration technique (Van Klingeren et al. 2007) and the MGIT
102
Mycobacterium
103
manufacturer’s instructions (Siddiqi, Rüsch-Gerdes 2006).
Growth
Indicator
Tube
(MGIT)
method
following
the
104 105
DNA extraction and spoligotyping
106
Deoxyribonucleic acid (DNA) was extracted from Mtb isolates according to
107
standard procedure (Zwadyk et al. 1994). Species identification by polymerase
108
chain reaction (PCR) targeting the RD9 (region of difference 9) was performed
109
as previously described (Parsons et al. 2002). Genomic regions of difference
110
(RDs) denote large deletions spanning several genes and are used for the
111
differentiation of members of the MTBC (Warren et al. 2006).
112
DNA of the Mtb isolates was subjected to spoligotyping at AHRI according to
113
Kamerbeek et al. 1997, and the results and DNA aliquots were sent to the
114
Tuberculosis Reference Laboratory at the National Institute for Public Health
115
and the Environment (RIVM) in The Netherlands for quality control. The results
116
indicated that the spoligotyping performed at AHRI was in agreement with the
117
findings at RIVM (Yimer et al. 2013). The International Spoligotype Database
118
(SpolDB4, 2011) was used to assign the spoligotyping patterns to major genetic
119
lineages and sub-lineages (Brudey et al. 2006). New spoligo patterns that were
120
not described by SpolDB4 were assigned to Mtb families by SpolDB3 and a
121
randomly initialized model (Run SPOTCLUST, 2011).
122 123
MIRU-VNTR typing
4 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
124
Mycobacterial interspersed repetitive units (MIRU) variable number of tandem
125
repeats (VNTR) typing was performed at NIPH in Oslo, Norway to improve the
126
discrimination of Mtb isolates and to compare the results with the extended
127
collection of VNTR profiles of strains collected within The Netherlands. VNTR is
128
based on the identification of 24 of the most polymorphic microsatellite
129
sequences (55-72 nucleotides in length) in the Mtb genome (Supply et al. 2006).
130
The findings are presented as a set of digits that indicate the number of tandem
131
repeat sequences at each of the 24 loci (Alonso-Rodríguez et al. 2008).
132 133
Sequencing, assembly and variant calling
134
Among the six isolates transferred to RIVM for whole genome sequencing
135
(WGS), one (Solo 80) could not be sub cultured despite repeated attempts. DNA
136
was thus extracted from five isolates using standard methods. WGS was
137
successfully performed by BaseClear BV (Leiden, The Netherlands) on an
138
Illumina HiSeq 2500 instrument using reads of 50 bp in length in the paired-end
139
modus. The average genome coverage was approximately 60-fold. The FASTQ
140
sequence reads were generated using the Illumina Casava pipeline version
141
1.8.3. Initial quality assessment was based on data passing the Illumina Chastity
142
filtering. Subsequently, reads containing adapters and/or PhiX control signals
143
were removed using an in-house filtering protocol. The second quality
144
assessment was based on the remaining reads using the FASTQC quality
145
control tool version 0.10.0. The quality of the FASTQ sequences was enhanced
146
by trimming off low-quality bases using the “Trim sequences” option of the CLC
147
Genomics Workbench version 6.5. The quality-filtered FASTQ sequences were
148
assembled using Ray, and the resulting contigs were mapped to both H37Rv
149
(GenBank accession NC_000962.3) and M. africanum (GenBank accession
150
FR878060.1). In parallel, single nucleotide polymorphisms (SNPs) and INDELS
151
(insertions or deletions) were called using Breseq software (version 0.23) with a
152
minimum depth of 15x (Barrick et al. 2009). SNPs with low-quality evidence (i.e.
153
possible mixed read alignment) or within 5 base pairs (bp) of an INDEL
154
(insertion or deletion) were discarded. Reads and assembled scaffolds are
155
deposited in the European Nucleotide Archive (study accession number: 5 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
156
PRJEB8432).
157 158
Phylogenetic tree construction and comparison of sequences
159
A parsimony-based phylogeny (Phylip, version 3.695) of the MTBC was
160
constructed using 78,613 concatenated variable positions (SNPs) that were
161
found in at least one of 100 previously characterized RIVM strains of different
162
lineages and the five strains from Ethiopia in relation to the outgroup strain
163
Mycobaterium canettii CIPT 140070010 (strain STB-K - GenBank accession
164
NC_019951.1). This strain was previously determined to be the most
165
genomically distant relative to the core members of the MTBC (Supply et al.,
166
2013; Blouin et al., 2014; Boritsch et al., 2016). Two additional lineage 7 strains
167
(Percy 256 and Percy 556) from an independent study were added to this tree
168
(Blouin et al. 2012). The Artemis Comparison Tool (ACT) (Carver et al, 2008)
169
was used to perform pairwise comparison between the different Ethiopian
170
sequences, Mtb H37Rv and M. africanum.
171 172 173
RESULTS
174 175
Socio-demographic information
176
The five sequenced isolates were obtained from two female and three male
177
study participants ranging in age from 15 to 50 years (mean age 32 years). All
178
patients came from rural areas from 5 different districts of the Amhara Region
179
(Fig. 1).
180 181
Lineage 7 strains detected in The Netherlands
182
Isolates from three patients, two males and one female, originating from
183
Somalia, Ethiopia and Eritrea, were previously collected in the Netherlands and
184
identified as belonging to the new lineage 7. The three lineage 7 isolates were 6 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
185
exceptional among the 3,867 unique MIRU profiles stored in The Netherlands
186
between 1993 and 2011. During this period, approximately 72% of the patients
187
were of foreign origin; 49% from Africa, 32% from Asia, and 11% from North or
188
South America. Two of the lineage 7 strains were isolated in 2008, one in 2009,
189
and all were sensitive to the 5 first-line antibiotics. Bacilli were isolated from
190
positive sputum samples and the patients had only pulmonary manifestation.
191
Contact investigation was performed for one case and one positive contact was
192
identified (confirmed by thorax X-ray). All three were treated with the standard
193
chemotherapeutic regimen and had a successful outcome.
194 195
VNTR characterization of lineage 7 strains
196
The 24-locus VNTR typing patterns (the complete variation of patterns) of the
197
eight lineage 7 strains (five isolated in Ethiopia and three in the Netherlands) are
198
presented in Table 1. The similarity in VNTR patterns between the three lineage
199
7 strains found in The Netherlands with two of the present lineage 7 strains from
200
Ethiopia ranged from 79.2% to 95.8% (average pairwise similarity of 86.1%).
201 202
SNP characterization of five Ethiopian strains
203
The five Ethiopian strains shared an average of 940 SNPs (range 888-986) and
204
differed by an average of 183 SNPs (range 113-265) by pairwise comparisons.
205
In the SNP-based phylogenetic tree, the five Ethiopian isolates clustered into
206
one unique branch located between the ancient lineage 1 (East African Indian
207
strains) and modern lineages 2, 3, and 4 (Beijing, Central Asian Strain (CAS)
208
and Euro-American) (Fig. 2).
209 210
LSP characterization of five Ethiopian strains by presence/absence of RDs and
211
newly identified deletions
212
7 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
213
All five Ethiopian strains contain all the genomic regions that are absent from M.
214
bovis BCG Pasteur (RD1 through RD9), with the exception of the RD3 region
215
(corresponding to the prophage region PhiRv1) and RD11 (corresponding to the
216
prophage region PhiRv2) (Fig. 3). The RD6 region was not tested as it contains
217
Insertion Sequence (IS) elements. While absent in M. tuberculosis H37Rv, the
218
TbD1 region, RvD1-RvD5 and RD239 (or LSP 239) as described by Brosch et
219
al., 2002 and Gagneux et al. 2006 were conserved in the five Ethiopian strains.
220
Figure 3 visually summarizes these observations and shows the positioning of
221
the five Ethiopian strains within the MTBC.
222 223
Four additional specific deletions (indicated as Δ1–Δ4 in Fig. 3) were found
224
across all five strains: a 10bp deletion in the mmpl9 gene, a 27bp deletion in the
225
lppH gene (suppl. Fig. 1A & B), a 1.3kb fragment spanning genes lppO/sseB
226
(suppl. Fig. 2), and a fragment of 3.3kb spanning the genes rv3467/rmlB2/mhpE
227
(suppl. Fig. 3). All four deletions were also confirmed in the lineage 7 Percy256
228
strain (Blouin et al. 2012). Finally, in the region encompassing RD7 in Mtb
229
H37Rv, the five Ethiopian strains present an extra 4.4kb fragment that is absent
230
from Mtb lineage 4 strains as well from all RD7-deleted MTBC members (M.
231
africanum lineage 6, M. mungi, M. orygis, M. bovis, etc.) due to two independent
232
deletion events (suppl. Fig. 4). However, this region seems to be present in all
233
other tubercle bacilli (suppl. Fig. 4), including the M. canettii strains, in which the
234
genes
235
MCAN_19951, MCAN_19961 (Supply et al. 2013). Similarly, another genomic
236
region of approximately 5kb in the proximity of the pknH gene was present in the
237
five Ethiopian strains and in M. canettii. This region is absent from Mtb H37Rv
238
(suppl. Fig. 5) and many other Mtb strains of the modern lineages. An overview
239
of what is known to date about the function of the genes/regions where these
240
specific deletions take place is presented in suppl. Table 1. All additional smaller
241
INDELs found amongst the five Ethiopian strains, mostly of one base pair, are
242
listed in suppl. Table 2.
correspond
to
the
orthologs
of
MCAN_19931,
MCAN_19941,
243 244 8 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
245
DISCUSSION
246 247
Mycobacterium tuberculosis, a member of the MTBC, has been a significant
248
cause of morbidity and mortality among humans for thousands of years.
249
Phylogenetic studies have shown that the MTBC comprises at least seven
250
distinct lineages distributed in different geographical locations around the world.
251
Current evidence indicates that the Horn of Africa is the most likely origin of
252
MTBC from which it has spread to other parts of the world (Comas et al. 2013).
253 254
Ethiopia is a country in the Horn of Africa with an estimated TB incidence of 224
255
cases/100,000 population (Global Tuberculosis Report, 2014). Molecular typing
256
of isolates from various regions indicates that all major MTBC lineages are
257
represented among TB patients in Ethiopia. A recent study conducted in the
258
Amhara Region of Ethiopia identified a high proportion of “Unknown” spoligotype
259
patterns (SIT910 and SIT1729) in addition to strains not previously described in
260
the database (Yimer et al. 2013). These strains have since been included in the
261
recently described Mtb lineage 7.
262 263
The current study focused on the further characterization of five lineage 7
264
strains isolated in different parts of the Amhara Region in Ethiopia, the second
265
most populous region in the country with an estimated 20 million inhabitants.
266
One of the strains matched the SIT910 pattern of lineage 7 Percy556 from
267
Djibouti (Blouin et al. 2012), whilst another matched SIT1729. The results
268
indicate that lineage 7 strains form a separate group as demonstrated by VNTR
269
and SNP typing (Van Klingeren et al. 2007). Both the analysis of RDs and SNPs
270
in our study confirms the positioning of lineage 7 strains between ancient and
271
modern lineages, as proposed by the SNP tree in Firdessa et al. 2013 and
272
Comas et al. 2015. The inclusion of the phylogenetically most distant M. canettii
273
strain STB-K (CIPT 1400700100) (Supply et al., 2013; Blouin et al., 2014;
274
Boritsch et al., 2016) as the outgroup has reinforced the clusters observed
275
among the different Mtb. 9 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
276 277
We found four newly identified deletions characteristic of all five lineage 7
278
strains from the Amhara Region and the Percy256 strain. Our study also
279
provides additional information on the genetic heterogeneity of Mtb lineage 7
280
strains. The strains in our dataset were found to be more heterogeneous
281
(average of 183 SNPs ranging from 113 to 265) than the two Percy (256 and
282
556) strains described in Blouin et al. 2012 that differed by only 8 SNPs. While
283
Firdessa et al. 2013 identified lineage 7 strains in the Woldiya area, we showed
284
that lineage 7 strains also circulate in other parts of the Amhara Region and
285
among foreign-born patients from the Horn of Africa residing in The
286
Netherlands.
287 288
There is a growing interest in conducting phylogeographic studies on MTBC
289
isolates and it is likely that lineage 7 strains may be identified in other parts of
290
the world. Mtb lineage 7 strains have previously been reported in Ethiopia
291
(Firdessa et al. 2013; Yimer et al. 2013, Yimer et al. 2015) and among Ethiopian
292
immigrants in Djibouti (Blouin et al. 2012). However, a specific designation that
293
can distinguish these particular strains from other sub lineages has not been
294
proposed. It might therefore be opportune to designate this lineage “Aethiops
295
vetus”. This name reflects the ancient origin of the new strains identified to date
296
in Ethiopia and among Ethiopian immigrants abroad.
297 298
This geographical restriction is also in accordance with the low representation of
299
lineage 7 strains in The Netherlands. Between 1993 and 2011 there have been
300
2,455 TB cases among foreign-born patients originating from the Horn of Africa
301
in The Netherlands; approximately a third of the strains belong to the East
302
African Indian (EAI) lineage and the other two thirds are either CAS or Euro-
303
American.
304 305
Previous studies showed that the genomic diversity of MTBC has a significant
306
impact on virulence and disease presentation among TB patients (Manca et al. 10 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
307
1999; López-Granados et al. 2003; Hernandez Pando et al. 2010; Sreevatsan et
308
al. 1997). Further investigation of differences in virulence and immunogenicity
309
among MTBC major lineages, including lineage 7 from Ethiopia and the Horn of
310
Africa, is warranted to elucidate the pathogenesis, epidemiology and clinical
311
consequences on TB disease. These studies will provide a better understanding
312
of MTBC evolution and an indication of MTBC evolution under the current
313
treatment regimens.
314 315
Conclusion
316 317
The study confirms through different DNA typing techniques and whole genome
318
sequencing the phylogenetic positioning of Mtb lineage 7 strains between the
319
ancient lineage 1 and the TbD1-deleted, modern lineage 2, 3 and 4 strains. In
320
addition, we show that this new lineage is not widespread in The Netherlands
321
where the majority of TB patients are foreign-born. This may reflect that lineage
322
7 strains are not highly transmissible as compared to modern Mtb strains. To
323
recognize the place of origin we propose to designate the lineage 7 strains as
324
Aethiops vetus. Further studies are warranted to investigate the relevance of
325
this special Mtb strain lineage with respect to clinical disease and transmission
326
as compared to other members of MTBC in Ethiopia and the Horn of Africa.
327 328 329
ACKNOWLEDGEMENTS
330 331
This work was supported by the Research Council of Norway, GLOBVAC
332
programme (grant number192468/S50). RB acknowledges support from the
333
Fondation pour la Recherche Médicale (FRM) DEQ20130326471.
334 335
ABBREVIATIONS
336 11 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
337
MTBC: Mycobacterium tuberculosis complex
338
Mtb: Mycobacterium tuberculosis
339
TB: tuberculosis
340
LSPs: large sequence polymorphisms
341
DST: Drug susceptibility testing
342
MGIT: Mycobacterium Growth Indicator Tube
343
LJ: Löwenstein–Jensen
344 345
REFERENCES
346 347 348 349
Alonso-Rodríguez, N., Martinez-Lirola, M., Herranz, M., Sanchez-Benitez, M., Bouza, E. & de Viedma, DG. (2008). Evaluation of the new advanced 15-loci MIRU-VNTR genotyping tool in Mycobacterium tuberculosis molecular epidemiology studies. BMC microbiology 8, 34.
350 351 352
Barrick, J.E, Su Yu, DS., Yoon, SH., Jeong, H., Oh, TK., Schneider, D., Lenski, RE., Kim, JF. (2009). Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461(7268), 1243–1247.
353 354 355 356
Blouin, Y., Cazajous, G., Dehan, C., Soler, C., Vong, R., Hassan, MO., Hauck Y., Boulais, C., Andriamanantena, D., & other authors (2014). Progenitor “Mycobacterium canettii” clone responsible for lymph node tuberculosis epidemic, Djibouti. Emerging Infectious Diseases 20(1), 21–28.
357 358 359
Blouin, Y., Hauck, Y., Soler, C., Fabre, M., Vong, R., Dehan, C., Cazajous, C., Massoure PL., Kraemer, P., & other authors (2012). Significance of the Identification in the Horn of Africa of an Exceptionally Deep Branching Mycobacterium tuberculosis Clade. PLoS ONE, 7(12):e52841.
360 361 362 363
Brudey, K., Driscoll JR., Rigouts, L., Prodinger, WM., Gori, A., Al-Hajoj, SA., Allix, C., Aristimuno, L., Arora, J., & other authors (2006). Mycobacterium tuberculosis complex genetic diversity: mining the fourth international spoligotyping database (SpolDB4) for classification, population genetics and epidemiology. BMC microbiology, 6, 23.
364 365
Christie, J.D. & Callihan, D.R. (1995). The laboratory diagnosis of mycobacterial diseases. Challenges and common sense. Clinics in laboratory medicine, 15(2), 279–306.
366 367 368
Comas, I.., Coscolla, M., Luo, T., Borrell, S., Holt, KE., Kato-Maeda, M., Parkhill, J., Malla, B., Berg, S, & author authors (2013). Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans. Nature genetics, 45(10), 1176–82.
369 370 371
Firdessa, R., Berg, S., Hailu, E., Schelling, E., Gumi, B., Erenso, G., Gadisa, E., Kiros, T., Habtamu, M., & other authors (2013). Mycobacterial lineages causing pulmonary and extrapulmonary tuberculosis, Ethiopia. Emerging Infectious Diseases 19(3), 460–463.
372 373
Hernandez Pando, R., Aguilar, D., Cohen, I., Guerrero, M., Ribon, W., Acosta, P., Orozco, H., Marquina, B., Salinas, C., & other authors (2010). Specific bacterial genotypes of
12 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
374 375
Mycobacterium tuberculosis cause extensive dissemination and brain infection in an experimental model. Tuberculosis, 90(4), 268–277.
376 377 378 379
Kamerbeek, J., Schouls, L., Kolk, A., van Agterveld, M., van Soolingen, D., Kuijper, S., Bunschoten, A., Molhuizen, H., Shaw, R., & other authors (1997). Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. Journal of Clinical Microbiology, 35(4), 907–914.
380 381 382 383
Van Klingeren, B., Dessens-Kroon, M., van der Laan, T., Kremer, K., van Soolingen, D. (2007). Drug susceptibility testing of Mycobacterium tuberculosis complex by use of a high-throughput, reproducible, absolute concentration method. Journal of Clinical Microbiology, 45(8), 2662– 2668.
384 385 386
López-Granados, E., Cambronero, R., Ferreira, A., Fontan, G., Garcia-Rodriguez, MC. (2003). Three novel mutations reflect the variety of defects causing phenotypically diverse X-linked hyper-IgM syndrome. Clinical and Experimental Immunology, 133(1), 123–131.
387 388 389 390
Manca, C., Tsenova, L., Barry, CE., Bergtold, A., Freeman, S., Haslett, PA., Musser, JM., Freedman, VH., Kaplan, G. (1999). Mycobacterium tuberculosis CDC1551 induces a more vigorous host response in vivo and in vitro, but is not more virulent than other clinical isolates. Journal of immunology (Baltimore, Md. : 1950), 162(11), 6740–6746.
391 392 393
Mariam, S.H. (2009). Interaction between lactic acid bacteria and Mycobacterium bovis in ethiopian fermented milk: Insight into the fate of M. bovis. Applied and Environmental Microbiology, 75(6), 1790–1792.
394 395 396 397
Parsons, L.M., Brosch, R., Cole, ST., Somoskovi, A., Loder, A., Bretzel, G., van Soolingen, D., Hale, YM., Salfinger, M. (2002). Rapid and simple approach for identification of Mycobacterium tuberculosis complex isolates by PCR-based genomic deletion analysis. Journal of Clinical Microbiology, 40(7), 2339–2345.
398 399 400
Rao, S. (2009). Sputum smear microscopy in DOTS: Are three samples necessary? An analysis and its implications in tuberculosis control. Lung India : official organ of Indian Chest Society, 26(1),3– 4.
401 402 403 404
Gagneux, S., DeRiemer, K., van T., Kato-Maeda, M., de Jong, BC., Narayanan, S., Nicol, M., Niemann, S., Kremer, K., & other authors (2006). Variable host – pathogen compatibility in Mycobacterium tuberculosis. Proceedings of the National Academy of Sciences 103(8), 2869– 2873.
405 406
Siddiqi, SH., Rüsch-Gerdes, S., 2006. MGIT Procedure Manual. Geneva, Switzerland: Foundation for Innovative New Diagnostics. July, 2006.
407 408 409 410
Sreevatsan, S., Pan, X., Stockbauer, KE., Connell, KD., Kreiswirth, BN., Whittam, TS., Musser, JM. (1997). Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proceedings of the National Academy of Sciences of the United States of America, 94(18), 9869–9874.
411 412 413 414
Supply, P., Marceau, M., Mangenot, S., Roche, D., Rouanet, C., Khanna, V., Majlessi, L., Criscuolo, A., Tap, J. & other authors (2013). Genomic analysis of smooth tubercle bacilli provides insights into ancestry and pathoadaptation of Mycobacterium tuberculosis. Nature genetics, 45(2), 172-9.
415 416
Supply, P., Allix, C., Lesjean, S., Cardoso-Oelemann, M., Rusch-Gerdes, S., Willery, E., Savine, E., de Haas, P., van Deutekom, H. & other authors (2006). Proposal for standardization of
13 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
417 418
optimized mycobacterial interspersed repetitive unit-variable-number tandem repeat typing of Mycobacterium tuberculosis. Journal of Clinical Microbiology, 44(12), 4498–4510.
419 420 421 422
Warren, RM., Gey Van Pittius, NC., Barnard, M., Hesseling, A., Engelke, E., De Kock, M., Gutierrez, MC., Victor, TC., Hoal, EG. & other authors (2006). Differentiation of Mycobacterium tuberculosis complex by PCR amplification of genomic regions of difference. International Journal of Tuberculosis and Lung Disease, 10(7): 818-822.
423 424 425 426
Yimer, S.A., Hailu, E., Derese, Y., Bjune, GA., Holm-Hansen, C. (2013). Spoligotyping of Mycobacterium tuberculosis isolates among pulmonary tuberculosis patients in Amhara Region, Ethiopia. APMIS : acta pathologica, microbiologica, et immunologica Scandinavica, 121(9), 878–85.
427 428 429
Zwadyk, P., Down, JA, Myers, N., Dey, MS. (1994). Rendering of mycobacteria safe for molecular diagnostic studies and development of a lysis method for strand displacement amplification and PCR. Journal of Clinical Microbiology, 32(9), 2140–2146.
430 431 432 433 434 435 436
DATA BIBLIOGRAPHY
437
Nebenzahl-Guimaraes, H. European Nucleotide Archive. PRJEB8432 (ERP009510). (2015).
438 439 440 441
FIGURES AND TABLES
442 443
Figure Legends
444 445
Fig. 1 Map of Ethiopia and the Amhara Region locating specific sites where the six
446
isolates were collected.
447
Fig. 2 Parsimony-based phylogeny of Mtb using 78,613 concatenated variable
448
positions (SNPs) found in at least one of 100 previously characterized RIVM 14 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
449
strains of different lineages called against M. canettii CIPT 140070010, the
450
genomically most distant strain among the M. canettii taxon (Supply et al., 2013;
451
Blouin et al., 2014; Boritsch et al., 2016),
452 453
Fig. 3 Phylogenetic lineage classification based on the presence or absence of Regions
454
of Difference (RD) and smaller deletions.
455 456
Table 1 24-locus MIRU-VNTR typing patterns of eight lineage 7 strains.
457 458
Supplementary Material Legends
459 460
Suppl. Fig. 1:
461
Aethiops vetus strains. Panel A shows the BLAST comparison of the nucleotide
462
sequence of gene mmpl9 from Aethiops vetus strain 262 used as query
463
sequence in comparison to the orthologous sequence of M. tuberculosis H37Rv
464
(sbjct). Panel B shows pairwise linear genomic comparisons of M. canettii,
465
Aethiops vetus, M. tuberculosis H37Rv, and M. bovis AF2122/97. Red lines
466
indicate collinear blocks of DNA-DNA similarity, whereas the white triangle
467
indicates a specific 27bp deletion in the orthologous lppH region of M.
468
tuberculosis lineage 7 Aethiops vetus.
Identification of small genomic deletions in the genomes of
469 470
Suppl. Fig. 2:
471
concerning the orthologues of genes lppO and sseB. Panel A shows the linear
472
genomic comparisons of M. canettii, M. tuberculosis strains Aethiops vetus,
473
H37Rv, and M. bovis AF2122/97. Red lines indicate collinear blocks of DNA-DNA
474
similarity, whereas the white triangles indicate the specific region absent from
475
Aethiops vetus/lineage 7 strains. Panel B shows the BLASTN results obtained
476
with the Aethiops vetus junction region of strain 262, used as query sequence,
Identification of a specific 1.3 kb deletion in Aethiops vetus,
15 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
477
which confirms the specific 1.3 kb deletion relative to M. tuberculosis H37Rv
478
(sbjct).
479 480
Suppl. Fig. 3:
481
Aethiops vetus strains concerning the orthologues of genes rv3467/rmlB2/mhpE.
482
Panel A shows the linear genomic comparisons of M. canettii, M. tuberculosis
483
Aethiops vetus, H37Rv, and M. bovis AF2122/97. Red lines indicate collinear
484
blocks of DNA-DNA similarity, whereas the white triangles indicate the specific
485
region absent from Aethiops vetus/lineage 7 strains. Panel B shows the BLASTN
486
results obtained with the Aethiops vetus junction region of strain 262, used as
487
query sequence, which confirms the specific 3,331 bp deletion relative to M.
488
tuberculosis H37Rv (sbjct).
Identification of a specific 3.3 kb deletion in M. tuberculosis
489 490
Suppl. Fig. 4:
491
Aethiops vetus, H37Rv, and M. bovis AF2122/97 sequences, covering the region
492
of difference RD7. Red lines indicate collinear blocks of DNA-DNA similarity,
493
whereas the white triangles indicate regions absent from selected strain lineages.
494
The figure shows that in M. canettii and Aethiops vetus the genomic region
495
encompassing orthologues of genes MCAN_19931-19961 are missing from M.
496
tuberculosis H37Rv and other lineage 4 strains. An overlapping portion of this
497
genomic region, located close to the terminus of replication region, was also
498
subject of an independent genomic deletion (RD7) in the progenitor of M.
499
africanum lineage 6 strains and animal strain lineages, including M. bovis.
Linear genomic comparison of M. canettii, M. tuberculosis
500 501
Suppl. Fig. 5:
502
Aethiops vetus, H37Rv, and M. bovis AF2122/97 sequences, covering the region
503
encoding for serine threonine kinase PknH. Red lines indicate collinear blocks of
504
DNA-DNA similarity, whereas the white triangles indicate regions absent from M.
505
tuberculosis H37Rv due to recombination of two pknH copies and deletion of the
506
adjacent gene (orthologue of MCAN_12811). Interestingly, the full pknH region
507
was also described as being present in M. africanum, which argues that the
Linear genomic comparison of M. canettii, M. tuberculosis
16 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
508
recombined region in M. bovis is due to an independent recombination event in
509
the M. africanum-M. bovis lineage, which is also suggested by the sequence
510
differences in the pknH gene between M. tuberculosis H37Rv and M. bovis, (as
511
indicated by a white block of distant similarity).
512 513
Suppl. Table 1: Additional smaller INDELs found amongst the 5 M. tuberculosis
514
lineage 7/Aethiops vetus strains (against reference strain H37Rv).
515 516
Suppl. Table 2: Overview of relevant functional studies on genes/regions where
517
LSPs/deletions specific to lineage 7 Aethiops vetus occur.
17 Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
Research paper template
518
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
Figure 1
Click here to download Figure Figure_1.tif
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
Figure 2
Click here to download Figure withPercys_Mar28 (1).png
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
Figure 3
Click here to download Figure Figure_3.tif
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
Table 1
Click here to download Table Table 1 copy.xls
Sequence IDIsolated in 68 The 838 Netherlands 54 229 230 232 Ethiopia 233 234
VNTR locus 580 2996 802 960 1644 3192 424 577 2165 2401 3690 4156 2163b 1955 4052 154 4 5 3 7 4 7 6 4 3 2 1 3 4 4 6 1 4 6 3 7 4 5 4 4 3 2 1 3 4 4 9 1 4 6 3 7 4 5 4 4 3 2 1 3 4 4 8 1 4 6 3 7 4 6 2 4 3 2 1 3 4 4 8 1 4 6 3 7 4 5 4 4 3 2 1 3 4 4 8 1 8 6 3 7 4 5 4 4 3 2 1 3 4 4 9 1 4 6 2 7 4 5 4 4 3 2 1 3 4 4 2 1 4 6 3 7 4 5 4 4 3 2 1 3 4 3 8 1
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
2531 4348 2059 2687 3007 2347 2461 3171 4 6 2 2 3 4 3 4 4 6 2 2 3 4 3 4 4 6 2 2 3 4 3 4 4 6 2 2 3 3 3 4 4 6 2 2 3 3 3 4 4 6 2 2 3 4 3 4 4 6 2 2 1 3 3 4 4 7 2 2 3 3 3 4
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
Supplementary Material Files
Click here to download Supplementary Material Files Supplementary Figures and Tables - Mar22 (1).pdf
M.t uber cul osi sAet hi opsvet us
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
M.t uber cul osi sAet hi opsvet us
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
M.t uber cul osi sAet hi opsvet us
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
M.t uber cul osi sAet hi opsvet us
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
M.t uber cul osi sAet hi opsvet us
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
Supplementary Table 1
Gene name
Rv number
Functional category
Function
mmpL9
Rv2339
Cell wall and cell processes
Unknown. Thought to be involved in fatty acid transport.
lppH lppO sseB Rv3467
Rv3576 Rv2290 Rv2291 Rv3467
Cell wall and cell processes Cell wall and cell processes Intermediary metabolism and respiration insertion seqs and phages
Unknown. Unknown. Unknown. Unknown.
galE1 (rmlB2) Rv3634c
Intermediary metabolism and respiration
Involved in galactofuranosyl biosynthesis.
mhpE
Rv3469c
Intermediary metabolism and respiration
Involved in aromatic hydrocarbons degradation
pknH
Rv1266c
Regulatory proteins
Involved in signal transduction (via phosphorylation).
Relevant findings Decrease in survival observed in mmpL9 mutant in hydrogen peroxide (Mestre O et al., 2013). MmpL-mediated lipid secretion contributes to innate ability of pathogen to survive intracellularly and contributes directly to host-pathogen dialogue that determines the ultimate outcome of infection (Domenech P., 2004).
Has been implicated in pathogenesis (Graham, JE et al., 1999 & McKinney, JD et al., 2000). rmlB2 genes observed to be upregulated in macrophages (Fontan, P., 2008), thus a putative bacterial pathogenic factor relevant for the intracellular survival of MTB. Its disruption in H37Rv confers a higher bacillary load (hypervirulence) during the chronic phase of infection in BALB/c mice (Papavinasasasundaram, KG et al., 2005). Its differential expression in response to stress conditions indicate its ability to regulate cellular events promoting bacterial adaptation to environmental change (Sharma, K., et al., 2004).
Downloaded from www.microbiologyresearch.org by IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41
Supplementary Table 2 position 30,980 125,830 131,174 139,556 194,305 230,576 234,496 289,125 293,628 424,320 424,790 467,497 467,508 592,759 641,719 688,792 750,089 874,835 976,897 1,010,204 1,086,490 1,090,188 1,131,229 1,165,521 1,168,715 1,273,250 1,365,837 1,474,959 1,488,557 1,541,066 1,696,499 1,753,519 1,878,817 1,894,301 2,001,789 2,090,401 2,109,523 2,133,474 2,137,522 2,143,523 2,161,343 2,207,591 2,218,577 2,342,650 2,358,029 2,421,975 2,523,205 2,536,625 2,564,368 2,606,797 2,632,341 2,704,885 2,960,944 3,131,472 3,194,241 3,293,700 3,473,996 3,539,139 3,590,686 3,610,391 3,723,901 3,794,867 3,801,551 3,819,797 4,170,964 4,197,138 4,277,517 4,306,873 4,337,820 4,341,396 4,358,979 4,360,680 4,383,144 34,511 77,818 105,175 137,851 154,179 173,080 208,317 289,070 323,029 364,499 373,283 374,762 384,541 428,084 479,982 597,146 843,931 854,253 925,343 952,811 1,218,649 1,340,653 1,385,074 1,415,281 1,502,753 1,546,822 1,561,664 1,655,680 1,677,116 1,714,132 1,972,840 1,978,239 1,992,324 2,058,119 2,094,912 2,127,928 2,183,462 2,189,574 2,192,436 2,214,655 2,239,633 2,339,611 2,357,269 2,368,565 2,387,186 2,448,884 2,453,896 2,525,723 2,534,563 2,684,690 2,724,202 2,798,719 2,803,090 2,805,386 2,881,598 2,957,569 3,020,370 3,190,146 3,331,362 3,381,237 3,415,181 3,426,069 3,580,637 3,627,387 3,683,269 3,734,179 3,742,992 3,827,445 3,844,757 3,862,473 3,874,723 3,890,100 4,035,996 4,095,002 4,095,297 4,169,667 4,189,684 4,198,612 4,307,622 4,338,596 4,359,229 4,386,640 4,400,661 3,925,504 2,133,348 2,881,582 3,111,343 1,026,917 2,469,918 3,120,177 4,375,354 1,002,282 1,103,437 2,566,766 2,782,084 3,907,771 4,359,186 737,747 943,981 1,543,532 1,691,287 1,820,293 3,378,653 3,907,467 4,031,612 4,032,430 4,081,108 4,387,690 357,261 3,966,640 791,029 1,338,795 1,541,079 37,927 691,888 2,960,896 3,119,884 3,687,568 3,734,936 3,905,040
mutation +CGAT +A +G +C +GG +T +GT +T +C +C +TA +G +G +A +C +G +C +CG +G +G +C +G +G +A +T +A +GG +CA +C +C +CGGT +C +G +10 bp +GG +12 bp +G +15 bp +10 bp +T +T +C +C +11 bp +G +GGAA +CGC +G +C +C +A +21 bp +C +9 bp +G +G +A +G +C +C +T +CA +T +CA +A +T +G +A +G +11 bp +G +A +CGGGG Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ53 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ15 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ9 bp Δ1 bp Δ3 bp Δ13 bp Δ9 bp Δ1 bp Δ1 bp Δ9 bp Δ3 bp Δ1 bp Δ1 bp Δ2 bp Δ1 bp Δ1 bp Δ6 bp Δ3 bp Δ1 bp Δ3 bp Δ1 bp Δ9 bp Δ1 bp Δ6 bp Δ3 bp Δ1 bp Δ5 bp Δ2 bp Δ1 bp Δ1 bp Δ2 bp Δ1 bp Δ20 bp Δ1 bp Δ18 bp Δ1 bp Δ1 bp Δ1 bp Δ9 bp Δ1 bp Δ2 bp Δ1 bp Δ14 bp Δ6 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ3 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ12 bp Δ1 bp Δ9 bp Δ1 bp Δ1 bp Δ9 bp Δ1 bp Δ5 bp +G +T +C Δ1 bp Δ1 bp Δ144 bp Δ1 bp +CC +A +G +27 bp +T +54 bp Δ5 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ1 bp Δ7 bp Δ268 bp Δ3 bp Δ1 bp Δ12 bp +C +G Δ1 bp Δ3 bp Δ1 bp Δ1 bp Δ1 bp Δ18 bp Δ147 bp Δ1 bp Δ1 bp Δ3 bp
annotation coding (1259/1347 nt) coding (4712/4899 nt) intergenic (-70/-208) coding (1044/1161 nt) coding (634/795 nt) intergenic (+114/-323) coding (2266/2289 nt) coding (22/234 nt) intergenic (+135/+170) coding (375/426 nt) coding (9890/9903 nt) coding (505/543 nt) coding (494/543 nt) coding (1106/1131 nt) coding (1093/1716 nt) coding (761/828 nt) coding (90/1506 nt) coding (603/711 nt) coding (1307/1332 nt) coding (69/1599 nt) coding (856/1590 nt) intergenic (-13/-185) coding (102/294 nt) intergenic (-22/+260) coding (514/525 nt) coding (828/912 nt) intergenic (+29/-38) noncoding (1302/3138 nt) coding (1409/1812 nt) coding (47/786 nt) intergenic (-56/+228) coding (10/528 nt) coding (3514/6381 nt) coding (1042/1119 nt) coding (1176/1857 nt) coding (318/1038 nt) intergenic (+53/-21) coding (219/462 nt) coding (558/561 nt) coding (1003/1155 nt) coding (881/1104 nt) intergenic (-789/-109) coding (526/666 nt) coding (843/1137 nt) intergenic (-1352/-360) coding (304/636 nt) intergenic (+30/+36) coding (1726/1779 nt) coding (665/741 nt) coding (1525/1614 nt) intergenic (-266/+582) coding (189/822 nt) coding (1498/2337 nt) coding (302/2430 nt) coding (1308/1383 nt) coding (2654/4851 nt) intergenic (-92/-11) coding (635/1347 nt) intergenic (+69/+6) coding (799/816 nt) coding (246/387 nt) coding (1/1611 nt) intergenic (+88/-102) coding (1756/2361 nt) coding (751/930 nt) intergenic (+31/-98) coding (947/1515 nt) coding (7/792 nt) coding (1044/1173 nt) coding (1127/1209 nt) coding (804/2190 nt) coding (1246/1383 nt) coding (497/633 nt) coding (217/2316 nt) coding (1079/1278 nt) coding (41/411 nt) coding (533/591 nt) between 53 bp Mycobacterial Interspersed Repetitive Unit, Class II coding (870/933 nt) coding (866/969 nt) intergenic (+28/-34) coding (245/510 nt) intergenic (+30/-106) coding (2429/2892 nt) coding (936-950/2892 nt) coding (7/615 nt) coding (6596/9903 nt) coding (194/372 nt) coding (388/444 nt) coding (484/1173 nt) intergenic (-27/+14) coding (393/414 nt) intergenic (+100/-14) coding (2181-2189/2562 nt) intergenic (+129/-6) coding (1602-1604/1689 nt) coding (548-560/1881 nt) coding (113-121/441 nt) coding (811/981 nt) coding (201/309 nt) coding (1034-1042/1113 nt) coding (176-178/435 nt) intergenic (+79/+40) coding (703/1431 nt) coding (1328-1329/1599 nt) coding (254/2745 nt) coding (592/666 nt) coding (272-277/1458 nt) coding (2025-2027/2064 nt) coding (91/498 nt) coding (1079-1081/2520 nt) coding (343/516 nt) coding (533-541/1134 nt) coding (630/954 nt) coding (903-908/2166 nt) intergenic (-592/-1118) intergenic (+123/+418) intergenic (+15/+12) coding (725-726/1803 nt) coding (78/1938 nt) coding (322/420 nt) coding (989-990/1083 nt) coding (12/1692 nt) intergenic (-19/+9) coding (2162/3414 nt) coding (3130-3147/4983 nt) coding (851/4983 nt) coding (190/294 nt) intergenic (+137/-3) coding (80-88/258 nt) coding (7/570 nt) coding (250-251/255 nt) intergenic (-244/+138) intergenic (-223/+241) coding (486-491/1524 nt) coding (112/159 nt) intergenic (-38/+32) coding (2695/2913 nt) coding (2757/7572 nt) intergenic (-218/+206) coding (1281-1283/1737 nt) coding (1214/1233 nt) intergenic (-83/+151) coding (320/333 nt) coding (634/786 nt) coding (140/828 nt) coding (299/378 nt) coding (4/378 nt) coding (201/243 nt) coding (538-549/948 nt) intergenic (-15/-262) coding (756-764/792 nt) intergenic (-75/-253) coding (554/2190 nt) coding (167-175/450 nt) coding (476/669 nt) coding (615-619/1509 nt) coding (345/462 nt) coding (174/294 nt) coding (481/1044 nt) intergenic (-101/-187) coding (546/1077 nt) between 36 bp direct repeat coding (330/1200 nt) coding (134/1608 nt) coding (896/2568 nt) intergenic (+216/-6) coding (179/4875 nt) coding (105/324 nt) coding (597/2190 nt) coding (399-403/882 nt) coding (214/1515 nt) coding (174/1470 nt) coding (438/1029 nt) coding (1401/1731 nt) intergenic (-410/+58) intergenic (-289/-194) coding (1280-1547/1755 nt) coding (727-729/1755 nt) intergenic (-343/-52) coding (195-206/531 nt) coding (804/1203 nt) coding (399/1155 nt) coding (45/1050 nt) intergenic (-282/-206) coding (60/786 nt) coding (669/1689 nt) coding (1388/1527 nt) coding (1529-1546/2337 nt) between 36 bp direct repeat coding (10/666 nt) coding (2000/7572 nt) coding (525-527/945 nt)
gene description Rv0026 → hypothetical protein ctpI ← cation-transporter ATPase I Rv0108c ← / → PE_PGRS1 hypothetical protein/PE-PGRS family protein hddA → D-alpha-D-heptose-7-phosphate kinase Rv0165c ← GntR family transcriptional regulator Rv0194 → / → Rv0195 drugs-transport transmembrane ATP-binding protein ABC transporter/two component transcriptional regulatory protein Rv0197 → oxidoreductase Rv0239 → hypothetical protein fadA2 → / ← fadE5 acetyl-CoA acetyltransferase/acyl-CoA dehydrogenase FADE5 PPE7 ← PPE family protein PPE8 ← PPE family protein PPE9 ← PPE family protein PPE9 ← PPE family protein galE2 → UDP-glucose 4-epimerase fadD8 ← acyl-CoA synthetase mce2B → MCE-family protein MCE2B Rv0654 → dioxygenase ptrBa → oligopeptidase B PPE13 ← PPE family protein Rv0907 → hypothetical protein accD2 ← acetyl-/propionyl-CoA carboxylase subunit beta Rv0976c ← / → PE_PGRS16 hypothetical protein/PE-PGRS family protein Rv1012 → hypothetical protein Rv1042c ← / ← Rv1043c IS like-2 transposase/hypothetical protein Rv1046c ← hypothetical protein mmpL13a → transmembrane transport protein MmpL13A Rv1222 → / → htrA hypothetical protein/serine protease HtrA rrl → ribosomal RNA 23S PE_PGRS24 ← PE-PGRS family protein lprF → lipoprotein LprF Rv1506c ← / ← Rv1507c hypothetical protein/hypothetical protein fadD11.1 → fatty-acid-CoA ligase polyketide synthase pks7 pks7 → Rv1668c ← macrolide-transport ATP-binding protein ABC transporter PE_PGRS31 → PE-PGRS family protein hypothetical protein Rv1841c ← Rv1861 → / → adhA transmembrane protein/alcohol dehydrogenase AdhA Rv1883c ← hypothetical protein Rv1888c ← transmembrane protein Rv1895 → dehydrogenase isocitrate lyase aceAa → mce3R ← / → yrbE3A TetR family transcriptional regulator/integral membrane protein YrbE3A Rv1975 → hypothetical protein hypothetical protein Rv2084 → Rv2097c ← / → Rv2100 hypothetical protein/hypothetical protein Rv2160A ← hypothetical protein Rv2248 → / ← glpD1 hypothetical protein/glycerol-3-phosphate dehydrogenase Rv2264c ← hypothetical protein hypothetical protein Rv2293c ← Rv2333c ← integral membrane transport protein membrane-associated phospholipase C/PPE family protein plcA ← / ← PPE38 Rv2407 → ribonuclease Z PE_PGRS46 ← PE-PGRS family protein hypothetical protein Rv2823c ← transposase Rv2885c ← pks1 ← polyketide synthase PKS1 prfB ← / → fprA peptide chain release factor 2/NADPH:adrenodoxin oxidoreductase FPRA (NADPH-ferredoxin reductase) flavin-containing monoamine oxidase aofH → hypothetical protein/SOJ/PARA-like protein Rv3212 → / ← Rv3213c Rv3234c ← hypothetical protein Rv3337 → hypothetical protein 1-deoxy-D-xylulose-5-phosphate synthase dxs2 ← transposase/PE-PGRS family protein Rv3387 → / → PE_PGRS52 Rv3401 → hypothetical protein Rv3725 → oxidoreductase hypothetical protein/hypothetical protein Rv3747 → / → Rv3748 PE-PGRS family protein PE_PGRS62 → Rv3833 → AraC family transcriptional regulator Rv3860 → hypothetical protein hypothetical protein Rv3864 → hypothetical protein Rv3879c ← Rv3881c ← hypothetical protein Rv3897c ← hypothetical protein 8-amino-7-oxononanoate synthase BioF2 bioF2 → serine hydroxymethyltransferase glyA ← Rv0095c ← hypothetical protein gmhA → phosphoheptose isomerase trehalose synthase TRES/hypothetical protein treS → / → Rv0127 hypothetical protein Rv0146 → Rv0176 → mce associated transmembrane protein Rv0238 → / → Rv0239 TetR family transcriptional regulator/hypothetical protein hypothetical protein Rv0268c ← hypothetical protein/TetR/ACRR family transcriptional regulator Rv0301 → / → Rv0302 PPE6 ← PPE family protein PPE6 ← PPE family protein muconolactone isomerase Rv0316 → PPE family protein PPE8 ← Rv0401 → transmembrane protein mmpS2 → membrane protein acyl-CoA dehydrogenase FADE9 fadE9 ← hypothetical protein/hypothetical protein Rv0759c ← / ← Rv0760c PE_PGRS12 → PE-PGRS family protein far → / → Rv0856 fatty-acid-CoA racemase/hypothetical protein PE-PGRS family protein PE_PGRS22 → PPE family protein/Esat-6 like protein esxK (Esat-6 like protein 3) PPE18 → / → esxK PE-PGRS family protein PE_PGRS23 ← Ser/Thr protein kinase H pknH ← hypothetical protein Rv1334 → glycolipid sulfotransferase Rv1373 → PE15 → PE family protein PE_PGRS29 ← PE-PGRS family protein hypothetical protein Rv1487 → acyl-CoA synthetase/transmembrane transport protein MmpL12 fadD25 → / ← mmpL12 anchored-membrane serine/threonine-protein kinase PKNF (protein kinase F) (STPK F) pknF → acyl-CoA synthetase fadD1 ← PE-PGRS family protein wag22 ← hypothetical protein Rv1815 → 6-phosphogluconate dehydrogenase gnd1 ← integral membrane protein Rv1877 → thiol peroxidase tpx → oxygenase Rv1937 → oxidoreductase Rv1939 → MCE-family lipoprotein LprM lprM → hypothetical protein Rv1996 → hypothetical protein Rv2082 → hypothetical protein/hypothetical protein Rv2097c ← / → Rv2100 PPE family protein/proteasome (alpha subunit) PrcA PPE36 → / ← prcA hypothetical protein/PE-PGRS family protein Rv2125 → / ← PE_PGRS37 long-chain-fatty-acid-CoA ligase fadD15 (fatty-acid-CoA synthetase) (fatty-acid-CoA synthase) fadD15 → hypothetical protein Rv2191 → flavoprotein Rv2250A → hypothetical protein Rv2262c ← ferredoxin-dependent nitrite reductase NIRA nirA → hypothetical protein/gamma-glutamyl phosphate reductase Rv2426c ← / ← proA LuxR family transcriptional regulator Rv2488c ← PE-PGRS family protein PE_PGRS43 ← PE-PGRS family protein PE_PGRS43 ← hypothetical protein Rv2561 → hypothetical protein/hypothetical protein Rv2630 → / → Rv2631 hypothetical protein Rv2706c ← hypothetical protein Rv2879c ← hypothetical protein Rv2975c ← PE family protein/transposase PE29 ← / ← Rv3023c glutaredoxin electron transport protein NrdH/hypothetical protein nrdH ← / ← Rv3054c ATP-dependent DNA ligase ligB → hypothetical protein Rv4008 → two component sensory transduction transcriptional regulatory protein MTRA/thymidylate kinase mtrA ← / ← tmk arylsulfatase AtsB atsB ← PPE family protein PPE54 ← PE-PGRS family protein/hypothetical protein PE_PGRS50 ← / ← Rv3346c cholesterol oxidase precursor choD ← transposase Rv3428c ← 50S ribosomal protein L13/ESAT-6 like protein ESXT rplM ← / ← esxT transmembrane protein Rv3453 → peroxidase BpoA bpoA ← hypothetical protein Rv3594 → hypothetical protein Rv3655c ← hypothetical protein Rv3655c ← cutinase precursor cut5a → PPE family protein PPE66 ← excisionase/integrase Rv3750c ← / → Rv3751 AraC family transcriptional regulator Rv3833 → transcriptional regulatory protein WHIB-like WHIB6/hypothetical protein whiB6 ← / → Rv3863 hypothetical protein Rv3879c ← hypothetical protein Rv3901c ← RNA polymerase sigma factor SigM sigM → acyl-CoA synthetase fadD17 → hypothetical protein Rv1883c ← hypothetical protein Rv2561 → hypothetical protein Rv2802c ← transposase/resolvase Rv0920c ← / → Rv0921 hypothetical protein Rv2205c ← hypothetical protein/transposase Rv2813 → / ← Rv2814c PPE family protein PPE69 ← oxidoreductase Rv0897c ← adhesion component transport transmembrane protein ABC transporter Rv0987 → aminotransferase/hypothetical protein Rv2294 → / → Rv2295 NAD-dependent glutamate dehydrogenase gdh ← hypothetical protein Rv3488 → hypothetical protein Rv3879c ← methoxy mycolic acid synthase mmaA3 ← oxidase Rv0846c ← hypothetical protein Rv1371 → glycosyltransferase Rv1500 → cytochrome' transport transmembrane ATP-binding protein ABC transporter CydC cydC ← PPE family protein/secreted ESAT-6 like protein ESXR (TB10.3) (ESAT-6 like protein 9) PPE46 ← / ← esxR esterase/lipase LipF/hypothetical protein lipF ← / → Rv3488 PE-PGRS family protein PE_PGRS58 ← PE-PGRS family protein PE_PGRS58 ← hypothetical protein/hypothetical protein Rv3642c ← / → Rv3643 hypothetical protein Rv3902c ← hypothetical protein Rv0293c ← hypothetical protein Rv3529c ← hypothetical protein Rv0690c ← hypothetical protein/PE family protein Rv1194c ← / → PE13 lipoprotein LprF lprF → fatty-acid-CoA ligase fadD34 → MCE-family protein MCE2D mce2D → PE-PGRS family protein PE_PGRS46 ← hypothetical protein/transposase Rv2813 → / ← Rv2814c phosphate transporter PhoU phoY1 ← PPE family protein PPE54 ← by short-chain dehydrogenase Rv3485c ← Downloaded from www.microbiologyresearch.org IP: 204.44.112.220 On: Mon, 16 May 2016 06:39:41