pathogen Xylella fastidiosa - CTC-USP

1 downloads 0 Views 543KB Size Report
Jul 13, 2000 - twigs collected in Macaubal (Sa˜o Paulo, Brazil) on May 21, 1992 ...... RNA processing Cell structure Undefined category Cellular processes.
articles

The genome sequence of the plant pathogen Xylella fastidiosa The Xylella fastidiosa Consortium of the Organization for Nucleotide Sequencing and Analysis*, Sa˜o Paulo, Brazil * A full list of authors appears at the end of this paper

............................................................................................................................................................................................................................................................................

Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis—a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign putative functions to 47% of the 2,904 predicted coding regions. Efficient metabolic functions are predicted, with sugars as the principal energy and carbon source, supporting existence in the nutrient-poor xylem sap. The mechanisms associated with pathogenicity and virulence involve toxins, antibiotics and ion sequestration systems, as well as bacterium–bacterium and bacterium–host interactions mediated by a range of proteins. Orthologues of some of these proteins have only been identified in animal and human pathogens; their presence in X. fastidiosa indicates that the molecular basis for bacterial pathogenicity is both conserved and independent of host. At least 83 genes are bacteriophage-derived and include virulence-associated genes from other bacteria, providing direct evidence of phage-mediated horizontal gene transfer. Citrus variegated chlorosis (CVC), which was first recorded in Brazil in 1987, affects all commercial sweet orange varieties1. Symptoms include conspicuous variegations on older leaves, with chlorotic areas on the upper side and corresponding light brown lesions, with gum-like material on the lower side. Affected fruits are small, hardened and of no commercial value. A strain of Xylella fastidiosa was first identified as the causal bacterium in 1993 (ref. 2) and found to be transmitted by sharpshooter leafhoppers in 1996 (ref. 3). CVC control is at present limited to removing infected shoots by pruning, the application of insecticides and the use of healthy plants for new orchards. In addition to CVC, other strains of X. fastidiosa cause a range of economically important plant diseases including Pierce’s disease of grapevine, alfalfa dwarf, phony peach disease, periwinkle wilt and leaf scorch of plum, and are also associated with diseases in mulberry, pear, almond, elm, sycamore, oak, maple, pecan and coffee4. The triply cloned X. fastidiosa 9a5c, sequenced here, was derived from the pathogenic culture 8.1b obtained in 1992 in Bordeaux (France) from CVC-affected Valencia sweet orange twigs collected in Macaubal (Sa˜o Paulo, Brazil) on May 21, 1992 (ref. 2). Strain 9a5c produces typical CVC symptoms on inoculation into experimental citrus plants5, and into Nicotiana tabacum (S. A. Lopes, personal communication) and Catharantus roseus (P. BrantMonteiro, personal communication)—two novel experimental hosts.

General features of the genome The basic features of the genome are listed in Table 1 and a detailed map is shown in Fig. 1. The conserved origin of replication of the large chromosome has been identified in a region between the putative 50S ribosomal protein L34 and gyrB genes containing dnaA, dnaN and recF6. The Escherichia coli DnaA box consensus sequence TTATCCACA is found on both DNA strands close to dnaA. In addition, there are typical 13-nucleotide (ACCACCACCACCA) and 9-nucleotide (two TTTCATTGG and two TTTTATATT) sequences in other intergenic sequences of this region. This region is coincident with the calculated GC-skew signal inversion7. We have designated base 1 of the X. fastidiosa genome as the first Tof the only TTTTAT sequence found between the ribosomal protein L34 gene and dnaA. The overall percentage of open reading frames (ORFs) for which a putative biological function could be assigned (47%) was slightly below that for other sequenced genomes such as Thermotoga maritima8 (54%), Deinococcus radiodurans9 (52.5%) and Neisseria NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

meningitidis10 (53.7%). This may reflect the lack of previous complete genome sequences from phytopathogenic bacteria. Plasmid pXF1.3 contains only two ORFs, one of which encodes a replication-associated protein. Plasmid pXF51 contains 64 ORFs, of which 5 encode proteins involved in replication or plasmid stability and 20 encode proteins potentially involved in conjugative transfer. One ORF encodes a protein similar to the virulenceassociated protein D (VapD), found in many other bacterial pathogens11. Four regions of pXF51 present significant DNA similarity to parts of transposons found in plasmids from other bacteria, suggesting interspecific horizontal exchange of genetic material. The principal paralogous families are summarized in Table 2. The complete list of ORFs with assigned function is shown in Table 3. Seventy-five proteins present in the 21 completely sequenced genomes in the COG database12 (as of 15th March 2000) were also found in X. fastidiosa. Each of these sequences was used to Table 1 General features of the Xylella fastidiosa 9a5c genome Main chromosome Length (bp) G+C ratio Open reading frames (ORFs) Coding region (% of chromosome size) Average ORF length (bp) ORFs with functional assignment ORFs with matches to conserved hypothetical proteins ORFs without significant data base match Ribosomal RNA operons

tRNAs tmRNA

2,679,305 52.7% 2,782 88.0% 799 1,283 310 1,083 2 (16SrRNA-Ala-TGC-tRNA-Ile-GATtRNA-23SrRNA-5SrRNA) 49 (46 different sequences corresponding to all 20 amino acids) 1

.............................................................................................................................................................................

Plasmid pXF51 Length (bp) G+C ratio Open reading frames (ORFs) Protein coding region (% of plasmid size) ORFs with functional assignment ORFs with matches to conserved hypothetical proteins ORFs without significant data base match

51,158 49.6% 64 86.9% 30 8 24

.............................................................................................................................................................................

Plasmid pXF1.3 Length (bp) G+C ratio Open reading frames (ORFs) ORFs with functional assignment

1,285 55.6% 2 1

.............................................................................................................................................................................

© 2000 Macmillan Magazines Ltd

151

articles Table 2 Largest families of paralogous genes Family (total number of families = 312)

Number of genes (total number of genes = 853)

.............................................................................................................................................................................

ATP-binding subunits of ABC transporters Reductases/dehydrogenases Two-component system, regulatory proteins Hypothetical proteins Transcriptional regulators Fimbrial proteins Two-component system, sensor proteins

23 12 12 10 9 9 9

.............................................................................................................................................................................

generate a phylogenetic tree of the 22 organisms. In 69% of such trees, X. fastidiosa was grouped with Haemophilus influenzae and E. coli, consistent with a phylogenetic analysis undertaken with the 16S rRNA gene13. One ORF, a cytosine methyltransferase (XF1774), is interrupted by a Group II intron. The intron was identified on the basis of the presence of a reverse transcriptase-like gene (as in other Group II introns), conserved splice sites, conserved sequence in structure V and conserved elements of secondary structure14. Group II introns are rare in prokaryotes, but have been found in different evolutive lineages including E. coli, cyanobacteria and proteobacteria15.

Transcription, translation and repair The basic transcriptional and translational machinery of X. fastidiosa is similar to that of E. coli16. Recombinational repair, nucleotide and base-excision repair, and transcription-coupled repair are present with some noteworthy features. For example, no photolyase was found, indicating exclusively dark repair. Although the main genes of the SOS pathway, recA and lexA, are present, ORFs corresponding to the three DNA polymerases induced by SOS in E. coli (DNA polymerases II, IVand V)17 are missing, indicating that the mutational pathway itself may be distinct.

Energy metabolism Even though X. fastidiosa is, as its name suggests, a fastidious organism, energy production is apparently efficient. In addition to all the genes for the glycolytic pathway, all genes for the tricarboxylic acid cycle and oxidative and electron transport chains are present. ATP synthesis is driven by the resulting chemiosmotic proton gradient and occurs by an F-type ATP synthase. Fructose, mannose and glycerol can be utilized in addition to glucose in the glycolytic pathway. There is a complete pathway for hydrolysis of cellulose to glucose, consisting of 1,4-b-cellobiosidase, endo-1,4-b-glucanase and b-glucosidase, suggesting that cellulose breakdown may supplement the often low concentrations of monosaccharides in the xylem18. Two lipases are encoded in the genome, but there is no b-oxidation pathway for the hydrolysis of fatty acids, presumably precluding their utilization as an alternative carbon and energy source. Likewise, although enzymes required for the breakdown of threonine, serine, glycine, alanine, aspartate and glutamate are present, pathways for the catabolism of the other naturally occurring amino acids are incomplete or absent. The gluconeogenesis pathway appears to be incomplete. Phosphoenolpyruvate carboxykinase and the gluconeogenic enzyme fructose-1,6-bisphosphatase, which are required to bypass the irreversible step in glycolysis, are not present. The absence of the first is compensated by the presence of phosphoenolpyruvate synthase and malate oxidoreductase, which together can generate phosphoenolpyruvate from malate. There appears, however, to be no known compensating pathway for the absence of fructose-1,6bisphosphatase. It is possible that among the large number of unidentified X. fastidiosa genes there are non-homologous genes that compensate for steps in such critical pathways. Barring this possibility, however, the absence of a functional gluconeogenesis pathway implies a strict dependence on carbohydrates both as a 152

source of energy and anabolic precursors. The glyoxylate cycle is absent and the pentose phosphate pathway is incomplete. In the latter pathway, genes for neither 6-phosphogluconic dehydrogenase nor transaldolase were identified.

Small molecule metabolism X. fastidiosa exhibits extensive biosynthetic capabilities, presumably an absolute requirement for a xylem-dwelling bacterium. Most of the genes found in E. coli necessary for the synthesis of all amino acids from chorismate, pyruvate, 3-phosphoglycerate, glutamate and oxaloacetic acid16 were identified. However, some genes in X. fastidiosa are bi-functional, such as phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP pyrophosphatase (XF2213), aspartokinase/homoserine dehydrogenase I (XF2225), imidazoleglycerolphosphate dehydratase/histidinol-phosphate phosphatase (XF2217) and a new diaminopimelate decarboxylase/aspartate kinase (XF1116) that would catalyse the first and the last steps of lysine biosynthesis. In addition, the gene for acetylglutamate kinase (XF1001) has an acetyltransferase domain at its carboxy-terminal end that would compensate for the missing acetyltransferase in the arginine biosynthesis pathway. Other missing genes include phosphoserine phosphatase, cystathionine b-lyase, homoserine O-succinyltransferase and 2,4,5-methyltetrahydrofolate-homocysteine methyltransferase. The first two enzymes are also absent in the Bacillus subtilis genome, the third is absent in Haemophilus influenzae and the fourth is missing in both genomes12. We thus presume that alternative, unidentified enzymes complete the biosynthetic pathways in these organisms and in X. fastidiosa. The pathways for the synthesis of purines, pyrimidines and nucleotides are all complete. X. fastidiosa is also apparently capable of both synthesizing and elongating fatty acids from acetate. Again, however, some E. coli enzymes were not found, such as holo acylcarrier-protein synthase (also absent in Synechocystis sp., H. influenzae and Mycoplasma genitalium) and enoyl-ACP reductase (NADPH) (FabI) (also absent from M. genitalium, Borrelia burgdorferi and Treponema pallidum)12. X. fastidiosa appears to be capable of synthesizing an extensive variety of enzyme cofactors and prosthetic groups, including biotin, folic acid, pantothenate and coenzyme A, ubiquinone, glutathione, thioredoxin, glutaredoxin, riboflavin, FMN, FAD, pyrimidine nucleotides, porphyrin, thiamin, pyridoxal 59-phosphate and lipoate. In a number of the synthetic pathways, one or more of the enzymes present in E. coli are absent, but this is also true for at least one other sequenced Gram-negative bacterial genome in each case12. We therefore again infer that the missing enzymes are either not essential or replaced by unknown proteins with novel structures.

Transport-related proteins A total of 140 genes encoding transport-related proteins were identified, representing 4.8% of all ORFs. For comparison, E. coli, B. subtilis and M. genitalium have around 10% of genes encoding transport proteins, whereas Helicobacter pylori, Synechocystis sp. and Methanococcus jannaschii have 3.5–5.4% (ref. 19). Transport systems are central components of the host–pathogen relationship (Fig. 2). There are a number of ion transporters and transporters for the uptake of carbohydrates, amino acids, peptides, nitrate/nitrite, sulphate, phosphate and vitamin B12. Many different transport

Figure 1 Linear representation of the main chromosome and plasmids pXF51 and Q pXF1.3 of the Xylella fastidiosa genome. Genes are coloured according to their biological role. Arrows indicate the direction of transcription. Genes with frameshift and point mutations are indicated with an X. Ribosomal RNA genes, the tmRNA, the principal repeats, prophages and the group II intron are indicated by coloured lines. Transfer RNAs are identified by a single letter identifying the amino acid. Pie chart represents the distribution of the number of genes according to biological role. The numbers below protein-producing genes correspond to gene IDs.

© 2000 Macmillan Magazines Ltd

NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

articles families are represented and include both small and large mechanosensitive conductance ion channels, a monovalent cation:proton antiporter (CAP-2) and a glycerol facilitator belonging to the major intrinsic protein (MIP) family. In addition, 23 ABC transport systems comprising 41 genes can be identified. X. fastidiosa appears to possess a phosphotransferase system (PTS) that typically mediates small carbohydrate uptake. There are both the enzyme I and HPr components of this system, as well as a gene supposedly involved in its regulation (pstK or hprK); however, there is no PTS permease—an essential component of the phosphotransferase complex. The functionality of the system therefore remains in question. There are five outer membrane receptors, including siderophores, ferrichrome-iron and haemin receptors, which are all associated with iron transport. The energizing complexes, TonB–ExbB–ExbD and the paralogous TolA–TolR–TolQ, essential for the functioning of the outer membrane receptors, are also present. In all, 67 genes encode proteins involved in iron metabolism. We propose that in X. fastidiosa the uptake of iron and possibly of other transition metal ions such as manganese causes a reduction in essential micronutrients in the plant xylem, contributing to the typical symptoms of leaf variegation. The X. fastidiosa genome encodes a battery of proteins that mediate drug inactivation and detoxification, alteration of potential drug targets, prevention of drug entry and active extrusion of drugs and toxins. These include ABC transporters and transport processes driven by a proton gradient. Of the latter, eight belong to the hydrophobe/amphiphile efflux-1 (HAE1) family, which act as multidrug resistance factors.

system controlling transcription of fimbrial subunits, presumably in response to host cues, and pilG, H, I, J and chpA, which encode a chemotactic system transducing environmental signals to the pilus machinery. In addition to the EPS and fimbriae, which are likely to have central roles in the clumping of bacteria and in adhesion to the xylem walls, we also identified outer membrane protein homologues for afimbrial adhesins. Although fimbrial adhesins are well characterized as crucial virulence factors in both plant and human pathogens25, afimbrial adhesins, which are directly associated with the bacterial cell surface, have been hitherto associated only with human and animal pathogens, where they promote adherence to epithelial tissue. Of the three putative adhesins of this kind identified in X. fastidiosa, two exhibit significant similarity to each other (XF1981, XF1529) and to the hsf and hia gene products of H. influenzae26. The third (XF1516) is similar to the uspA1 gene product of Moraxella catarrhalis27. All these proteins share the common C-terminal domain of the autotransporter family28. Direct experimentation will be required to establish whether these adhesins promote binding to plant cell structures or components of the insect vector foregut, or both. Nevertheless, their presence in the X. fastidiosa genome adds to the increasing evidence for the generality of mechanisms of bacterial pathogenicity, irrespective of the host organism29. We also identified three different haemagglutinin-like genes. Again, similar genes have not previously been identified in plant pathogens. These genes (XF2775, XF2196, XF0889) are the largest in the genome and exhibit highest similarity to a Neisseria meningitidis putative secreted protein10.

Adhesion

Intervessel migration

X. fastidiosa is characteristically observed embedded in an extracellular translucent matrix in planta20. Clumps of bacteria form within the xylem vessels leading to their blockage and symptoms of the disease such as water-stress leaf curling. We deduce, from our analysis of the complete genome sequence, that the matrix is composed of extracellular polysaccharides (EPSs) synthesized by enzymes closely related to those of Xanthomonas campestris pv campestris (Xcc) that produce what is commercially known as xanthan gum. In comparison with Xcc, however, we did not find gumI (encoding glycosyltransferase V, which incorporates the terminal mannose), gumL (encoding ketalase which adds pyruvate to the polymer) or gumG (encoding acetyltransferase which adds acetate), suggesting that Xylella gum may be less viscous than its Xanthomonas counterpart. Positive regulation of the synthesis of extracelullar enzymes and EPS in Xanthomomas is effected by proteins coded by the rpf (regulation of pathogenicity factors) gene cluster21. Mutations in any of these genes in Xanthomomas results in failure to synthesize the EPS. In consequence, the strain becomes non-pathogenic21. X. fastidiosa contains genes that encode RpfA, RpfB, RpfC and RpfF, suggesting that both bacteria may regulate the synthesis of pathogenic EPS factors through similar mechanisms. Fimbria-like structures are readily apparent upon electron microscopical observation of X. fastidiosa within both its plant and insect hosts22. Because of the high velocity of xylem sap passing through narrow portions of the insect foregut, fimbria-mediated attachment may be essential for insect colonization. Indeed, in the insect mouthparts the bacteria are attached in ordered arrays, indicating specific and polarized adhesion23. In addition, fimbriae are thought to be involved in both plant–bacterium and bacterium–bacterium interactions during colonization of the xylem itself. We identified 26 genes encoding proteins responsible for the biogenesis and function of Type 4 fimbria filaments. This type of fimbria is found at the poles of a wide range of bacterial pathogens where they act to mediate adhesion and translocation along epithelial surfaces24. The genes include pilS and pilR homologues, which encode a two-component

Movement between individual xylem vessels is crucial for effective colonization by X. fastidiosa. For this to occur, degradation of the pit membrane of the xylem vessel is required. Of the known pectolytic enzymes capable of this function, a polygalacturonase precursor and a cellulase were identified, although the former contains an authentic frameshift. These genes exhibited highest similarity to orthologues in Ralstonia solanacearum—which causes wilt disease in tomatoes—where the polygalacturonase genes are required for wild-type virulence.

NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

Toxicity We identified five haemolysin-like genes: haemolysin III (XF0175), which belongs to an uncharacterized protein family, and four others (XF0668, XF1011, XF2407, XF2759) which belong to the RTX toxin family that contains tandemly repeated glycine-rich nonapeptide motifs at the C-terminal domain. One of these ORFs is closely related to bacteriocin, an RTX toxin also found in the plant bacterium Rhizobium leguminosarum30. RTX or RTX-like proteins are important virulence factors widely distributed among Gramnegative pathogenic bacteria31. There are two Colicin-V-like precursor proteins. Colicin V is an antibacterial polypeptide toxin produced by E. coli, which acts against closely related sensitive bacteria32. The precursors consist of 102-amino-acid peptides (XF0262, XF0263) that have the typical conserved leader 15-amino-acid motif, and have some similarity with Colicin V from E. coli at the remaining C-terminal portion. The necessary apparatus for Colicin biosynthesis and secretion is also present. Interestingly, in E. coli most of the genes necessary for biogenesis and export of Colicin V are in a gene cluster present in a plasmid, whereas in X. fastidiosa these genes are dispersed in the chromosome. We found four genes that may function in polyketide biogenesis: polyketide synthase (PKS), pteridine-dependent deoxygenase, daunorubicin C-13 ketoreductase and a NonF-related protein. These genes belong to the synthesis pathways of frenolicin, rapamycin, daunorubicin and nonactin, respectively. These pathways

© 2000 Macmillan Magazines Ltd

153

articles

Figure 2 A comprehensive view of the biochemical processes involved in Xylella fastidiosa pathogenicity and survival in the host xylem. The principal functional categories are shown in bold, and the bacterial genes and gene products related to that function are arranged within the coloured section containing the bold heading. Transporters are indicated as follows: cylinders, channels; ovals, secondary carriers, including the MFS family; paired dumbbells, secondary carriers for drug extrusion; triple dumbbells, ABC transporters; bulb-like icon, F-type ATP synthase; squares, other transporters. Icons with two arrows 154

represent symporters and antiporters (H+ or Na+ porters, unless noted otherwise). 2,5DDOL, 2,5-dichloro-2,5-cyclohexadiene-1,4-dol; EPS, exopolysaccharides; MATE, multi-antimicrobial extrusion family of transporters multidrug efflux gene (XF2686); MFS, major facilitator superfamily of transporters; Pbp, b-lactamase-like penicillin-binding protein (XF1621); RND, resistance-nodulation-cell division superfamily of transporters; ROS, reactive oxygen species.

© 2000 Macmillan Magazines Ltd

NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

articles include many more enzymes, which we did not find; however, some of the genes listed lie close to ORFs without significant database matches, suggesting that at least one (as yet undiscovered) polyketide pathway may be functional.

Prophages Bacteriophages can mediate the evolution and transfer of virulence factors and occasional acquisition of new traits by the bacterial host. Because as much as 7% of the X. fastidiosa genome sequenced corresponds to double-stranded (ds) DNA phage sequences, mostly from the Lambda group, we suspect that this route may have been of particular importance for this bacterium. It is noteworthy that a very high percentage of phage-related sequences has also been detected in a second vascular-restricted plant pathogen, Spiroplasma citri33. We identified four regions, with a high density of ORFs homologous to phage sequences, that we considered to be prophages, in addition to isolated phage sequences dispersed throughout the genome. Two of these prophages (each ,42 kbp, designated XfP1 and XfP2) are similar to each other, lie in opposite orientations in distinct regions and appear to belong to the dsDNA, tailed-phage group. Both appear to contain most of the genes responsible for particle assembly, although we know of no reports of phage particle release from X. fastidiosa cultures. In prophage XfP1, we found two ORFs between tail genes V and W that are similar to ORF118 and vapA from the virulence-associated region of the animal pathogen Dichelobacter nodosus, which by homology encode a killer and a suppressor protein34. Interestingly, in prophage XfP2, we found two other ORFs also between tail genes V and W that are similar to hypothetical ORFs of Ralstonia eutropha transposon Tn4371 (ref. 35). The other two identified prophages, XfP3 and XfP4, are also similar in sequence to each other and to the H. influenzae cryptic prophage fflu (ref. 36). They both contain a 14,317-bp exact repeat. Few particle-assembly genes were found in these regions, suggesting that these prophages are defective. An ORF similar to hicB from H. influenzae, a component of the major pilus gene cluster in some isolates, was found in XfP4 (ref. 37). The presence of virulence-associated genes from other organisms within the prophage sequences is strong evidence for a direct role for bacteriophage-mediated horizontal gene transfer in the definition of the bacterial phenotype.

Absence of avirulence genes Phytopathogenic bacteria generally have a limited host range, often confined to members of a single species or genus. This specificity is defined by the products of the so-called avirulence (avr) genes present in the pathogen, which are injected directly into host cells, on infection, through a type III secretory system38–40. BLAST41 searches with all known avr and type III secretory system sequences failed to identify genes encoding proteins with significant similarities in the genome of X. fastidiosa. Although the variability of avr genes amongst bacteria might account for this apparent lack, the high level of similarity of some components of the type III secretory system argues against this. We suspect that these genes are, in fact, not required because of the insect-mediated transmission and vascular restriction of the bacterium that obviates the necessity of host cell infection. Furthermore, if the differing host ranges of X. fastidiosa are molecularly defined, this may be by a quite different mechanism not involving avr proteins.

Conclusions Before the elucidation of its complete genome sequence, very little was known of the molecular mechanisms of X. fastidiosa pathogenicity. Indeed, this bacterium was probably the least characterized of all organisms that have been fully sequenced. Our complete genetic analysis has determined not only the basic metabolic and replicative characteristics of the bacterium, but also a number of potential NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

pathogenicity mechanisms. Some of these have not previously been postulated to occur in phytopathogens, providing new insights into the generality of these processes. Indeed, the availability of this first complete plant pathogen genome sequence will now allow the initiation of the detailed comparison of animal and plant pathogens at the whole-genome level. In addition, the information contained in the sequence should provide the basis for an accelerated and rational experimental dissection of the interactions between X. fastidiosa and its hosts that might lead to fresh insights into M potential approaches to the control of CVC.

Methods The sequencing and analysis in this project were carried out by a network of 34 biology laboratories and one bioinformatics centre. The network is called the Organization for Nucleotide Sequencing and Analysis (ONSA)42, and is entirely located in the state of Sa˜o Paulo, Brazil.

Sequencing and assembly The sequence was generated using a combination of ordered cosmid and shotgun strategies43. A cosmid library was constructed, providing roughly 15-fold genome coverage, containing 1,056 clones with average insert size of 40 kilobases (kb). High-density colony filters of the library were made, and a physical map of the genome was constructed using a strategy of hybridization without replacement44. A total of 113 cosmid clones was selected for sequencing on the basis of the hybridization map and end-sequence analysis. The cosmid sequences were assembled into 15 contigs covering 90% of the genome. Additionally, shotgun libraries with different insert sizes (0.8–2.0 kb and 2.0–4.5 kb) were constructed from nebulized or restricted genomic DNA cloned into plasmids, and sequenced to achieve a 3.74-fold coverage of high-quality sequence (29,140 reads). Most of the sequencing was performed with BigDye terminators on ABI Prism 377 DNA sequencers. Cosmid and shotgun sequences were assembled into six contigs. We identified sequence gaps by linking information from forward and reverse reads, and closed either by primer walking or insert subcloning. The remaining physical gaps were closed by combinatorial PCR and by lambda clones selected from a lDash library by end-sequencing. The collinearity between the genome and the obtained sequence was confirmed by digestion of genomic DNA with AscI, NotI, SfiI, SmiI and SrfI, followed by comparison of the digestion pattern with the electronic digestion of the generated sequence. In addition, sequences from both ends of most cosmid clones and 236 l clones were used to confirm the orientation and integrity of the contigs. The sequence was assembled using phred+phrap+consed45. All consensus bases have quality with Phred value of at least 20. There are no unexplained high quality discrepancies, each consensus base is confirmed by at least one read from each strand, and the overall error estimate is less than 1 in every 10,000 bases.

ORF prediction and annotation ORFs were determined using glimmer 2.0 (ref. 46) and the glimmer post-processor RBSfinder (S. L. Salzberg, personal communication). A few ORFs were found by hand guided by BLAST41 results. Annotation was carried out in a cooperative way, mostly by comparison with sequences in public databases, using BLAST41 and tRNAscan-SE (ref. 47) and was based on the functional categories for E. coli48. Only one tmRNA was located (K. Williams, personal communication). To help annotate transport proteins, we built a custom BLAST41 database using sequences from http://www-biology.ucsd.edu/,msaier/ transport/toc.html and compared our ORFs with these sequences. Phylogenetic trees for conserved COGs12 were built using ClustalX49 for multiple alignment and Phylip50. Paralogous gene families (Table 2) were determined using BLASTX with the E-value cut-off equal to e-5 and such that at least 60% of the query sequence and at least 30% of the subject sequence were aligned. Received 24 March; accepted 24 May 2000. 1. Rosseti, V. et al. Pre´sence de bacte´ries dans le xyle`me d’orangers atteints de chlorose varie´ge´e, une nouvelle maladie des agrumes au Bre´sil. C. R. Acad. Sci. Paris se´rie III, 310, 345–349 (1990). 2. Chang, C. J. et al. Culture and serological detection of the xylem-limited bacterium causing citrus variegated chlorosis and its identification as a strain of Xylella fastidiosa. Curr. Microbiol. 27, 137–142 (1993). 3. Roberto, S. R., Coutinho, A., De Lima, J. E. O., Miranda, V. S. & Carlos, E. F. Transmissa˜o de Xylella fastidiosa pelas cigarrinhas Dilobopterus costalimai, Acrogonia terminalis e Oncometopia facialis em citros. Fitopatol. Bras. 21, 517–518 (1996). 4. Purcell, A. H. & Hopkins, D. L. Fastidious xylem-limited bacterial plant pathogens. Annu. Rev. Phytopathol. 34, 131–151 (1996). 5. Li, W. B. et al. A triply cloned strain of Xylella fastidiosa multiplies and induces symptoms of citrus variegated chlorosis in sweet orange. Curr. Microbiol. 39, 106–108 (1999). 6. Ye, F., Renaudin, J., Bove´, J. M. & Laigret, F. Cloning and sequencing of the replication origin (oriC) of the Spiroplasma citri chromosome and construction of autonomously replicating artificial plasmids. Curr. Microbiol. 29, 23–29 (1994). 7. Francino, M. P. & Ochman, H. Strand asymmetries in DNA evolution. Trends Genet. 13, 240–245 (1997). 8. Nelson, K. E. et al. Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 399, 323–329 (1999).

© 2000 Macmillan Magazines Ltd

155

articles 9. White, O. et al. Genome sequence of the radioresistant bacterium Deinococcus radiodurans R1. Science 286, 1571–1577 (1999). 10. Tettelin, H. et al. Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science 287, 1809–1815 (2000). 11. Katz, M. E., Strugnell, R. A. & Rood, J. I. Molecular characterization of a genomic region associated with virulence in Dichelobacter nodosus. Infect. Immun. 60, 4586–4592 (1992). 12. Tatusov, R. L., Galperin, M. Y., Natale, D. A. & Koonin, E. V. The COG database: a tool for genome scale analysis of protein functions and evolution. Nucleic Acids Res. 28, 33–36 (2000). 13. Preston, G. M., Haubold, B. & Rainey, P. B. Bacterial genomics and adaptation to life on plants: implications for the evolution of pathogenicity and symbiosis. Curr. Opin. Microbiol. 1, 589–597 (1998). 14. Knoop, V., Kloska, S. & Brennicke, A. On the identification of group II introns in nucleotide sequence data. J. Mol. Biol. 242, 389–396 (1994). 15. Ferat, J. L. & Michel, F. Group II self-splicing introns in bacteria. Nature 364, 358–361 (1993). 16. Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1474 (1997). 17. Bridges, B. A. DNA repair: Polymerases for passing lesions. Curr. Biol. 9, R475–R477 (1999). 18. Brodbeck, B. V., Andersen, P. C. & Mizell, R. F. Effects of total dietary nitrogen and nitrogen form on the development of xylophagous leafhoppers. Arch. Insect Biochem. Physiol. 42, 37–50 (1999). 19. Paulsen, I. T., Sliwinski, M. K. & Saier, M. H. J. Microbial genome analyses: global comparisons of transport capabilities based on phylogenies, bioenergetics and substrate specificities. J. Mol. Biol. 277, 573–592 (1998). 20. Chagas, C. M., Rossetti, V. & Beretta, M. J. G. Electron-microscopy studies of a xylem-limited bacterium in sweet orange affected with citrus variegated chlorosis disease in Brazil. J. Phytopathol. 134, 306–312 (1992). 21. Tang, J. L. et al. Genetic and molecular analysis of a cluster of rpf genes involved in positive regulation of synthesis of extracellular enzymes and polysaccharide in Xanthomonas campestris pathovar campestris. Mol. Gen. Genet. 226, 409–417 (1991). 22. Raju, C. B. & Wells, J. M. Diseases caused by fastidious xylem-limited bacteria. Plant Disease 70, 182– 186 (1986). 23. Brlansky, R. H., Timmer, I. W., French, W. J. & McCoy, R. E. Colonization of the sharpshooter vectors, Oncometopia igricans and Homalodisca coagulata, by xylem-limited bacteria. Phytopathology 73, 530– 535 (1983). 24. Fernandez, L. A. & Berenguer, J. Secretion and assembly of regular surface structures in Gram-negative bacteria. FEMS Microbiol. Rev. 24, 21–44 (2000). 25. Soto, G. E. & Hultgren, S. J. Bacterial adhesins: common themes and variations in architecture and assembly. J. Bacteriol. 181, 1059–1071 (1999). 26. Geme, J. W. Molecular determinants of the interaction between Haemophilus influenzae and human cells. Am. J. Respir. Crit. Care Med. 154, S192–S196 (1996). 27. Cope, L. D. et al. Characterization of the Moraxella catarrhalis uspA1 and uspA2 genes and their encoded products. J. Bacteriol. 181, 4026–4034 (1999). 28. Henderson, I. R., Navarro-Garcia, F. & Nataro, J. P. The great escape: structure and function of the autotransporter proteins. Trends Microbiol. 6, 370–378 (1998). 29. Rahme, L. G. et al. Common virulence factors for bacterial pathogenicity in plants and animals. Science 268, 1899–1902 (1995). 30. Oresnik, I. J., Twelker, S. & Hynes, M. F. Cloning and characterization of a Rhizobium leguminosarum gene encoding a bacteriocin with similarities to RTX toxins. Appl. Environ. Microbiol. 65, 2833–2840 (1999). 31. Lally, E. T., Hill, R. B., Kieba, I. R. & Korostoff, J. The interaction between RTX toxins and target cells. Trends Microbiol. 7, 356–361 (1999). 32. Havarstein, L. S., Holo, H. & Nes, I. F. The leader peptide of colicin V shares consensus sequences with leader peptides that are common among peptide bacteriocins produced by gram-positive bacteria. Microbiology 140, 2383–2389 (1994). 33. Ye, F. et al. A physical and genetic map of the Spiroplasma citri genome. Nucleic Acids Res. 20, 1559– 1565 (1992). 34. Billington, S. J., Johnston, J. L. & Rood, J. I. Virulence regions and virulence factors of the ovine footrot

pathogen, Dichelobacter nodosus. FEMS Microbiol. Lett. 145, 147–156 (1996). 35. Merlin, C., Springael, D. & Toussaint, A. Tn4371: A modular structure encoding a phage-like integrase, a Pseudomonas-like catabolic pathway, and RP4/Ti-like transfer functions. Plasmid 41, 40– 54 (1999). 36. Hendrix, R. W., Smith, M. C., Burns, R. N., Ford, M. E. & Hatfull, G. F. Evolutionary relationships among diverse bacteriophages and prophages: All the world’s a phage. Proc. Natl Acad. Sci. USA 96, 2192–2197 (1999). 37. Mhlanga-Mutangadura, T., Morlin, G., Smith, A. L., Eisenstark, A. & Golomb, M. Evolution of the major pilus gene cluster of Haemophilus influenzae. J. Bacteriol. 180, 4693–4703 (1998). 38. Alfano, J. R. & Collmer, A. The type III (Hrp) secretion pathway of plant pathogenic bacteria: trafficking harpins, Avr proteins, and death. J. Bacteriol. 179, 5655–5662 (1997). 39. Galan, J. E. & Collmer, A. Type III secretion machines: bacterial devices for protein delivery into host cells. Science 284, 1322–1328 (1999). 40. Young, G. M., Schmiel, D. H. & Miller, V. L. A new pathway for the secretion of virulence factors by bacteria: the flagellar export apparatus functions as a protein-secretion system. Proc. Natl Acad. Sci. USA 96, 6456–6461 (1999). 41. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997). 42. Simpson, A. J. & Perez, J. F. ONSA, the Sa˜o Paulo Virtual Genomics Institute. Nature Biotechnol. 16, 795–796 (1998). 43. Fleischmann, R. D. et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496–512 (1995). 44. Hoheisel, J. D. et al. High resolution cosmid and P1 maps spanning the 14 Mb genome of the fission yeast S. pombe. Cell 73, 109–120 (1993). 45. Gordon, D., Abajian, C. & Green, P. Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202 (1998). 46. Delcher, A. L., Harmon, D., Kasif, S., White, O. & Salzberg, S. L. Improved microbial gene identification with GLIMMER. Nucleic Acids Res. 27, 4636–4641 (1999). 47. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequences. Nucleic Acids Res. 25, 955–964 (1997). 48. Riley, M. Functions of the gene products of Escherichia coli. Microbiol. Rev. 57, 862–952 (1993). 49. Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882 (1997). 50. Felsenstein, J. PHYLIP—Phylogeny Inference Package (Version 3. 2). Cladistics 5, 164–166 (1989).

Supplementary information is available on Nature’s World-Wide Web site (http://www.nature.com), or on the author’s World-Wide Web site (http://www.lbi.ic.unicamp.br/xf) or as paper copy from the London editorial office of Nature.

Acknowledgements The consortium is indebted to J. F. Perez, Scientific Director of Fundac¸a˜o de Amparo a` Pesquisa do Estado de Sa˜o Paulo (FAPESP), for his strategic vision in creating and nurturing this project as well as C. A. de Pian and Juc¸ara Parra for their administrative coordination. We thank our Steering Committee: S. Oliver, A. Goffeau, J. Sgouros, A. C. M. Paiva and J. L. Azevedo for their critical accompaniment of the work. We also thank R. Fulton and P. Minx for their timely contribution and advice. Project funding was from FAPESP, the RHAE programme of the Conselho Nacional de Desenvolvimento Cientı´fico e Tecnolo´gico (CNPq) and Fundecitrus. For the full list of individuals who contributed to the completion of this project see (http://www.lbi. ic.unicamp.br/xf). Correspondence and requests for materials should be addressed to J.C.S. (e-mail: [email protected]). The sequence has been deposited in GenBank with accession numbers AE003849 (chromosome), AE003850 (pXF1.3) and AE003851 (pXF51).

Authors: A. J. G. Simpson1, F.C. Reinach2, P. Arruda3, F. A. Abreu4, M. Acencio5, R. Alvarenga2, L. M. C. Alves6, J. E. Araya7, G. S. Baia2, C. S. Baptista8, M. H. Barros8, E. D. Bonaccorsi2, S. Bordin9, J. M. Bove´10, M. R. S. Briones7, M. R. P. Bueno11, A. A. Camargo1, L. E. A. Camargo12, D. M. Carraro12, H. Carrer12, N. B. Colauto13, C. Colombo14, F. F. Costa9, M. C. R. Costa15, C. M. Costa-Neto16, L. L. Coutinho12, M. Cristofani17, E. Dias-Neto1, C. Docena2, H. El-Dorry2, A. P. Facincani6, A. J. S. Ferreira2, V. C. A. Ferreira18, J. A. Ferro6, J. S. Fraga4, S. C. Franc¸a19, M. C. Franco20, M. Frohme21, L. R. Furlan22, M. Garnier10, G. H. Goldman23, M. H. S. Goldman24, S. L. Gomes2, A. Gruber4, P. L. Ho25, J. D. Hoheisel21, M. L. Junqueira26, E. L. Kemper3, J. P. Kitajima27, J. E. Krieger26, E. E. Kuramae28, F. Laigret10, M. R. Lambais12, L. C. C. Leite25, E. G. M. Lemos6, M. V. F. Lemos29, S. A. Lopes19, C. R. Lopes13, J. A. Machado30†, M. A. Machado17, A. M. B. N. Madeira4, H. M. F. Madeira12†, C. L. Marino13, M. V. Marques8, E. A. L. Martins25, E. M. F. Martins18, A. Y. Matsukuma2, C. F. M. Menck8, E. C. Miracca5, C. Y. Miyaki11, C. B. Monteiro-Vitorello12, D. H. Moon20, M. A. Nagai5, A. L. T. O. Nascimento25, L. E. S. Netto11, A. Nhani Jr6, F. G. Nobrega8†, L. R. Nunes31, M. A. Oliveira32, M. C. de Oliveira33, R. C. de Oliveira31, D. A. Palmieri13, A. Paris13, B. R. Peixoto2, G. A. G. Pereira32, H. A. Pereira Jr6, J. B. Pesquero16, R. B. Quaggio2, P. G. Roberto19, V. Rodrigues34, A. J. de M. Rosa34, V. E. de Rosa Jr28, R. G. de Sa´34, R. V. Santelli2, H. E. Sawasaki14, A. C. R. da Silva2, A. M. da Silva2, F. R. da Silva3,27, W. A. Silva Jr15, J. F. da Silveira7, M. L. Z. Silvestri2, W. J. Siqueira14, A. A. de Souza17, A. P. de Souza3, M. F. Terenzi23, D. Truffi12, S. M. Tsai20, M. H. Tsuhako18, H. Vallada35, M. A. Van Sluys33, S. Verjovski-Almeida2, A. L. Vettore3, M. A. Zago15, M. Zatz11, J. Meidanis27 & J. C. Setubal27. Addresses: 1, Instituto Ludwig de Pesquisa sobre o Caˆncer, Rua Prof. Antonio Prudente,109 – 4o andar, 01509-010, Sa˜o Paulo - SP, Brazil; 2, Departamento de Bioquı´ mica, Instituto de Quı´ mica, Universidade de Sa˜o Paulo, Av. Prof. Lineu Prestes, 748, 05508-900, Sa˜o Paulo SP, Brazil; 3, Centro de Biologia Molecular e Engenharia Gene´tica, Universidade Estadual de Campinas, Caixa Postal 6010,13083-970, Campinas – SP, Brazil; 4, Laborato´rio de Biologia Molecular, Departamento de Patologia, Faculdade de Medicina Veterina´ria e Zootecnia, Universidade de Sa˜o Paulo, Av. Prof. Dr. Orlando Marques de Paiva, 87, 05508-000, Sa˜o Paulo – SP, Brazil; 5, Disciplina de Oncologia, Departamento de Radiologia, Faculdade de Medicina, Universidade de Sa˜o Paulo, Av. Dr. Arnaldo, 455, 01296-903, Sa˜o Paulo – SP, Brazil; 6, Departamento de Tecnologia, Faculdade de Cieˆncias Agra´rias e Veterina´rias de Jaboticabal, Universidade Estadual Paulista, Via 156

© 2000 Macmillan Magazines Ltd

NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

articles de Acesso Prof. Paulo D. Castellane s/n, Km 5,14884-900, Jaboticabal – SP, Brazil; 7, Departamento de Microbiologia, Imunologia e Parasitologia, Escola Paulista de Medicina, Universidade Federal de Sa˜o Paulo, Rua Botucatu, 862, 04023-062, Sa˜o Paulo – SP, Brazil; 8, Departamento de Microbiologia, Instituto de Cieˆncias Biome´dicas, Universidade de Sa˜o Paulo, Av. Prof. Lineu Prestes,1374, 05508-900, Sa˜o Paulo – SP, Brazil; 9, Hemocentro, Faculdade de Cieˆncias Me´dicas, Universidade Estadual de Campinas,13083-97, Campinas – SP, Brazil; 10, Institut National de la Recherche Agronomique et Universite´ Victor Se´galen Bordeaux 2, 71 Avenue Edouard Bourleaux, Laboratoire de Biologie Cellulaire et Mole´culaire, 33883 Vilenave d’Ornon Cedex, France; 11, Departamento de Biologia, Instituto de Biocieˆncias, Universidade de Sa˜o Paulo, Rua do Mata˜o, 277, 05508-900, Sa˜o Paulo – SP, Brazil; 12, Escola Superior de Agricultura Luiz de Queiroz, Universidade de Sa˜o Paulo, Av. Pa´dua Dias,11,13418-900, Piracicaba – SP, Brazil; 13, Departamento de Gene´tica, Instituto de Biocieˆncias, Universidade Estadual Paulista, Distrito de Rubia˜o Junior,18618-000, Botucatu – SP, Brazil; 14, Centro de Gene´tica, Biologia Molecular e Fitoquı´ mica, Instituto Agronoˆmico de Campinas, Av. Bara˜o de Itapura,1481, Caixa Postal 28,13001-970, Campinas – SP, Brazil; 15, Departamento de Clı´ nica Me´dica, Faculdade de Medicina de Ribeira˜o Preto, Universidade de Sa˜o Paulo, Av. Bandeirantes, 3900,14049-900, Ribeira˜o Preto – SP, Brazil; 16, Departamento de Biofı´ sica, Escola Paulista de Medicina, Universidade Federal de Sa˜o Paulo, Rua Botucatu, 862, 04023-062, Sa˜o Paulo – SP, Brazil; 17, Centro de Citricultura Sylvio Moreira, Instituto Agronoˆmico de Campinas, Caixa Postal 04,13490-970, Cordeiro´polis – SP, Brazil; 18, Laborato´rio de Bioquı´ mica Fitopatolo´gica e Laborato´rio de Imunologia, Instituto Biolo´gico, Av. Cons. Rodrigues Alves,1252, 04014-002, Sa˜o Paulo – SP, Brazil; 19, Departamento de Biotecnologia de Plantas Medicinais, Universidade de Ribeira˜o Preto, Av. Costa´bile Romano, 2201,14096-380, Ribeira˜o Preto – SP, Brazil; 20,Centro de Energia Nuclear na Agricultura, Universidade de Sa˜o Paulo, Av. Centena´rio, 303, Caixa Postal 96,13400-970, Piracicaba – SP, Brazil; 21, Funktionelle Genomanalyse, Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 506, D-69120, Heidelberg, Germany; 22, Departamento de Melhoramento e Nutric¸a˜o Animal, Faculdade de Medicina Veterina´ria e Zootecnia, Universidade Estadual Paulista, Fazenda Lageado, Caixa Postal 560,18600-000, Botucatu - SP, Brazil; 23, Departamento de Cieˆncias Farmaceˆuticas , Faculdade de Cieˆncias Farmaceˆuticas de Ribeira˜o Preto, Universidade de Sa˜o Paulo, Av. do Cafe´ s/n,14040-903, Ribeira˜o Preto – SP, Brazil; 24, Departamento de Biologia, Faculdade de Filosofia, Cieˆncias e Letras de Ribeira˜o Preto, Universidade de Sa˜o Paulo, Av. Bandeirantes, 3900,14040-901, Ribeira˜o Preto – SP, Brazil; 25, Centro de Biotecnologia, Instituto Butantan, Av. Vital Brasil,1500, 05503-900, Sa˜o Paulo – SP, Brazil; 26, Laborato´rio de Gene´tica e Cardiologia Molecular/LIM 13, Instituto do Corac¸a˜o (InCor), Faculdade de Medicina, Universidade de Sa˜o Paulo, Av. Dr. Ene´as de Carvalho Aguiar, 44, 05403-000, Sa˜o Paulo – SP, Brazil; 27, Instituto de Computac¸a˜o, Universidade Estadual de Campinas, Caixa Postal 6176,13083-970, Campinas – SP, Brazil; 28, Departamento de Produc¸a˜o Vegetal, Faculdade de Cieˆncias Agronoˆmicas, Universidade Estadual Paulista, Fazenda Lageado, Caixa Postal 237,18603-970, Botucatu – SP, Brazil; 29, Departamento de Biologia Aplicada a` Agropecua´ria, Faculdade de Cieˆncias Agra´rias e Veterina´rias de Jaboticabal, Universidade Estadual Paulista, Via de Acesso Prof. Paulo Donato Castellane,14884-900, Jaboticabal – SP, Brazil; 30, Hospital do Caˆncer – A.C. Camargo, R. Antonio Prudente, 211, 01509-010, Sa˜o Paulo – SP, Brazil; 31, Nu´cleo Integrado de Biotecnologia, Universidade de Mogi das Cruzes, Av. Dr. Caˆndido Xavier de Almeida Souza, 200, 08780-911, Mogi das Cruzes – SP, Brazil; 32, Departamento de Gene´tica e Evoluc¸a˜o, Instituto de Biologia, Universidade Estadual de Campinas, Caixa Postal 6010,13083-970, Campinas– SP, Brazil; 33, Departamento de Botaˆnica, Instituto de Biocieˆncias, Universidade de Sa˜o Paulo, Rua do Mata˜o, 277, 05508-900, Sa˜o Paulo - SP, Brazil; 34, Departamento de Parasitologia, Microbiologia e Imunologia, Faculdade de Medicina de Ribeira˜o Preto, Universidade de Sa˜o Paulo, Av. Bandeirantes, 3900,14049-900, Ribeira˜o Preto – SP, Brazil; 35, Departamento de Psiquiatria, Instituto de Psiquiatria, Faculdade de Medicina, Universidade de Sa˜o Paulo, Rua Dr. Ovı´ deo Pires de Campos s/n, Sala 4051 (3o. andar), 05403-010, Sa˜o Paulo – SP, Brazil. † Present addresses: Novartis Seeds LTDA, Av. Prof. Vicente Rao, 90, 04706-900, Sa˜o Paulo – SP, Brazil (J. A. Machado); Centro de Cieˆncias Agra´rias e Ambientais, Pontifı´cia Universidade Cato´lica do Parana´, BR-376, Km 14, Caixa Postal 129, 83010-500, Sa˜o Jose´ dos Pinhais – PR, Brazil (H. M. F. Madeira); Instituto de Pesquisa e Desenvolvimento, Universidade do Vale do Paraı´ba, Av. Shishimi Hifumi, 2911, 12244-000, Sa˜o Jose´ dos Campos – SP, Brazil (F. G. Nobrega).

NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

© 2000 Macmillan Magazines Ltd

157

270001

1

260

140

261 263 262

139

987

988

851

714

553

557

558

992

1530

1531

1402

853

559 560 XfP1

XfP4

726

561

1406

1266

1126

994

727

1535

860

2134

1979 1980

1

1

2

2

2687 2688

3

1285

2552

1981

2554

4 5 6

2689

2553

2418

7

2690

2555

2691

2419

2273 2275 2274

2138 2139

1269

9

10

569

1002

1133 1136 1134 1135

1001

2694 2695

2557

1988 1989

1851

1278

1137

12

1003

1990 1991

2696

2697 2698

2558

2426

2427

2285 2284

2428

2286

2699

2700

2559 2561 2562 2560

2283

1279

13

14

15

16

1855

1995

1856

17

1550

18

2701 2703 2702

1425

2571

2432

422

879

1427

1288

1724 1725 1726

2000 2002 2004 2001 2003

592

882

752

35

425

883

754

593

753

424

288 289

165

34

1727

1553

428

167

1554

1728

41

169

40

429

291

168

37 38 39

757

1025

886

26 27 29 30 28

2707 2708

2435 2436 2437

32 33 34 35

2709

36

2438

2575 2576 2577 2578

31

2434

2579

2302

1298

1730 1732 1731

37

38 39

2710

R 2581

44

294

1027 1028

1443

1735 1736 1737

40

2583

41

42

43

2311

44

1030

LR-7

1448 1449

2023

2024

2025

1889 1890

45 46

2714

2590

2448

47

48

2592

2450

49 50 51 52 53

2717 2720 2718 2719

2591

2449

tmRNA 2715 2716

2446 2447

54

55

2721

2593

2451

1749

1584 1586 1585

1452

890

1897

2722

59

60

2723

2596

61

2724

2597

62

1314

1182

2328

63

REPEAT-1

2598

64

2599

1455

1183

896

63

1187

1461

1462

621

51158

2725

2041 2042

2196

LR-7

2043

1904

2726

2602

2728

2605 2604

2727

2603

1763

2340

1 kb

1190 V D

1047

904 905

2608

2053

1914

1768

1609

2734

1474

2349

2206

472

796

1476

1477

1774 1775

2208

2060

2209

75

204

76 77

473

2061

1062

920

476 477

799

801

1063 1064

921 922 923

800

1924

2483 2485 2484

2742 2743 2745 2746 2744

2624

2486

2358

1927

2360

2361

1624 1625

A

2625

1931

2219

2628

2220

2629

928

1213

Frameshift/Point mutation (FSPM) Intron tRNA (letter indicates amino acid)

347

8485

212

86

87

348 351 349 350

213 214

929

1487

1351 1352

2498

1488

1353

810

933

657

88

352

215216 217 A

89

2083

2224

1078

1633 1634 1637 1635 1636

2500

2084

356

1223

820

1363

1497

1641

1499

1949

2385

2235

2092

2640 T Y G

2774

2641 26422643

824

106

825

673

232

368

M

107 109 110 108

1501

1502 1503

1372

1231

2236

2093

1807

2386

2647

2775

LR-7

2524

2237

953

826

955

827

954

674

676 677V

2238

1232

833

680

1234 K

1508

1106

963 964

834

681 682

118

965

966

835

683

523

1380

1381

374

1241

1382

1953

1814

2241 2242

2100

2526 2527

2776 2777

2651

2396

2244

2532 2533 2534

2397

2246 2247 2245

838

839

2250

2252 2253 2254

2111 2113 2112 2114

1962 1964 1963

1822

2251

1961

2536 2537

2538

378 380 379

2778

2779

2780

2781 2782

381

532 534 533 XfP1

1111

974

1112

975

841

1517

1384

1518

LR-3

382

535 536

2115

537

383

2677

2544

1386 1388 1387

1114

844

385 386 387

1389

1115

2258

2678

2545

1829

1830

2546

2408

2121

2679

2259

979 980

1527

1831 1832

2680

2547

2260

2548

2681

2549

1688

1528

1976

2550

2411

1529

391 392

1977

2551

2412 2413

2685

1692

1080000

945000

810000

675000

1215000

1350000

1755000

1890000

2025000

2687

2565000

2430000

2295000

2160000

2134

2552

2414

1485000

1620000

1979

2268

2686

2133

2267

405000

540000

1398

1258 1259

1978

135000

270000

L 1119 1120

985 986

848

712 713

550

1836 1837 1838

1689 1691 1690

1835

2262 2265 2266 2263 2264

2682 2683 2684

2409 2410

2261

1118

847

711

134

256 257 258

1392 1393 1394 1397 1395 1396

2122 2123 2125 2127 2128 2131 2124 2126 2129 2132 2130

1975

1833

984

390

255

1254 1255 1257 1256

1117

1834

1391

1253

710

549

982 981 983

846

707 709 708

1116

1390

706

1680 1681 1682 1683 1685 1687 1684 1686

978

845

133

254

388 389

253

132

540 542 544 546 548 541 543 545 547

1252

705

539

384

252

128 129 131 130

1969 1970 19711972 1973 1974

2120

1828

1967 1968

LR-6LR-6

2407

538

977

843

1676 1677 1679 1678

1827 1826

2116 2118 2117 2119

1966

1824 1825

2256 2257

2542 2543 2539 2540 2541

2255

1965

1823

1385

1250 1251

1113

976

842

127

244 245 247 249 250 251 246 248

125 126 S

1519 1521 1523 1524 1525 1526 1520 1522 REPEAT-IN-XfP3-AND-XfP4 XfP4

1245 1247 1248 1246 1249

973

840

1670 1672 1674 1675 1669 1671 1673

1244

1666 16671668

1516

1383

1242 1243

1110

2398 2399 2400 2402 2404 2406 2401 2405 2403

2249

377

527 528 529 530 531

243

124

686 687 689 691 693 695 696 697 699 701 703 704 688 692 694 698 700 702 690

LR-5

1108 1109

1821

2107C 2108 2110 2109

2535

2248

1960

1819 1820

123

2652 2655 2657 2659 2660 2662 2663 2664 2676 2666 2668 2670 2671 2672 2674 2653 2656 2658 2661 2667 2669 2673 2675 2665 2654 2679305

2528 2529 2530

1959

1818

2101 2103 2105 2106 2102 2104

2531

2243

1958

1815 1817 1816

1954 1956 1957 1955

2099

1813

122

240 241 242

972 V 968 969 970 971

685

526

376

120 121

239

375

837

684

967

1107

836

524 525

1509 1510 1511 1513 1514 1512 1515

1379

119

236 237 238

1235 1237 1239 1236 1238 1240

1104 1105

1376 1378 1377

1233

1103

2390 2392 2393 2395 2391 2394

2239 2240

1507

1375

679

522

373

235

116 117

1648 1649 1650 1653 1655 1656 1658 1659 1661 1663 1664 1665 1651 1654 1657 1660 1652 1662

2095 2096 2097 2098

1812

1647

2649 2650

2525

1373 1374

832

678

521

370 371 372

957 959 960 961 962 958

830 831 829

956

828

675

1504 1505 1506

2387 2389 2388

2648

2094

1952

1809 1811 1808 1810

1642 1643 1645 1646 1644

1500

2644 Q 2645 2646

1230

369

234

111 112 114 115 113

233

512 513 515 516 518 519 514 517 520

367

230 231

951 952

511

366

671 672

1369 1370 1371

1229

LR-5

1950 1951

1804 1805 S 1806

1498

823

670

510

947 948 949 950

509

105

228 229

104

1088 1090 1091 1093 1094 1096 1097 1098 1100 1102 1089 1092 1095 1099 1101

946

822

669

227

362 363 364 365

E

103

1364 1365 1366 1368 1367

2233 2234

2091

1948

2384

1947

1803

1087

945

821

508

1225 1227 1228 1226

944

226

359 361 360

225

100 101 102

507

358

224

2507 2510 2511 2512 2514 2516 2518 2521 2522 2523 2508 2513 2515 2517 2519 2509 2520 2636 2638 W 2637 2639

2506

2376 2379 2381 2383 2377 2380 2378 2382

2232

2090

1946

943

819

LR-4

506

1640

1800 1802 1801

1496

2763 2765 2767 27692770 2772 2764 2766 2768 2771 2773

2634 2635

2505

2375

2231

1945

2089

1944

1639

1799

1224

1362

99

223

1085 1084 1086 LR-1

942

1082 1083

941

818

668

357

505

222

94 95 96 97 98

817

1361

1494 1495

1360

1081

940

20872088

2760 P HK 2761 2762 R

2633

2501 2502 2504 2503

939

816

1798

1638

1797

2086

355

2226 2227 2229 2230 2228

2085

93

221

664 665 667 666

1079 1080

1221 1222

2370 2371 2372 2373 2374 XfP2

2225

663

1357 1358 1359

1220

354

92

497 499 501 503 504 498 500 502

936 937 938

1937 1938 1940 1942 1943 1939 1941

2367 2368 2369

2223

2759

2632

2499

2366

2222

91

220

814 813 815

178517861787 1788 1790 1792 1796 T 1794 1795 1789 1793 1791 1936

496

353

1489 1490 1491 1493 1492

1354 1355 1356

90

218 219

660 661 662

811 812

659

934 935

658

1077

1217 1219 1218

1075 1076 1074

1214 1216 1215

1073

930 932 931

807 809 808

656

1628 1629 1630 1631 1632

1486

1350

655

806

1072

654

1934 1935

2365

2221

2630 2631

2497

2362 2363 2364

2218

2747 2748 2750 2752 2753 2754 2756 2758 2749 2751 2755 2757

2626 2627

805

652 653

2069 2070 2073 2075 2077 2079 2081 2082 2071 2074 2076 2078 2080 2072

1932 1933

83

211

480 481 483 485 486 487 488 490 492 494 482 484 489 491 493 495

1626 1627

1485

1348 1349

82

343 344 346 345

210

1068 1069 1071 1070

927

1211 1212

1484

1210

926

803 804

651

479 N

342 341

209

81

1781 1782 1783 1784

1928 1930 1929

1777 1779 1778 1780

1623

1480 1482 1483 1481

1622

925

1347

2487 2488 2490 2492 2494 2496 2489 2491 2493 2495

2359

802

340

1065 1066 1067

924

1345 1346

2062 2064 2066 2068 2063 2065 2067

1925 1926

1776

1621

1479

1339 1341 1344 1340 1342 1343

1620

80

205 206 207 208

79

478

339

78

647 648 650 S 642 644 645 646 643 649

475

641

798

474

2211 2213 2214 2215 2216 2217 2210 2212

Transport Mobile genetic elements Pathogenicity and adaptation Conserved hypothetical Hypothetical

2741

74

1200 1201 1202 1204 1206 1209 1203 1205 1207 1208

1478

1338

1197 1198 1199

1615 1617 1618 1616 1619

1475

2618 2619 2620 2623 2621 2622

2482

2740

919

797

2350 2352 2354 2355 2357 2351 2353 2356

2207

2059

203

73

635 636 638 640 637 639

471

914 915 917 916 918

795

634

470

1917 1919 1920 1921 1923 1918 1922

2479 2480 2481

2739

1196

1772 1773 1771

202

324 325 326 327 328 330 332 335 338 331 333 336 329 334 337

LR-3

72

201

1056 1058 1060 A 1061 1055 1057 1059 L E

1611 1612 1614 1613

1770

1916

2617

1054

71

197 198 199 200

13311332 1333 1334 1335 1337 1336

2054 2057 2055 2058 2056

Macromolecule metabolism DNA processing RNA processing Cell structure Undefined category Cellular processes

2735 2736 2737 2738

1053

910

794

633

911 912 913

632

793

323

466 467 469 468

322

1471 1473 1472

1915

2477 2478

rRNA-operon

1192 1193 1195 1194

1769

2344 2346 2348 2345 2347

2475 2476

2343

A I

193 195 196 194

631

465

321

909

1610

1470

1330

2609 2610 2611 2613 2614 2615 2616 2612

2473 2474

2341 2342

908

2201 2202 2204 2205 2203

2050 2051 2052

907

1469

1766 1767

1911 1912 1913

1765

1606 1607 1608

1468

1191

1048

906

792

629 630

464

790 791

628

463

320

192

1049 1050 1052 1051

627

462

319

190 191

786 788 787 789

624 625 626

2197 2199 2198 2200

2049

1910

70

460 461

1327 1329 1328

903

1605

Intermediary metabolism Energy metabolism Regulatory functions Biosynthesis of small molecules Amino acid biosynthesis Nucleotide biosynthesis

2733

2607

623

785

459

189

318

1465 1466 1467

2472

2046 2047 2048

2467 2469 2470 2471 2468

2339

2044 2045

2729 2731 2730 2732

2606

2466

1761 1762

1764

1464

1906 1907 1908 1905 1909

2331 2332 2333 2334 2336 2338 2335 2337

2040

2461 2463 2465 2462 2464

2600 2601

2460

2330

2038 2039

1903

1754 1755 1757 1759 1756 1758 1760

1898 1899 1900 1901 1902

1753

1189

1046

317

187 188

69

68 R

458

901 902

622

186

67

1324 1325 1326

1463

1188

1319 1321 1322 1323 1320

316

782 783 784

456 457

315

620

900

66

184 185

65

1043 1045 1044

1186

1042

G

898 899

781

455

616 618 617 619

779 780

897

183

64

312 313 314

181 182

453 454

62

1458 1459 1460

1317 1318

L

1039 1041 1040

895

61

615

452

311

180

778

1184 1185

1456 1457

1316

179

451

1038

892 893 894

614

450

310

178

58 60 59

1588 1590 1592 1594 1596 1598 1600 1601 1602 1603 1604 1589 1591 1593 1595 1597 1599

2037

2329

613

777

1315

L

1037

891

2193 2195 2194

2454 2455 2456 2457 2459 2458

2327

2186 2188 2190 2192 2187 2189 2191

2326

2595

2453

56 57 58

REPEAT-1

2594

2452

2325

2185

612

776

1750 1751 1752

1587

1453

1313

309

177

54 56 55 57

308

176

53

446 447 449 448

1454

611

445

1181

1036

1892 1893 1894 1895 1896

1746 1747 1748

1583

1451

1312

1180

1032 1035 1033 1034

1310 1311

XfP3

610

442 444 443

609

175

52

307 L 305 304 306

174

51

769 770 772 773 775 771 774

608

440 441

303

50

2027 2028 2030 2032 2034 2035 2029 2031 2033 2026 2036

1891

1743 1744 1745

173

298 299 300 302 301

1575 1576 1577 1578 1580 1582 1579 1581

1742

2177 2178 2179 2181 2183 2184 2180 2182

2022

1888

1741

1450

49

172

1178 1179

1308 1309

REPEAT-IN-XfP3-AND-XfP4

1447

1305 1306 1307

1031

889

768

439

297

47 48

767 766

604 606 607 605

438

171

46

1173 1175 1176 1177 1174

1029

1571 1572 1574 1573

1740

603

437

296

170

762 763 764 765

602

436

295

45

2312 2313 2314 2316 2318 2320 2322 2324 2315 2317 2319 2321 2323

2176

2586 2587 2588 2589

2445

2713

2584 2585

2444

2307 2308 2309 G 2310

2170 2171 2173 2174 2175 2172

1887

1739

2020 2021

1738

1444 1446 1445

1304

761

600 601

888

760

435

rRNA-operon

433 434

A I

1303

1561 1562 1564 1566 1568 1569 1570 1563 1565 1567

1733 1734

2013 2015 2016 2018 2019 2014 2017

2711 2712

2582

2443

2306

2169

2010 2012 2011

2166 2167 2168

2009

2441 2442 2440

2580

2439

LR-2

887

599

1299 1300 1301 1302

1435 1437 1438 1439 1440 14411442 1436

1555 1557 1559 1556 1560 1558 1729

2303 2305 2304

2165

2005 2006 2008 2007

293 F

43

431 432

1026

759

430

292

42

1151 1153 1155 1157 1159 1161 1163 1165 1167 1169 1171 1172 1152 1154 1156 1158 1160 1164 1166 1168 1170 1162

885

758

595 596 597 598

756

884

755

594

426 427

290

166

36

1012 1014 1015 1016 1018 1020 1022 1024 1013 1017 1019 1021 1023

880 881

591

423

287

1428 1429 1431 1434 1430 1433 1432

2296 2298 2300 2301 2297 2299

2573 2574

2433

2295

2706

33

162 164 163

LR-1

1289 1290 1291 1292 1294 12951296 1297 1293

2159 2160 2161 2164 S 2156 2157 2158 2162 2163

2572

590

749 751 750

589

286

1146 1148 1150 1147 1149

1011

878 LR-4

876 877

1551 1552

1426

1287

1145

875

588

748

285

161

32

1859 1860 1863 1865 1867 1869 1870 1871 1873 1875 1876 1877 1878 1881 1882 1885 1861 1864 1866 1868 1872 1874 1879 1883 1886 1862 1880 1884

1996 1998 1999 1997

24 25

2705

19 20 22 21 23

2704

1424

1286

1144

874

31

157 159 158 160

30

585 586 587

282 284 283

156

28 29

743 745 747 746 744 P

584

1719 1721 1723 1720 1722

1857 K 1858

2431

2563 2564 2567 2568 2570 2565 2569 2566

2430

1143

1282 1283 1284 1285

1718

873

742

281

420 421

1006 1008 1010 1007 1009

870 872 871

2151 2152 2154 2155 2153

1994

280

154 155

27

577 580 581 583 578 582 579

741

1141 1142

1549

1423

1281

1005

869

740 738 739

576

2288 2289 2290 2291 2293 2292 2294

1993

1548

1717 G

26

152 153

278 279

151

22 25 23 M 24

414 416 417 419 415 418

277

1140

1280

1004

737

413

150

868

1138 1139

736

2429

2287

2149

1992

2150

1716

1852 1853 1854

2142 2143 2144 2145 2146 2147 2148

1987

1850

1712 1713 1715 1714

Plasmid pXF1.3 Gene Map

11

276

149

18 20 21 19

570 572 573 575 571 574

412

148

864 865 866 867

Plasmid pXF51 Gene Map

REPEAT-2 REPEAT-2

8

568

411

275

731 733 734 735 732

1132

1000

863

567

409 410

1270 1272 1274 1276 1271 1273 1277 1275

1131

2277 2279 2280 2282 2278 2281

2141

1986

1849

1985

999

730

566

408

274

147

17

1411 1413 1414 1415 1416 1418 1419 1422 1410 1412 1417 1420 1421

2421 2422 2423 2425 2424

2692 2693

2556

2420

2276

2140

1984

1845 1847 1848 1846

1982 1983

1843 1844

2137

2272

1842

2135 2136

1841

2270 2271

1838 1840 1839

146

1536 1538 1539 1540 1542 1545 1546 1537 1541 1544 1547 1543

1409

1268

1127 1128 1129 1130

145

268 269 271 273 270 272

144

861 862

728 729

996 998 997

1407 1408

1267

995

407

562 563 565 564

855 856 857 858 859

1533 1534

1405

1532

1403 1404

1265

993

854 P

LR-2

267

143

402 404 405 406 403

266

142

1693 1696 1699 1700 1703 1704 1705 1706 1707 1708 1709 1711 1694 1697 1701 1710 R 1695 1698 1702

1529

1125

989 990 991

852

1260 1261 1263 1262 1264

1398 1399 1401 1400

1259

554 556 555

401

264 265

141

715 717 718 720 723 724 725 716 719 721 722

1120 1121 1122 1124 1123

986

848 849 850

713

550 551 552

2414 2416 2417 T 2415

1

2565001

2430001

2295001

259

138

392 M 393 394 396 397 398 400 395 399 Q

258

2268 2269

2160001

2025001

1890001

1755001

1620001

1485001

1350001

1215001

1080001

945001

810001

675001

540001

405001

134 136 135 137

16

3.58

15

2.44

11 12 13 14

2.96

10

2.58

9

6.13

8

2.65

7

4.41

6

3.72

135001

2.89

Main Chromosome Gene Map

1.38

5

0.83

4

0.79

3

41.87

2

10.95

1

4.92

© 2000 Macmillan Magazines Ltd 4.61

158 3.27

1

articles

NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

NATURE | VOL 406 | 13 JULY 2000 | www.nature.com

INTERMEDIARY METABOLISM/ENERGY METABOLISM, CARBON/Aerobic respiration

XF0347 D-lactate dehydrogenase (dld1) XF1802 glycerol-3-phosphate dehydrogenase (gpsA) XF2266 glycerol-3-phosphate dehydrogenase (glpD) XF0310 NADH-ubiquinone oxidoreductase, NQO1 subunit (nuoF) XF0314 NADH-ubiquinone oxidoreductase, NQO10 subunit (nuoJ) XF0315 NADH-ubiquinone oxidoreductase, NQO11 subunit (nuoK) XF0316 NADH-ubiquinone oxidoreductase, NQO12 subunit (nuoL) XF0317 NADH-ubiquinone oxidoreductase, NQO13 subunit (nuoM) XF0318 NADH-ubiquinone oxidoreductase, NQO14 subunit (nuoN) XF0309 NADH-ubiquinone oxidoreductase, NQO2 subunit (nuoE) XF0311 NADH-ubiquinone oxidoreductase, NQO3 subunit (nuoG) XF0308 NADH-ubiquinone oxidoreductase, NQO4 subunit (nuoD) XF0307 NADH-ubiquinone oxidoreductase, NQO5 subunit (nuoC) XF0306 NADH-ubiquinone oxidoreductase, NQO6 subunit (nuoB) XF0305 NADH-ubiquinone oxidoreductase, NQO7 subunit (nuoA) XF0312 NADH-ubiquinone oxidoreductase, NQO8 subunit (nuoH) XF0313 NADH-ubiquinone oxidoreductase, NQO9 subunit (nuoI)

34% 45%

37% 71% 41% 35%

XF2395 acetylxylan esterase (axeA) XF1250 arginine deaminase (rocF) XF1472 benzene 1,2-dioxygenase, ferredoxin protein (bedB) XF0840 beta-galactosidase (bga) XF0439 beta-glucosidase (bglX) XF0846 beta-mannosidase precursor XF1234 carboxyphosphonoenolpyruvate phosphonomutase (prpB) XF2210 dioxygenase XF1743 esterase (est) XF1610 fructokinase XF1740 glucose dehydrogenase B (yliI) XF1064 glucose kinase (glk) XF1460 glucose kinase (glk) XF1965 haloalkane dehalogenase (dhaA) XF2677 L-ascorbate oxidase (aao) XF1253 lipase (lipP) XF0781 lipase/esterase (estA) XF1825 NAD(P)H steroid dehydrogenase (cdh) XF1201 phosphoglycolate phosphatase (yfbT) XF2470 phosphoglycolate phosphatase (cbbZC) XF2259 polyvinylalcohol dehydrogenase XF0366 ribokinase (rbsK) XF0379 rubredoxin (rubA) XF1819 threonine dehydratase catabolic (tdcB) XF1181 triacylglycerol lipase precursor (lip) XF0610 UDP-glucose 4-epimerase (galE) XF2432 UTP-glucose-1-phosphate uridylyltransferase (gtaB)

XF1746 alcohol dehydrogenase (yahK) XF2389 alcohol dehydrogenase (yahK) XF1136 NADP-alcohol dehydrogenase XF1727 NADP-alcohol dehydrogenase (adh) XF1734 NADP-alcohol dehydrogenase

39%

41% 58%

57% 64%

76% 57% 63%

65% 54%

32% 55% 42% 33%

© 2000 Macm an Magaz nes L d

acetyl coenzyme A synthetase (acs) adenosylhomocysteinase (ahcY) carbonic anhydrase (yadF) carbonic anhydrase deoxyxylulose-5-phosphate synthase (dxs) ferredoxin-NADP reductase (fpr) glycerol kinase (glpK) glycine cleavage H protein (gcvH) glycine cleavage T protein (gcvT) glycine decarboxylase (gcvP) hydroxyacylglutathione hydrolase (gloB) inorganic pyrophosphatase (ppa) lactoylglutathione lyase (gloA) methionine adenosyltransferase (metK) tropinone reductase

61% 65% 46% 32% 55% 59% 59% 48% 55% 57% 45% 59% 63% 70% 43%

INTERMEDIARY METABOLISM/ENERGY METABOLISM, CARBON/TCA cycle

XF0292 aconitate hydratase 2 (acnB)

40%

52%

72%

XF0869 dihydrolipoamide acetyltranferase (pdhB) 52% XF0868 dihydrolipoamide dehydrogenase (lpdA/lpd) 57% XF0669 pyruvate dehydrogenase (aceE) 56%

57% 55% 67%

XF1497 3’-phosphoadenosine 5’-phosphosulfate reductase (cysH) XF1501 ATP sulfurylase, large subunit (nodQ) XF1500 ATP sulfurylase, small subunit (cysD) XF1499 NADPH-sulfite reductase, flavoprotein subunit (cysJ) XF1498 NADPH-sulfite reductase, iron-sulfur protein (cysI)

XF1063 6-phosphogluconolactonase (pgl) 35% XF1065 glucose-6-phosphate 1-dehydrogenase (zwf) 46%

INTERMEDIARY METABOLISM/ENERGY METABOLISM, CARBON/Pyruvate dehydrogenase

INTERMEDIARY METABOLISM/ENERGY METABOLISM, CARBON/Oxidative branch, pentose pathway

84% 55%

68% 60% 56% 50% 48%

31% 63% 52% 79%

52%

48%

56%

46% 49% 56% 47% 70% 54% 40% 31% 30%

54%

51%

41%

63%

56%

69%

39% 34% 41%

32%

52% 39% 52%

84% 63%

50% 51%

XF0274 6-phosphofructokinase (pfkA) XF1291 enolase (eno) XF0826 fructose-bisphosphate aldolase XF0232 glucose-6-phosphate isomerase (pgi) XF0457 glyceraldehyde-3-phosphate dehydrogenase (gapA) XF0823 phosphoglycerate kinase (pgk) XF1893 phosphoglyceromutase (gpmA/gpm) XF0824 pyruvate kinase type II (pykA) XF0303 triosephosphate isomerase (tpiA/tpi)

INTERMEDIARY METABOLISM/ENERGY METABOLISM, CARBON/Glycolysis

XF2460 c-type cytochrome biogenesis membrane protein (cycK) XF2459 c-type cytochrome biogenesis protein (cycJ) XF2462 c-type cytochrome biogenesis protein (cycL) XF0620 c-type cytochrome biogenesis protein (copper tolerance) (dsbD) XF2461 c-type cytochrome biogenesis protein/thioredoxin (dsbE/ccmG) XF1328 cytochrome B561 (yodB) XF1360 cytochrome C oxidase assembly factor (coxD) XF1389 cytochrome O ubiquinol oxidase, subunit I (cyoB) XF1390 cytochrome O ubiquinol oxidase, subunit II (cyoA) XF1388 cytochrome O ubiquinol oxidase, subunit III (cyoC) XF1387 cytochrome O ubiquinol oxidase, subunit IV (cyoD) XF0253 electron transfer flavoprotein alpha subunit (etfA) XF0254 electron transfer flavoprotein beta subunit (etfB/etfS) XF1298 electron transfer flavoprotein ubiquinone oxidoreductase (etf-QO) XF0557 electron transfer protein azurin I (az1) XF0983 ferredoxin XF1964 ferredoxin (ykgJ) XF2601 ferredoxin XF0547 ferredoxin II (ydgM) XF0053 flavohemoprotein (fhp) XF2082 oxidoreductase XF1990 thioredoxin (yneN) XF0909 ubiquinol cytochrome C oxidoreductase, cytochrome B subunit (petB) XF0910 ubiquinol cytochrome C oxidoreductase, cytochrome C1 subunit (petC) XF0908 ubiquinol cytochrome C oxidoreductase, iron-sulfur subunit (petA)

81% 78% 56% 42% 56%

60%

56%

53%

87%

63%

58%

38%

41%

43%

50%

42%

69%

45%

53%

54%

46%

38%

45%

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Sulfur metabolism

XF0609 GDP-mannose 4,6 dehydratase (gmd) XF2279 nucleotide sugar epimerase (wbnF) XF0260 phosphoglucomutase/ phosphomannomutase (xanA) XF1468 phosphomannomutase (mrsA) XF0259 phosphomannose isomerase-GDPmannose pyrophosphorylase (xanB) XF1606 UDP-glucose dehydrogenase (ugd)

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Sugar-nucleotide biosynthesis, conversions

XF2255 XF1037 XF0880 XF2095 XF2249 XF1889 XF2268 XF0181 XF0183 XF1385 XF2160 XF2171 XF1399 XF0392 XF0913

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Pool, multipurpose conversions

XF0657 alkaline phosphatase (phoA) XF0904 ATP-binding protein (ybeZ) XF2590 exopolyphosphatase (ppx) XF2591 polyphosphate kinase (ppk)

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Phosphorus compounds

XF1288 CTP synthetase (pyrG) XF1058 uridylate kinase (pyrH)

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Nucleotide interconversions

XF0208 D-ribulose-5-phosphate 3-epimerase (rpe/dod) XF2015 ribose-5-phosphate isomerase A (rpiA) XF1936 transketolase 1 (tktA/tkt)

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Non-oxidative branch, pentose pathway

XF0977 malate oxidoreductase (maeB) XF1259 phosphoenolpyruvate synthase (ppsA/pps)

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Gluconeogenesis

XF1061 2-keto-3-deoxy-6-phosphogluconate aldolase (eda/hga/kdgA) XF1062 6-phosphogluconate dehydratase (edd)

INTERMEDIARY METABOLISM/ENERGY METABOLISM, CARBON/Electron transport

INTERMEDIARY METABOLISM/ENERGY METABOLISM, CARBON/Anaerobic respiration and fermentation

47% 45%

54%

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Entner-Douderoff

XF0141 glucosamine--fructose-6-phosphate aminotransferase (glmS) XF1464 glucosamine--fructose-6phosphateaminotransferase XF2355 N-acetyl-beta-glucosaminidase (exo) XF1465 N-acetylglucosamine-6-phosphate deacetylase (nagA)

INTERMEDIARY METABOLISM/CENTRAL INTERMEDIARY METABOLISM/Amino sugars

82%

62% 48% 46% 36% 44% 41% 35% 30% 29% 46% 25% 35% 35% 37% 22% 38% 54% 39% 56% 27%

XF0187 sulfite synthesis pathway protein (cysQ/amtA)

INTERMEDIARY METABOLISM/ DEGRADATION/Degradation of small molecules

ATP synthase, A chain (atpB/uncB/papD) ATP synthase, alpha chain (atpA/uncA) ATP synthase, B chain (atpF/uncF) ATP synthase, beta chain (atpD) ATP synthase, C chain (atpE/uncE) ATP synthase, delta chain (atpH/uncH) ATP synthase, epsilon chain (atpC) ATP synthase, gamma chain (atpG/uncG/papC)

57%

46% 74% 45% 78% 56% 37% 46%

36%

XF1241 aconitate hydratase 1 (acnA/acn) ---% XF1316 ATP:GTP 3’-pyrophosphotranferase (relA) 42% XF0125 carbon storage regulator (csrA) 84% XF2352 cold shock protein (scoF) 69% XF2476 extragenic supressor (suhB/ssyA) 44% XF2342 heat-inducible transcriptional repressor (hrcA) 40% XF0122 LexA repressor (lexA) 76% XF1182 lipase modulator (act) 53% XF1813 methanol dehydrogenase regulatory protein 53% XF1830 nitrile hydratase activator 46% XF1843 nitrogen regulatory protein P-II (glnB) 87% XF0352 pentaphosphate guanosine-3’pyrophosphohydrolase (spoT) 47% XF2145 phosphate regulon transcriptional regulator (phoU) 43% XF1275 poly(hydroxyalcanoate) granule associated protein (phaF) 20% XF2691 RNA polymerase sigma-32 factor (rpoH) 80% XF1408 RNA polymerase sigma-54 factor (rpoN) 64% XF1350 RNA polymerase sigma-70 factor (rpoD) 63% XF2239 RNA polymerase sigma-H factor (algU/algT) 56% XF1407 sigma-54 modulation protein 44% XF0911 stringent starvation protein A (sspA/ssp/pog) 46% XF0912 stringent starvation protein B (sspB) 42% XF2165 transcription-related protein (tex) 59% XF0962 transcriptional regulator (gcvR) 26% XF1749 transcriptional regulator (opdE) 77% XF1858 transcriptional regulator (exsB) 49% XF2038 transcriptional regulator (paiB) 31% XF2228 transcriptional regulator (algH) 45% XF2085 transcriptional regulator (AcrR family) 23% XF1254 transcriptional regulator (AraC family) (araL) 34% XF0767 transcriptional regulator (ArsR family) (hlyU) 34% XF1540 transcriptional regulator (Crp/Fnr family) (clp) 85% XF0821 transcriptional regulator (Fur family) (zur) 38% XF2344 transcriptional regulator (Fur family) (fur) 85% XF1463 transcriptional regulator (LacI family) 32% XF0972 transcriptional regulator (LuxR/UhpA family) (agmR/glpR) 38% XF2608 transcriptional regulator (LuxR/UhpA family) (gacA) 44% XF0833 transcriptional regulator (LysR family) (cysB) 39% XF1132 transcriptional regulator (LysR family) (act) 33% XF1730 transcriptional regulator (LysR family) (yafC) 35% XF1752 transcriptional regulator (LysR family) 80% XF1768 transcriptional regulator (LysR family) (ycjZ) 45% XF0216 transcriptional regulator (MarR family) (prsX) 22% XF1354 transcriptional regulator (MarR family) (yybA) 28% XF1490 transcriptional regulator (MarR/EmrR family) 26% XF1996 transcriptional regulator (PbsX family) 35% XF0061 transcriptional repressor (korB) 28% XF2062 transcriptional repressor (korC) 69% XF1920 Trp operon transcriptional repressor (trpR) 36% XF1094 tryptophan repressor binding protein (wrbA) 39% XF1133 tryptophan repressor binding protein 33% XF1733 tryptophan repressor binding protein 31% XF1455 two-component system, hybrid sensor/regulatory protein 28% XF0322 two-component system, regulatory protein (tctD) 45% XF0389 two-component system, regulatory protein (popP/feuP/phoP) 54% XF0450 two-component system, regulatory protein (pilH) 50% XF1113 two-component system, regulatory protein 30% XF1626 two-component system, regulatory protein (algR) 47% XF1848 two-component system, regulatory protein (glnG/ntrC/glnT) 51% XF2336 two-component system, regulatory protein (colR) 51% XF2534 two-component system, regulatory protein (colR) 59% XF2545 two-component system, regulatory protein (pilR) 58% XF2578 two-component system, regulatory protein (actR) 40% XF2593 two-component system, regulatory protein (phoB) 56% XF0323 two-component system, sensor protein (tctE) 31% XF0390 two-component system, sensor protein (phoQ)36%

acetylglutamate kinase (argB) acetylornithine deacetylase (argE) argininosuccinate lyase (asl) argininosuccinate synthase (argG) gamma-glutamyl phosphate reductase (proA) glutamate 5-kinase glutamate synthase, alpha subunit (gltB/aspB) glutamate synthase, beta subunit (gltD/aspB) glutamine synthetase (glnA) N-acetyl-gamma-glutamyl-phosphate reductase ornithine carbamoyltransferase (argF) pyrroline-5-carboxylate reductase (proC) succinylornithine aminotransferase (argM/astC/cstC)

XF2226 aspartate carbamoyltransferase (pyrB) XF1107 carbamoyl-phosphate synthase large chain (carB/pyrA) XF1106 carbamoyl-phosphate synthase small chain (carA) XF2439 cytidylate kinase (cmkA) XF0988 dihydroorotase (pyrC) XF2571 dihydroorotate dehydrogenase (pyrD) XF0153 orotate phosphoribosyl transferase (pyrE) XF0034 orotidine 5’-phosphate decarboxylase (pyrF)

37%

52% 38%

55% 57% 45% 34% 73% 42% 50% 33% 30% 34% 59%

33%

40%

59% 28% 62%

56% 47% 45% 60%

56%

cystathionine beta-synthase (cysB) cysteine synthase cysteine synthase (cysK) D-3-phosphoglycerate dehydrogenase (serA) phosphoserine aminotransferase (serC) serine hydroxymethyltransferase (glyA)

31% 27% 51% 62% 49% 71%

XF2216 amidotransferase (hisH) 43% XF2220 ATP phosphoribosyltransferase (hisG) 43% XF2214 cyclase (hisF) 61% XF2219 histidinol dehydrogenase (hisD) 50% XF2218 histidinol-phosphate aminotransferase (hisC) 42% XF2217 imidazoleglycerolphosphate dehydratase/histidinol-phosphate phosphatase bifunctional enzyme (hisB) 50%

BIOSYNTHESIS OF SMALL MOLECULES/AMINO ACIDS BIOSYNTHESIS/Histidine

XF1334 3-dehydroquinate synthase (aroB) 47% XF2324 3-phosphoshikimate 1-carboxyvinyltransferase (aroE) 46% XF0212 anthranilate phosphoribosyltransferase (trpD) 48% XF0210 anthranilate synthase component I (trpE) 57% XF0674 anthranilate synthase component I (trpE) 36% XF1914 anthranilate synthase component I (trpE) 37% XF0211 anthranilate synthase component II (trpG) 67% XF1915 anthranilate synthase component II (trpG) 43% XF0036 aromatic-amino-acid aminotranferase (tyrB) 55% XF0047 catabolic dehydroquinase (aroQ) 65% XF1141 chorismate mutase 25% XF2338 chorismate mutase 29% XF1369 chorismate synthase (aroC) 62% XF0213 indole-3-glycerol phosphate synthase (trpC) 63% XF1374 N-(5’-phosphoribosyl) anthranilate isomerase (trpF) 43% XF2325 P-protein (aroQ/pheA) 75% XF0026 phospho-2-dehydro-3-deoxyheptonate aldolase (aroG) 61% XF0624 shikimate 5-dehydrogenase (aroE) 43% XF1335 shikimate kinase (aroK) 40% XF1376 tryptophan synthase alpha chain (trpA) 46% XF1375 tryptophan synthase beta chain (trpB) 65%

BIOSYNTHESIS OF SMALL MOLECULES/AMINO ACIDS BIOSYNTHESIS/Aromatic amino acid family

XF0603 XF0128 XF0831 XF2206 XF2326 XF0946

59%

34% 70% 39%

44%

65% 45%

68% 48% 30% 51% 51% 41%

68%

52%

50%

53%

66%

50%

60%

64% 68%

59% 68% 64% 37% 52%

43% 47% 57% 59% 61%

39% 45% 70%

27%

42% 44% 59% 47%

37% 38%

44% 52%

58% 43% 50% 52%

35%

50%

XF1270 lipoate biosynthesis protein B (lipB) XF1269 lipoic acid synthetase (lipA/lip)

55% 60%

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Lipoate

XF0228 2-amino-4-hydroxy-6hydroxymethyldihydropteridine pyrophosphokinase (folK) XF1456 2-amino-4-hydroxy-6hydroxymethyldihydropteridine pyrophosphokinase (folK) XF2431 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase (folD) XF2331 dihydrofolate reductase type III XF0436 dihydroneopterin aldolase (folB) XF0091 dihydropteroate synthase (folP) XF1946 folylpolyglutamate synthase/dihydrofolate synthase (folC/dedC) XF1983 GTP cyclohydrolase I (folE)

BIOSYNTHESIS OF SMALL MOLECULES/ COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Folic acid

XF1357 8-amino-7-oxononanoate synthase (bioF) 42% XF0189 adenosylmethionine-8-amino-7-oxononanoate aminotransferase (bioA) 47% XF1356 biotin biosynthesis protein (bioH/bioB) 36% XF0064 biotin synthase (bioB) 60% XF2099 biotin synthesis protein (bioC) 36% XF0356 cytochrome P-450 hydroxylase 37% XF0377 cytochrome P450-like enzyme (bioI) 37% XF2477 dethiobiotin synthetase (bioD) 44%

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Biotin

XF1297 gluconolactonase precursor XF0608 mannosyltransferase (mtfA)

BIOSYNTHESIS OF SMALL MOLECULES/SUGARS AND SUGAR NUCLEOTIDES BIOSYNTHESIS

XF2089 5’-nucleotidase XF2150 diadenosine tetraphosphatase (apaH) XF0150 dUTPase (dut/dnaS/sof) XF1831 formyltetrahydrofolate deformylase (purU) XF2354 hypoxanthine-guanine phosphoribosyltransferase (hpt) XF2599 phosphodiesterase-nucleotide pyrophosphatase precursor XF2353 purine nucleoside phosphorylase XF2332 thymidylate synthase (thyA)

BIOSYNTHESIS OF SMALL MOLECULES/NUCLEOTIDES BIOSYNTHESIS/Salvage of nucleosides and nucleotides

XF0762 deoxycytidine triphosphate deaminase (dcd) XF1441 phosphohydrolase (orfU1) XF1196 ribonucleoside-diphosphate reductase alpha chain (nrdA) XF1197 ribonucleoside-diphosphate reductase beta chain (nrdB) XF1448 thioredoxin reductase (trxB) XF0580 thymidylate kinase

BIOSYNTHESIS OF SMALL MOLECULES/NUCLEOTIDES BIOSYNTHESIS/2’-Deoxyribonucleotides

BIOSYNTHESIS OF SMALL MOLECULES/NUCLEOTIDES BIOSYNTHESIS/Pyrimidine ribonucleotides

79% 78%

63%

XF0587 5’-phosphoribosyl-5-aminoimidazole synthetase (purM/purG) XF0585 5’-phosphoribosylglycinamide transformylase (purN) XF0275 adenylate kinase (adk) XF1553 adenylosuccinate lyase (purB) XF0455 adenylosuccinate synthetase (purA) XF1949 amidophosphoribosyltransferase (purF) XF1975 bifunctional purine biosynthesis protein (purH) XF2429 glutamine amidotransferase (guaA) XF1976 glycinamide ribonucleotide synthetase (purD) XF0560 GMP synthase XF1503 guanylate kinase (gmk/spoR) XF2430 inosine-5’-monophosphate dehydrogenase (guaB) XF0458 nucleoside diphosphate kinase (ndk) XF2644 phosphoribosyl pyrophosphate synthetase (prsA/prs) XF2671 phosphoribosylaminoimidazole carboxylase, ATPase subunit (purK) XF2672 phosphoribosylaminoimidazole carboxylase, catalytic subunit (purE) XF0205 phosphoribosylaminoimidazolesuccinocarboxamide synthase (purC) XF1423 phosphoribosylformylglycinamidine synthetase (purL/purI)

BIOSYNTHESIS OF SMALL MOLECULES/NUCLEOTIDES BIOSYNTHESIS/Purine ribonucleotides

XF2213 phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP pyrophosphatase bifunctional enzyme (hisI/hisIE) 56% XF2215 phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase (hisA) 44%

60% 47%

48%

33% 36% 49%

53% 55% 64%

30% 28% 35% 36% 51% 46%

BIOSYNTHESIS OF SMALL MOLECULES/AMINO ACIDS BIOSYNTHESIS/Glycine-serine family|sulfur metabolism

XF0114 2,3,4,5-tetrahydropyridine-2-carboxylate N-succinyltransferase (dapD) XF1818 2-isopropylmalate synthase (leuA) XF2375 3-isopropylmalate dehydratase large subunit (leuA) XF2374 3-isopropylmalate dehydratase small subunit (leuD) XF2372 3-isopropylmalate dehydrogenase (leuB) XF1121 5,10-methylenetetrahydrofolate reductase (metF) XF2272 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferase (metE) XF1821 acetolactate synthase isozyme II, large subunit (ilvG) XF1473 aminotransferase (nifS) XF2396 aminotransferase (aspC) XF0118 asparagine synthase B (asnB) XF1371 aspartate-B-semialdehyde dehydrogenase (asd) XF2100 aspartyl/asparaginyl beta-hydroxylase (aspH) XF2443 beta-alanine synthetase XF2225 bifunctional aspartokinase/homoserine dehydrogenase I (thrA/thrA1/thrA2) XF1116 bifunctional diaminopimelate decarboxylase/aspartate kinase (lysA) XF1999 branched-chain amino acid aminotransferase (ilvE) XF0864 cystathionine gamma-synthase (metB) XF1481 diaminopimelate epimerase (dapF) XF1105 dihydrodipicolinate reductase (dapB) XF0099 dihydroxy-acid dehydratase (ilvD) XF0963 dihydroxydipicolinate synthase (dapA) XF2211 enolase-phosphatase (masA) XF2224 homoserine kinase (thrB) XF0863 homoserine O-acetyltransferase (met2) XF2465 homoserine O-acetyltransferase XF1822 ketol-acid reductoisomerase (ilvC) XF0116 succinyl-diaminopimelate desuccinylase (dapE) XF2223 threonine synthase (thrC)

BIOSYNTHESIS OF SMALL MOLECULES/AMINO ACIDS BIOSYNTHESIS/Aspartate family, pyruvate family

XF0998 XF2712 XF1427

XF2709 XF1842 XF1002

71% 58%

69%

XF1001 XF1000 XF1003 XF0999 XF1005 XF1004 XF2710

38%

two-component system, sensor protein (phs) 37% two-component system, sensor protein (flhS) 35% two-component system, sensor protein (algZ) 40% two-component system, sensor protein (ntrB) 37% two-component system, sensor protein (colS) 34% two-component system, sensor protein (pilS) 37% two-component system, sensor protein (actS) 25% two-component system, sensor protein (phoR)38%

BIOSYNTHESIS OF SMALL MOLECULES/AMINO ACIDS BIOSYNTHESIS/Glutamate family|nitrogen assimilation

XF0853 XF0973 XF1625 XF1849 XF2535 XF2546 XF2577 XF2592

70%

53% 58% 30% 49% 75% 62% 57% 54%

61% 58%

INTERMEDIARY METABOLISM/REGULATORY FUNCTIONS

XF1149 XF1145 XF1147 XF1143 XF1148 XF1146 XF1142 XF1144

INTERMEDIARY METABOLISM/ENERGY METABOLISM, CARBON/ATP-proton motive force interconversion

XF1535 citrate synthase (gltA) XF1548 dihydrolipoamide dehydrogenase (lpd) XF1549 dihydrolipoamide S-succinyltransferase (sucB) XF1554 fumarate hydratase (fumC) XF1855 fumarate hydratase (fumB) XF2596 isocitrate dehydrogenase (icd) XF2700 isocitrate dehydrogenase (icd) XF1211 malate dehydrogenase (mdh) XF0942 malate:quinone oxidoreductase (yojH) XF1550 oxoglutarate dehydrogenase (odhA) XF1073 succinate dehydrogenase iron-sulfur protein (sdhB) XF1072 succinate dehydrogenase, flavoprotein subunit (sdhA) XF1070 succinate dehydrogenase, membrane anchor subunit (sdhC) XF1071 succinate dehydrogenase, membrane anchor subunit (sdhD) XF2548 succinyl-CoA synthetase, alpha subunit (sucD) XF2547 succinyl-CoA synthetase, beta subunit (sucC)

53% 42%

49% 50% 52%

33% 57% 67% 44%

46% 45% 41%

gamma-glutamyltranspeptidase (ggt) glutamate-cysteine ligase precursor (gsh1) glutaredoxin (grxC) glutaredoxin-like protein glutathione synthetase (gshB/gsh-II) thioredoxin (trxA) thioredoxin (ybbN) thioredoxin (trxA/tsnC/fipA)

39% 57% 53% 54% 53% 34% 31% 49%

63% 65% 48% 80% 76% 52% 20% 45% 45% 54% 27%

32%

XF0193 6-pyruvoyl tetrahydrobiopterin synthase (ygcM) XF1457 pteridine reductase 1 (ptr1/ltdH) XF2604 pterin-4-alpha-carbinolamine dehydratase (phhB) 34%

63% 37%

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Biopterin

XF1886 phosphoglycerate mutase (pgmA)

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Cobalamin

XF0017 coproporphyrinogen III oxidase, aerobic (hemF) XF2306 delta-aminolevulinic acid dehydratase (hemB) XF0566 ferrochelatase (hemH) XF2302 glutamate-1-semialdehyde 2,1-aminomutase (hemL) XF2648 glutamyl-tRNA reductase (hemA) XF1627 hydroxymethylbilane synthase (hemC) XF1797 porphyrin biosynthesis protein (hemY) XF1512 protoporphyrinogen oxidase (hemK) XF0832 siroheme synthase (cysG) XF1332 uroporphyrinogen decarboxylase (hemE) XF1799 uroporphyrinogen-III synthase (hemD)

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Heme, porphyrin

XF0835 2-octaprenyl-6-methoxyphenol hydroxylase (ubiH/visB) 34% XF2471 3-demethylubiquinone-9 3-methyltransferase (ubiG/pufX) 49% XF0661 geranyltranstransferase (farnesyl-diphosphate synthase) (ispA) 54% XF0068 hydroxybenzoate octaprenyltransferase (ubiA/cyr) 52% XF1391 octaprenyl-diphosphate synthase (ispB/cel) 49% XF1538 ubiquinone biosynthesis protein (coq7) 29% XF1833 ubiquinone biosynthesis protein (aarF) 43% XF1487 ubiquinone/menaquinone transferase (ubiE) 58%

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Menaquinone, ubiquinone

XF0984 XF1428 XF2595 XF2394 XF1956 XF1199 XF2174 XF2698

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Thioredoxin, glutaredoxin, glutathione

XF1748 5-amino-6-(5-phosphoribosylamino)uracil reductase 21% XF0954 6,7-dimethyl-8-ribityllumazine synthase (ribH) 45% XF0953 GTP cyclohydrolase II/3,4-dihydroxy-2-butanone 4-phosphate synthase (ribA) 52% XF1992 riboflavin biosynthesis protein (ribA) 35% XF2419 riboflavin biosynthesis protein (ribF) 43% XF0952 riboflavin synthase alpha chain (ribE) 45% XF0950 riboflavin-specific deaminase (ribD/ribG) 51%

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Riboflavin

XF1048 1-deoxy-D-xylulose 5-phosphate reductoisomerase (dxr) XF0621 phosphomethylpyrimidine kinase XF0378 thiamin-phosphate pyrophosphorylase (thiE) XF0594 thiamine biosynthesis lipoprotein ApbE precursor (apbE) XF0783 thiamine biosynthesis protein (thiG) XF1888 thiamine biosynthesis protein (thiC/thiA) XF0956 thiamine-monophosphate kinase (thiL)

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Thiamin

XF1924 L-aspartate oxidase 40% XF1961 NH3-dependent NAD synthetase (nadE/adgA) 47% XF1097 nicotinate phosphoribosyltransferase (pncB) 43% XF1925 nicotinate-mononucleotide pyrophosphorylase (nadC) 54% XF1923 quinolinate synthetase A (nadA) 42%

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Pyridine nucleotides

XF0060 pyridoxal phosphate biosynthetic protein (pdxJ) XF0839 pyridoxal phosphate biosynthetic protein (pdxA) XF1337 pyridoxamine 5’-phosphate oxidase (fprA)

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Pyridoxine

XF0229 3-methyl-2-oxobutanoate hydroxymethyltransferase (panB) 55% XF0231 aspartate 1-decarboxylase precursor (panD) 53% XF0230 pantoate--beta-alanine ligase (panC) 46%

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Pantothenate

XF0466 molybdopterin biosynthesis protein (moeB) XF1545 molybdopterin biosynthesis protein

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Molybdopterin 25%

ATP-dependent DNA helicase (rep) ATP-dependent helicase (yoaA) ATP-dependent helicase chromosomal replication initiator (dnaA) DNA gyrase subunit A (gyrA) DNA gyrase subunit B (gyrB) DNA ligase (ligA/lig/dnaL/pdeC/lop) DNA polymerase I (polA/resA) DNA polymerase III holoenzyme chi subunit (holC) DNA polymerase III subunit (dnaX/dnaZ/dnaZX) DNA polymerase III, alpha chain (dnaE/polC) DNA polymerase III, beta chain (dnaN) DNA polymerase III, delta subunit (holB) DNA polymerase III, delta subunit (holA) DNA polymerase III, epsilon chain (dnaQ) DNA primase (dnaG/dnaP/parB) DNA primase (traC) DNA topoisomerase I (topA) DNA topoisomerase III (topB) DNA/pantothenate metabolism flavoprotein (dfp) glucose inhibited division protein (gidB) glucose inhibited division protein A (gidA) helicase, ATP dependent (hrpA) primosomal protein N’ (priA) replicative DNA helicase (dnaB/groP/grpA) TldD protein (tldD) topoisomerase IV subunit (parC) topoisomerase IV subunit B (parE)

chromosome segregation protein (smc) DNA-binding protein (bbh3) DNA-binding protein (fis) histone-like protein (hbs/hbsU) histone-like protein single-stranded DNA binding protein (ssb) single-stranded DNA binding protein (ssb) single-stranded DNA binding protein (ssb) single-stranded DNA binding protein (ssb) single-stranded DNA binding protein (ssb)

30% 29% 53% 73% 72% 54% 38% 91% 45% 53% 81%

53% 38%

32%

55% 54% 35%

27% 33% 58% 64% 54% 49% 56% 36% 54% 25%

51% 45% 65% 50% 44% 53% 61% 61% 64%

48% 51% 35% 29% 49% 46% 56% 41% 48%

40%

34%

43% 43% 43% 53% 57% 60% 46% 52%

66% 41%

XF2598 6-O-methylguanine-DNA methyltransferase (dat/dat1) 46% XF1262 7,8-dihydro-8-oxoguanine-triphosphatase (mutX) 38% XF1909 A/G-specific adenine glycosylase (mutY/mutB) 46% XF0050 DNA helicase II (uvrD/mutU/pdeB/rad/recL) 56%

MACROMOLECULE METABOLISM/DNA METABOLISM/Repair

XF0354 ATP-dependent DNA helicase (recG) XF1381 DNA helicase XF2248 DNA repair protein RecO (recO) XF0003 DNA replication and repair RecF protein (recF/uvrF) XF1892 endonuclease V (deoxyinosine 3’endonuclease) (nfi) XF0425 exodeoxyribonuclease V alpha chain (recD) XF0423 exodeoxyribonuclease V beta chain (recB/rorA) XF0422 exodeoxyribonuclease V gamma chain (recC) XF1425 integrase/recombinase (xerD/xprB) XF0743 integration host factor, alpha subunit (himA) XF2437 integration host factor, beta subunit (himD) XF1809 recombination protein (recR) XF2343 recombination protein N (recN) XF0123 recombination protein RecA (recA) XF1110 single-stranded DNA exonuclease (recJ) XF1483 site-specific recombinase (sss/xerC) XF2028 site-specific recombinase (rin)

MACROMOLECULE METABOLISM/DNA METABOLISM/Recombination

XF2558 XF0446 XF1998 XF1190 XF1943 XF0482 XF1392 XF1558 XF1644 XF1778

MACROMOLECULE METABOLISM/DNA METABOLISM/Structural DNA binding proteins

XF1935 XF2106 XF1383 XF2689 XF0361 XF1127 XF1353 XF1286

XF0002 XF0676 XF2178 XF2157 XF0430 XF2025 XF0920 XF1776 XF0149

XF0204

XF1807

XF2680 XF0882 XF1229 XF0001 XF2552 XF0005 XF2556 XF1103 XF0136

MACROMOLECULE METABOLISM/DNA METABOLISM/Replication

XF0144 biosynthetic arginine decarboxylase (speA) XF1539 S-adenosyl methionine decarboxylase proenzyme (speD) XF0143 spermidine synthase (speE)

45%

(3r)-hydroxymyristoyl ACP dehydrase (fabZ) 49% 3-alpha-hydroxysteroid dehydrogenase 34% 3-oxoacyl-[ACP] reductase 42% 3-oxoacyl-[ACP] reductase (fabG) 66% 3-oxoacyl-[ACP] synthase II (fabF) 60% 3-oxoacyl-[ACP] synthase III 24% acetoacetyl-CoA reductase (phbB) 65% acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha (accA) 64% acetyl-coenzyme A carboxylase carboxyl transferase subunit beta (accD/dedB/usg) 62% acyl carrier protein (acpP) 72% acyl carrier protein (acpC) 40% beta-hydroxydecanoyl-ACP dehydratase (fabA) 63% beta-ketoacyl-[ACP] synthase II (fabF/fabJ) 34% beta-ketoacyl-[ACP] synthase III (fabH) 62% biotin carboxyl carrier protein of acetyl-CoA carboxilase (accB/fabE) 53% biotin carboxylase subunit of acetyl CoA carboxylase (accC/fabG) 76% cardiolipin synthase 33% cardiolipin synthase (cls/nov) 34% ketoreductase 29% malonyl CoA-ACP transacylase (fabD) 51% phosphatidate cytidylyltransferase (cdsA/cds) 36% phosphatidylserine decarboxylase (psd) 44%

BIOSYNTHESIS OF SMALL MOLECULES/POLYAMINES BIOSYNTHESIS

XF1087 XF1209 XF2716 XF0670 XF1049 XF1365

XF0049

XF1639 XF1817 XF0048

XF0672 XF0771 XF0572

XF1467

XF1044 XF2269 XF0173 XF0671 XF0673 XF1970 XF0319 XF0203

BIOSYNTHESIS OF SMALL MOLECULES/FATTY ACID AND PHOSPHATIDIC ACID BIOSYNTHESIS

XF1916 coenzyme F390 synthetase

BIOSYNTHESIS OF SMALL MOLECULES/COFACTORS, PROSTHETIC GROUPS, CARRIERS BIOSYNTHESIS/Others

adenine-specific methylase DNA methylase DNA methyltransferase (sfiIM) DNA methyltransferase (gene with intron) DNA processing chain A (smf/dprA) methyltransferase methyltransferase (kpnIM) site-specific DNA-methyltransferase (sphIM) site-specific DNA-methyltransferase type I restriction-modification system (hsdM) type I restriction-modification system DNA methylase type I restriction-modification system DNA methylase (hsdM) type I restriction-modification system DNA methylase type I restriction-modification system DNA methylase (hsdM) type I restriction-modification system endonuclease type I restriction-modification system endonuclease (hsdR1) type I restriction-modification system endonuclease type I restriction-modification system endonuclease (hsdR) type I restriction-modification system specificity determinant type I restriction-modification system specificity determinant type I restriction-modification system specificity determinant (hsdS) type I restriction-modification system specificity determinant (hsdS)

XF2438 XF1151 XF1174 XF2631 XF1173 XF1165 XF0238 XF0107 XF1161 XF2560 XF1156 XF2580 XF2421 XF0434 XF1158 XF1175 XF1169 XF2561 XF2630 XF1166 XF1536 XF2636 XF2635 XF2637 XF1537 XF1162 XF1171 XF1159 XF1177 XF1168 XF0110 XF1155 XF0740 XF2424 XF1157 XF1154 XF1163 XF2643 XF2423 XF1206 XF1160 XF1152 XF1170 XF1534 XF1816 XF1207 XF2782 XF0739 XF2440 XF1153 XF1164 XF1167 XF2634 XF2559

30S ribosomal protein S1 (rpsA/ssyF) 30S ribosomal protein S10 (rpsJ) 30S ribosomal protein S11 (rpsK) 30S ribosomal protein S12 (rpsL) 30S ribosomal protein S13 (rpsM) 30S ribosomal protein S14 (rpsN) 30S ribosomal protein S15 (rpsO/secC) 30S ribosomal protein S16 (rpsP) 30S ribosomal protein S17 (rpsQ) 30S ribosomal protein S18 (rpsR) 30S ribosomal protein S19 (rpsS) 30S ribosomal protein S2 (rpsB) 30S ribosomal protein S20 (rpsT) 30S ribosomal protein S21 (rpsU) 30S ribosomal protein S3 (rpsC) 30S ribosomal protein S4 (rpsD) 30S ribosomal protein S5 (rpsE/spc) 30S ribosomal protein S6 (rpsF) 30S ribosomal protein S7 (rpsG/rps7) 30S ribosomal protein S8 (rpsH/rps8) 30S ribosomal protein S9 (rpsI/rps9) 50S ribosomal protein L1 (rplA/rpl1) 50S ribosomal protein L10 (rplJ) 50S ribosomal protein L11 (rplK/relC) 50S ribosomal protein L13 (rplM) 50S ribosomal protein L14 (rplN) 50S ribosomal protein L15 (rplO) 50S ribosomal protein L16 (rplP) 50S ribosomal protein L17 (rplQ) 50S ribosomal protein L18 (rplR) 50S ribosomal protein L19 (rplS) 50S ribosomal protein L2 (rplB) 50S ribosomal protein L20 (rplT) 50S ribosomal protein L21 (rplU) 50S ribosomal protein L22 (rplV) 50S ribosomal protein L23 (rplW) 50S ribosomal protein L24 (rplX) 50S ribosomal protein L25 (rplY) 50S ribosomal protein L27 (rpmA) 50S ribosomal protein L28 (rpmB) 50S ribosomal protein L29 (rpmC) 50S ribosomal protein L3 (rplC/rpl3) 50S ribosomal protein L30 (rpmD) 50S ribosomal protein L31 (rpmE) 50S ribosomal protein L32 (rpmF) 50S ribosomal protein L33 (rpmG) 50S ribosomal protein L34 (rpmH) 50S ribosomal protein L35 (rpmI) 50S ribosomal protein L36 (rpmJ) 50S ribosomal protein L4 (rplD) 50S ribosomal protein L5 (rplE/rpl5) 50S ribosomal protein L6 (rplF) 50S ribosomal protein L7/L12 (rplL) 50S ribosomal protein L9 (rplI)

MACROMOLECULE METABOLISM/RNA METABOLISM/Ribosomal proteins

XF2741

XF2726

XF2722

XF0296

XF2739

XF2725

XF2721

XF0295

XF2742

XF2728

XF2723

XF1368 XF2297 XF0641 XF1774 XF0924 XF0935 XF1968 XF1804 XF2313 XF2724 XF0297

MACROMOLECULE METABOLISM/DNA METABOLISM/Restriction, modification

XF0760 DNA mismatch repair protein MutL (mutL) XF0148 DNA repair protein (radC) XF1378 DNA repair protein (radA/sms) XF1299 DNA repair system specific for alkylated DNA (alkB) XF1326 DNA-3-methyladenine glycosidase (mag1) XF1806 DNA-damage-inducible protein (dinD/pcsA) XF2081 DNA-damage-inducible protein (dinJ) XF0647 endonuclease III (nth) XF2426 excinuclease ABC subunit A (uvrA) XF0967 excinuclease ABC subunit B (uvrB) XF2311 excinuclease ABC subunit C (uvrC) XF0164 exodeoxyribonuclease XF2022 exodeoxyribonuclease I (sbcB/xonA/cpeA) XF1933 exodeoxyribonuclease II I (xthA) XF0660 exodeoxyribonuclease small subunit (xseB) XF0755 exodeoxyribonuclease VII large subunit (xseA) XF0071 formamidopyrimidine DNA glycosylase (mutM/fpg) XF0170 formamidopyrimidine DNA glycosylase (mutM/fpg) XF1902 holliday junction binding protein, DNA helicase (ruvB) XF1904 holliday junction binding protein, DNA helicase (ruvA) XF1905 holliday junction resolvase, endodeoxyribonuclease (ruvC) XF1716 mismatch repair protein (mutS) XF0045 transcription-repair coupling factor (mfd) XF2692 uracil-DNA glycosylase (ung)

67% 72% 86% 81% 85% 62% 62% 65% 51% 70% 60% 60% 54% 67% 67% 84% 71% 67% 60% 56% 68% 64% 43% 69% 71% 80% 48% 69% 80% 53% 65% 69% 72% 53% 63% 52% 48% 48% 64% 71% 50% 61% 47% 57% 75% 73% 76% 60% 65% 54% 63% 52% 67% 52%

24%

24%

21%

27%

42%

53%

20%

---%

53%

61%

44%

---%

46% 40% 31% ---% 37% 37% 33% 39% 41% 25%

52% 54% 49% 55%

42%

69%

51%

51%

41%

28% 35% ---% 57% 67% 70% 83% 50% 43% 43% 46% 39%

42% 41% 61%

alanyl-tRNA synthetase (alaS/lovB) 56% arginyl-tRNA synthetase (argS) 40% asparaginyl-tRNA synthetase (asnS/tss) 66% aspartyl-tRNA synthetase (aspS) 53% cysteinyl-tRNA synthetase (cysS) 46% glutaminyl-tRNA synthetase (glnS) 54% glutamyl-tRNA synthetase (gltX) 54% glycyl-tRNA synthetase alpha chain (glyQ/glyS(A)) 74% glycyl-tRNA synthetase beta chain (glyS/glyS(B)) 39% histidyl-tRNA synthetase (hisS) 36% isoleucyl-tRNA synthetase (ileS/ilvS) 54% leucyl-tRNA synthetase (leuS) 58% lysyl-tRNA synthetase (lysU) 56% lysyl-tRNA synthetase (yjeA/genX) 47% methionyl-tRNA formyltransferase (fmt) 56% methionyl-tRNA synthetase (metG) 54% phenylalanyl-tRNA sinthetase beta chain (pheT) 42% phenylalanyl-tRNA synthetase alpha chain (pheS) 59% prolyl-tRNA synthetase (proS/drpA) 55% ribonuclease D (rnd) 31% ribonuclease P (rnpA) 37% S-adenosylmethionine: tRNA ribosyltransferase-isomerase (queA) 52% seryl-tRNA synthetase (serS) 59% threonyl-tRNA synthetase (thrS) 59% tRNA (guanine-N1-)-methyltransferase (trmD) 51% tRNA adenylyltransferase (cca) 52% tRNA delta(2)-isopentenylpyrophosphate transferase (miaA/trpX) 49% tRNA methyltransferase (trmU/asuE) 55% tRNA pseudouridine synthase A (truA/hisT/asuC/leuK) 52% tRNA pseudouridine synthase B (truB) 42% tRNA/rRNA methyltransferase 41% tryptophanyl-tRNA synthetase 34% tyrosyl-tRNA synthetase (tyrS) 53% valyl-tRNA synthetase (valS) 51%

38% 38% 57% 50%

42% 35%

---%

46%

54%

36% 43% 53%

72% 62%

67% 71% 43% 60% 49%

52% 55% 52% 56% 48% 48% 87%

oligoribonuclease (orn) polynucleotide phosphorylase (pnp) ribonuclease ribonuclease H (rnhA/rnh) ribonuclease HII (rnhB) ribonuclease III (rnc) ribonuclease PH (rph) ribonuclease T (rnt)

XF1111 XF0174

XF0576 XF0111 XF2649

XF1445 XF0235 XF0737 XF0857 XF2417 XF2298

XF1018 XF1436 XF1834 XF2629 XF2203 XF2473 XF2579 XF2628 XF2640 XF2553

arginine-tRNA-protein transferase (ate1) disulfide oxidoreductase (dsbA) disulphide isomerase elongation factor G (fusA) elongation factor P (yeiP) elongation factor P (efp) elongation factor Ts (tsf) elongation factor Tu (tufA) elongation factor Tu (tufB) initiation factor eIF-2B, alpha subunit-related initiation factor IF-1 (infA) initiation factor IF-2 (infB) initiation factor IF-3 (infC) L-isoaspartate O-methyltransferase (pcm) lipoprotein signal peptidase (lspA) low molecular weight phosphotyrosine protein phosphatase (stp1) metallopeptidase methionine aminopeptidase (map) peptide chain release factor 1 (prfA/sueB/uar) peptide chain release factor 2 (prfB) peptide chain release factor 3 (prfC)

MACROMOLECULE METABOLISM/PROTEIN METABOLISM/Translation and modification

XF1257 XF0239 XF2615 XF2158 XF1041 XF2246 XF1505 XF2146

61% 66% 62%

40% 37% 60%

50% 73% 49% 60% 47% 45%

25% 38% 30% 70% 39% 63% 50% 82% 82%

58% 64% 44% 60% 55% 50% 63% 47%

MACROMOLECULE METABOLISM/RNA METABOLISM/RNA degradation

ATP-dependent RNA helicase (rhlE) ATP -dependent RNA helicase (deaD) ATP-dependent RNA helicase (rhlB/mmrA) N utilization substance protein A (nusA) polynucleotide adenyltransferase (pcnB) pseudouridylate synthase (rluC) RNA polymerase alpha subunit (rpoA) RNA polymerase beta subunit (rpoB/groN/nitB/rif/ron) XF2632 RNA polymerase beta’ subunit (rpoC/tabB) XF1502 RNA polymerase omega subunit (rpoZ) XF2638 transcription antitermination factor (nusG) XF0955 transcription termination factor (nusB/ssyB) XF2699 transcription termination factor Rho (rho/nitA/psuA/rnsC/tsu) XF1108 transcriptional elongation factor (greA)

XF0192 XF0252 XF2696 XF0234 XF0227 XF2606 XF1176 XF2633

MACROMOLECULE METABOLISM/RNA METABOLISM/RNA synthesis, modification, DNA transcription

XF0237 XF2475 XF0428 XF0169 XF0134

XF1440 XF1373

XF2286 XF0736 XF0109 XF1362 XF0090

XF0445 XF0751 XF2781 XF1314

XF0741

XF2222 XF2418 XF2176 XF1112 XF2555 XF0927 XF0549 XF0742

XF1959

XF0124 XF0147 XF2563 XF1856 XF0995 XF1338 XF0822 XF1960

MACROMOLECULE METABOLISM/RNA METABOLISM/Aminoacyl tRNA synthetases, tRNA modification

XF0108 16S rRNA processing protein (rimM) XF2607 ribonuclease E (rne/ams/hmp1) XF1125 ribonuclease G (cafA/rng) XF0939 ribosomal large subunit pseudouridine synthase D (rluD/sfhB) XF2201 ribosomal protein L11 methyltransferase (prmA) XF2532 ribosomal protein S6 modification protein (rimK) XF1200 ribosomal small subunit pseudouridine synthase (rsuA) XF0236 ribosomal-binding factor A (rbfA) XF0441 ribosomal-protein-alanine acetyltransferase (rimI) XF2358 RNA methyltransferase (ygcA) XF1972 tRNA/rRNA methylase (yibK) XF1985 tRNA/rRNA methyltransferase (yjfH)

MACROMOLECULE METABOLISM/RNA METABOLISM/Ribosomes - maturation and modification

10kDa chaperonin (groES/htpA) 60kDa chaperonin (mopA/groEL) chaperone (lolA) DnaJ protein (dnaJ) DnaJ protein (dnaJ) DnaK protein (dnaK/grpF/groP/seg) DnaK supressor heat shock protein G (htpG) heat shock protein GrpE (grpE) peptidyl-prolyl cis-trans isomerase (tig)

64% 41% 33% 28% 28% 65% 66% 33% 36% 34% 63% 50% 44% 53% 47% 33% 81% 22% 28% 27% 25% 39% 40%

61%

69%

73%

73%

30% 47% 36% 46%

59% 77% 32% 41% 59% 72% 48% 58% 38% 32%

34% 58% 40% 39% 54%

59% 49% 38% 43% 32% 25% 60% 36% 37% 55% 35% 51%

24% 31%

36%

XF2392 autolytic lysozyme (lyc) XF0847 beta-hexosaminidase precursor (nahA) XF2184 membrane-bound lytic transglycosylase (mltB)

CELL STRUCTURE/MEMBRANE COMPONENTS/Outer membrane constituents

XF2780 60kDa inner-membrane protein XF1640 ankyrin-like protein (ank2) XF2235 bifunctional penicillin-binding protein 1C precursor (pbpC) XF0082 chaperone protein precursor (ecpD) XF0851 D-amino acid dehydrogenase subunit (dadA/dadR) XF2334 diacylglycerol kinase (dgkA) XF0340 disulfide bond formation protein B (dsbB) XF2433 epimerase/dehydratase protein (capD) XF0256 glucose-1-phosphate thymidylyltransferase (rfbA) XF0375 inner membrane protein (yjdB) XF0103 membrane protein XF0764 membrane protein XF0777 membrane protein (actII-3) XF2230 penicillin-binding protein 6 precursor (dacC) XF0151 phosphomannomutase (algC) XF1183 polysaccharide biosynthetic protein (vipA/tviB) XF2333 prolipoprotein diacylglyceryl transferase (lgt/umpA) XF1310 rod shape-determining protein (mreC) XF1313 rod shape-determining protein (mrdB/rodA) XF1979 transmembrane protein (ampE) XF1140 UDP-N-acetylglucosamine pyrophosphorylase (glmU)

CELL STRUCTURE/MEMBRANE COMPONENTS/Inner membrane

XF2310 CDP-diacylglycerol-glycerol-3-phosphate 3phosphatidyltransferase (pgsA) XF1031 glycerol-3-phosphate acyltransferase (plsB)

36%

30% 37%

49%

48% 37% 48% 28%

52%

78% 36% 25% 36% 22% 47% 45%

54% 54% 33% 37%

47% 40%

40% 26%

49% 37%

MACROMOLECULE METABOLISM/OTHER MACROMOLECULES METABOLISM/Phospholipids

XF2714 alpha-L-fucosidase (fucA1) XF2278 dolichol-phosphate mannosyltransferase (dpm1) XF0887 mannosyltransferase (mtfA)

MACROMOLECULE METABOLISM/OTHER MACROMOLECULES METABOLISM/Polysaccharides

XF2260 alanyl dipeptidyl peptidase XF0138 aminopeptidase A/I (pepA/xerB/carP) XF1488 aminopeptidase N XF2009 aminopeptidase P (pepP) XF1188 ATP-dependent Clp protease ATP binding subunit Clpx (clpX/lopC) XF1187 ATP-dependent Clp protease proteolytic subunit (clpP/lopP) XF0381 ATP-dependent Clp protease subunit (clpB/htpM) XF1443 ATP-dependent Clp protease subunit (clpA/lopD) XF1189 ATP-dependent serine proteinase La (lon/capR/deg/muc/lopA) XF2704 carboxyl-terminal protease (ctpA) XF1282 carboxypeptidase related protein XF0156 cysteine protease (cpr6) XF0015 dipeptidyl-peptidase XF1484 heat shock protein (hslV/htpO) XF1485 heat shock protein (hslU/htpI) XF0452 integral membrane protease (hflK/hflA) XF0453 integral membrane proteinase (hflC) XF0280 leucine aminopeptidase (pepA) XF0435 O-sialoglycoprotein endopeptidase (gcp) XF0127 oligopeptidase A (prlC/opdA/optA) XF1479 peptidase (ptrB/tlp) XF1944 peptidyl-dipeptidase (dcp) XF2241 periplasmic protease (mucD) XF0220 proline dipeptidase (pepQ) XF1510 proline imino-peptidase (pip/xap) XF2330 proteinase (slpD) XF0267 serine protease (pspB) XF1026 serine protease (pspB) XF1851 serine protease XF1823 tail-specific protease (prc/tsp) XF0816 zinc protease

MACROMOLECULE METABOLISM/PROTEIN METABOLISM/Protein degradation

XF0616 XF0615 XF1452 XF2233 XF2339 XF2340 XF0991 XF0978 XF2341 XF1186

MACROMOLECULE METABOLISM/PROTEIN METABOLISM/Chaperones

XF1940 peptide methionine sulfoxide reductase (msrA/pms) XF2642 peptidyl tRNA hydrolase (pth) XF0644 peptidyl-prolyl cis-trans isomerase (mip) XF0652 peptidyl-prolyl cis-trans isomerase (slyD) XF0838 peptidyl-prolyl cis-trans isomerase (surA) XF1191 peptidyl-prolyl cis-trans isomerase (ppiD) XF1212 peptidyl-prolyl cis-trans isomerase (ppiB/ppi) XF1605 peptidyl-prolyl cis-trans isomerase XF0442 phosphatidyltransferase (ptr) XF0926 polypeptide deformylase (def/fms) XF2156 protein phosphatase (yloO) XF1446 protein transferase (aat) XF2585 protein-L-isoaspartate O-methyltransferase (pcm) XF1051 ribosome recycling factor (frr) XF2244 signal peptidase I (lepB) XF1437 thiol:disulfide interchange protein (dsbA) XF0353 translation initiation inhibitor 25%

29% 37% 23% 23% 37%

37% 22% 31% 30% 43% 28% 34% 43% 38% 42% 31%

XF0077 XF0078 XF0080 XF0369 XF0370 XF0371 XF0372 XF0373 XF0478

fimbrial adhesin precursor (mrkD) fimbrial adhesin precursor (mrkD) fimbrial adhesin precursor (fimA/pilA) fimbrial assembly membrane protein (pilM) fimbrial assembly membrane protein (pilN) fimbrial assembly membrane protein (pilO) fimbrial assembly protein (pilP) fimbrial assembly protein (pilQ) fimbrial assembly protein (pilY1)

CELL STRUCTURE/SURFACE STRUCTURES 30% 32% 29% 61% 44% 43% 38% 34% 33%

XF1289 2-dehydro-3-deoxyphosphooctonate aldolase (kdsA) 44% XF0105 3-deoxy-D-manno-octulosonic acid transferase (kdtA/waaA) 44% XF2299 3-deoxy-manno-octulosonate cytidylyltransferase (kdsB) 50% XF1419 acetyltransferase (lpxD/firA/omsA) 26% XF0918 acyl-[ACP]-UDP-N-acetylglucosamine (lpxA) 32% XF1994 beta 1,4 glucosyltransferase 29% XF0612 dolichol-phosphate mannosyltransferase (dmt) 24% XF1638 dolichyl-phosphate mannose synthase related protein 34% XF0257 dTDP-4-dehydrorhamnose 3,5-epimerase (rfbD) 75% XF0258 dTDP-4-keto-L-rhamnose reductase (rfbC) 54% XF0255 dTDP-glucose 4,6-dehydratase (rfbB) 76% XF0611 dTDP-glucose 4-6-dehydratase (rfbB) 63% XF1637 glycosyl transferase (spsQ) 29% XF1082 lipid A 4’-kinase (lpxK) 38% XF0104 lipid A biosynthesis lauroyl acyltransferase (htrB/waaM) 40% XF1348 lipid A biosynthesis lauroyl acyltransferase (htrB/waaM) 26% XF1042 lipid A disaccharide synthase (lpxB/pgsB) 45% XF0879 lipopolysaccharide biosynthesis protein (rfbU) 26% XF2434 lipopolysaccharide core biosynthesis protein 37% XF0980 lipopolysaccharide synthesis enzyme (kdtB) 56% XF0778 O-antigen acetylase (oafA) 26% XF1413 polysialic acid capsule expression protein (kpsF) 45% XF2154 saccharide biosynthesis regulatory protein (opsX) 73% XF0176 sugar transferase 43% XF1045 UDP-3-O-(R-3-hydroxymyristoyl)-glucosamine N-acyltransferase (lpxD/firA/ssc) 41% XF1646 UDP-3-O-(R-3-hydroxymyristoyl)-glucosamine N-acyltransferase (lpxD/firA) 28% XF0486 UDP-3-O-[3-hydroxymyristoyl] glucosamine Nacyltransferase (lpxD) 31% XF0803 UDP-3-O-[3-hydroxymyristoyl] Nacetylglucosamine deacetylase (lpxC) 59% XF1415 UDP-N-acetylglucosamine 1carboxyvinyltransferase (murA/murZ) 62% XF1043 UDP-N-acetylglucosamine acyltransferase (lpxA) 44%

CELL STRUCTURE/SURFACE POLYSACCHARIDES, LIPOPOLYSACCHARIDES, AND ANTIGENS

XF0852 alanine racemase (alr) 45% XF0799 D-alanine--D-alanine ligase B (ddlB/ddl) 51% XF0855 lipoprotein (nlpD/lppB) 39% XF0416 lipoprotein precursor (vacJ) 36% XF1715 monofunctional biosynthetic peptidoglycan transglycosylase (mtgA) 50% XF0759 N-acetylmuramoyl-L-alanine amidase precursor (amiC) 45% XF2646 outer membrane lipoprotein precursor (lolB/hemM) 29% XF1896 outer membrane protein P6 precursor (pal/ompP6) 37% XF1614 penicillin binding protein (pbp4) 39% XF0368 penicillin-binding protein 1A (mrcA/ponA) 39% XF0884 penicillin-binding protein 1B (ponB) 41% XF1312 penicillin-binding protein 2 (mrdA/pbp2) 38% XF0792 penicillin-binding protein 3 (ftsI/pbpB) 43% XF0795 phospho-N-acetylmuramoyl-pentapeptidetransferase (mraY/murX) 59% XF2185 rare lipoprotein A (rlpA) 30% XF1309 rod shape-determining protein (mreB/envB/rodY) 77% XF1311 rod shape-determining protein (mreD) 32% XF0907 soluble lytic murein transglycosylase precursor (yjbJ) 42% XF0797 UDP-N-acetylglucosamine--N-acetylmuramyl(pentapeptide) pyrophosphoryl-undecaprenol (murG) 47% XF0798 UDP-N-acetylmuramate--alanine ligase (murC) 53% XF0276 UDP-N-acetylmuramate-L-alanine ligase (mpl) 50% XF1118 UDP-N-acetylmuramoylalanine--D-glutamate ligase (murD) 29% XF0793 UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (murE) 43% XF0794 UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6 diaminopimelate--D-alanyl-D-alanyl ligase (murF/mra) 41% XF2572 UDP-N-acetylpyruvoylglucosamine reductase (murB) 43% XF1050 undecaprenyl pyrophosphate synthetase (uppS/rth) 51%

CELL STRUCTURE/MUREIN SACCULUS, PEPTIDOGLYCAN

outer membrane antigen (oma) outer membrane hemin receptor (phuR) outer membrane protein (mopB) outer membrane protein (ompW) outer membrane protein outer membrane protein (ompP1) outer membrane protein outer membrane protein (romA) outer membrane protein (smpA) outer membrane protein H.8 precursor outer membrane protein Slp precursor (slp) peptidoglycan-associated outer membrane lipoprotein precursor (pcp/lpp) XF0033 PilE protein (pilE) XF0975 polyphosphate-selective porin O (oprO) XF0321 porin O precursor (oprO) XF0028 pre-pilin like leader sequence (fimT) XF1363 soluble lytic murein transglycosylase precursor (slt/sltY)

XF1046 XF0384 XF0343 XF0872 XF0873 XF1053 XF1123 XF1739 XF2345 XF1024 XF1811 XF1547

fimbrial assembly protein (pilC) fimbrial protein fimbrial protein fimbrial subunit precursor fimbrillin fimbrillin fimbrillin (fimA) fimbrillin outer membrane usher protein precursor (fimD) pilus biogenesis protein (pilJ) pilus biogenesis protein (pilI) pilus biogenesis protein (pilB) pilus protein (pilG) PilX protein (pilX) PilY1 gene product (pilY1) PilY1 gene product (pilY1) pre-pilin leader peptidase (xpsO) pre-pilin leader sequence (pilV) twitching motility protein (pilU) twitching motility protein (pilT) type 4 fimbriae assembly protein (pilZ) type IV pilin (pilE) 36% 46% 33% 55% 82% 28% 32% 30% 76% 33% 60% 74% 62% 35%

56% 48% 53% 42% 37% 37% 38% 48%

43% 28% 43% 41%

61%

65%

46%

50% 49%

53% 64%

50%

67%

26%

35% 28% 35%

ABC transporter sodium permease (natB) ammonium transporter (amtB) bacterioferritin (bfr) cation:proton antiporter (ybaL) ferric enterobactin receptor (bfeA) ferric enterobactin receptor (bfeA) ferrous iron transport protein ferrous iron transport protein B (feoB) ion transporter magnesium and cobalt transport protein (corA) manganese transport protein (mntH2) Mg++ transporter (mgtE) Na+/H+ exchange protein Na+:H+ antiporter (yjcE) periplasmic iron-binding protein (afuA) polar amino acid transporter (ybeX) potassium uptake protein (kup/trkD) sodium ABC transporter ATP-binding protein (natA) TonB-dependent receptor for iron transport (ybiL) TonB-dependent receptor for iron transport (yncD) voltage-gated potassium channel beta subunit

50%

23%

47%

47%

33% 64% 26% 24% 42% 24% 46% 48%

27% 57% 61% 57% 22% 22% 47% 52% 48%

52% 53%

60% 40%

64% 66% 31% 48% 39% 36%

40% 37% 24%

XF2455 heme ABC transporter ATP-binding protein (ccmA) XF2456 heme ABC transporter membrane protein (ccmB) XF2457 heme ABC transporter membrane protein (ccmC) XF2261 oligopeptide transporter XF0806 preprotein translocase SecA subunit (secA/prlD/azi/pea) XF1172 preprotein translocase SecY subunit (secY/prlA) XF2639 preprotein translocase subunit (secE) XF0224 preprotein translocase YajC subunit (yajC) XF2685 protease IV (sppA) XF0225 protein-export membrane protein (secD) XF0226 protein-export membrane protein (secF) XF0304 protein-export membrane protein (secG) XF1801 protein-export protein (secB) XF0562 sec-independent protein translocase (tatC/mttB) XF0073 signal recognition particle protein (ffh)

47% 63%

55% 33% 36% 40% 44% 38% 43% 43%

60%

48% 40%

50%

41%

CELLULAR PROCESSES/TRANSPORT/Protein, peptide secretion

XF0367

XF1496

XF0599

XF1015 XF1401 XF1398 XF2019 XF0324 XF0901 XF1903 XF2329

XF2328 XF1844 XF0395 XF2140 XF2134 XF2137 XF0932 XF0933 XF1426 XF0900

CELLULAR PROCESSES/TRANSPORT/Cations

XF2446 ABC transporter sugar permease XF2447 ABC transporter sugar permease (lacF) XF2448 ABC transporter sugar-binding protein (malE) XF0087 alpha-ketoglutarate permease symporter (kgtP/witA) XF0976 C4-dicarboxylate transport protein (dctA) XF1462 glucose/galactose transporter (gluP) XF1609 glucose/galactose transporter (gluP) XF2267 glycerol uptake facilitator protein (glpF) XF1406 HPr kinase/phosphatase (ptsK) XF0320 Mg++/citrate complex transporter (citN/citH) XF1402 phosphotransferase system enzyme I (phbI) XF1403 phosphotransferase system HPr enzyme (phbH) XF1067 sugar ABC transporter ATP-binding protein

CELLULAR PROCESSES/TRANSPORT/Carbohydrates, organic acids, alcohols

XF0411 ABC transporter nitrate permease (nrtB) XF2141 ABC transporter phosphate binding protein (phoX) XF2142 ABC transporter phosphate permease (pstC/phoW) XF2143 ABC transporter phosphate permease (pstA/phoT) XF1344 ABC transporter sulfate binding protein (sbp) XF1345 ABC transporter sulfate permease (cysU/cysT) XF1346 ABC transporter sulfate permease (cysW) XF0412 nitrate ABC transporter ATP-binding protein (nrtD) XF2144 phosphate ABC transporter ATP-binding protein (pstB/phoT) XF1347 sulfate ABC transporter ATP-binding protein (cysA)

CELLULAR PROCESSES/TRANSPORT/Anions

XF0408 amino acid transporter (yhdG) XF2730 amino acid transporter (rhtC) XF2207 cationic amino acid transporter XF2208 cationic amino acid transporter XF1891 di-tripeptide ABC transporter membrane protein (yclF) XF0656 glutamate symport protein (gltT) XF1937 proton glutamate symport protein (gltP)

CELLULAR PROCESSES/TRANSPORT/Amino acids, amines

XF1953 XF1954 XF2544 XF1955 XF0031 XF0032 XF1224 XF2537 XF0029 XF1632 XF1633 XF0677 XF0479

XF2538 XF2539 XF2542 XF0083 XF0487 XF0538 XF0539 XF1791 XF0081 XF0875 XF0944 XF1077 XF1081 XF1223 XF1302 XF1409 XF1475

46%

ABC transporter ATP-binding protein (yusC) 48% ABC transporter ATP-binding protein (yjjK) 69% ABC transporter ATP-binding protein (ycfV) 55% ABC transporter ATP-binding protein (msbA) 41% ABC transporter ATP-binding protein (yadG) 54% ABC transporter ATP-binding protein 42% ABC transporter ATP-binding protein 60% AB

CELLULAR PROCESSES/TRANSPORT/Other

XF1913 type V secretory pathway protein (mttC)

articles

159