RESEARCH ARTICLE

Inferring community dynamics of organohalide-respiring bacteria in chemostats by covariance of rdhA gene abundance

Ian P.G. Marshall1, Mohammad F. Azizian2, Lewis Semprini2 & Alfred M. Spormann1,3

1Department of Civil and Environmental Engineering, Stanford University, Stanford, CA, USA; 2Department of Chemical, Biological and Environmental Engineering, Oregon State University, Corvallis, OR, USA and 3Department of Chemical Engineering, Stanford University, Stanford, CA, USA

Correspondence: Alfred M. Spormann, Clark Center E-250, MC 5429, Stanford, CA 94305, USA. Tel.: +1 (650) 723 3668; fax: +1 (650) 724 4927; e-mail: [email protected]

Present address: Ian P.G. Marshall, Center for Geomicrobiology, Aarhus University, Aarhus, Denmark

MICROBIOLOGY ECOLOGY

Received 21 June 2013; revised 26 September 2013; accepted 27 September 2013. Final version published online 28 October 2013.

DOI: 10.1111/1574-6941.12235

Editor: Alfons Stams

Keywords: DNA microarray; reductive dehalogenases; Dehalococcoides mccartyi; Desulfitobacterium; chemostat.

Abstract

We have developed a novel approach to identifying and quantifying closely related organohalide-respiring bacteria. Our approach made use of the unique genomic associations of specific reductive dehalogenase subunit A encoding genes (rdhA) that exist in known strains of Dehalococcoides mccartyi and Desulfitobacterium and the distinguishing covariance pattern of observed rdhA genes to assign genes to unknown strains. To test this approach, we operated five anaerobic reductively dechlorinating chemostats for 3–4 years with tetrachloroethene and trichloroethene as terminal electron acceptors and lactate/formate as electron donors. The presence and abundance of rdhA genes were determined comprehensively at the community level using a custom-developed Reductive Dehalogenase Chip (RDH Chip) DNA microarray and used to define putative strains of Dehalococcoides mccartyi and Desulfitobacterium sp. This monitoring revealed that stable chemical performance of chemostats was reflected by a stable community of reductively dechlorinating bacteria. However, perturbations introduced by, for example, electron donor limitation or addition of the competing electron acceptor sulfate led to overall changes in the chemostat performance, including incomplete reduction of the chloroethene substrates, and in the population composition of reductively dehalogenating bacteria. Interestingly, there was a high diversity of operationally defined D. mccartyi strains between the chemostats, with almost all strains unique to their specific chemostats in spite of similar selective pressure and similar inocula shared between chemostats.

Introduction

While much is known about metabolic interactions of diverse microbial species within mixed communities, our understanding of the dynamics of closely related strains within a population of a microbial species in the same environment is much less developed. However, the abundance of related strains including associated subtle physiological differences has significant implications for the stability and resilience of a microbial ecosystem, for example in the human gut and in natural and engineered environments (Lozupone et al., 2012). The key challenge for identifying and monitoring closely related strains in a mixed community is mainly methodological: How can we detect and distinguish between closely related strains in the absence of faithfully isolating and characterizing all strains present? We took advantage of the distinct genome architecture and intrinsic metabolic dependence on the associated community of reductively dehalogenating Dehalococcoides mccartyi to explore one route into identifying and understanding the dynamics of closely related strains under steady conditions. Dehalococcoides mccartyi, belonging to a group of deeply branching Chloroflexi, is the key microbial group essential for complete reductive dehalogenation of the chloroethenes perchloroethene (PCE) and trichloroethene (TCE), the most prevalent groundwater contaminants in the United States (McCarty, 1997; Löffler et al., 2013).

© 2013 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved

FEMS Microbiol Ecol 87 (2014) 428–440


Community dynamics in dechlorinating chemostats

Other groups of bacteria are also involved in partial dehalogenation. The removal of an individual chloride from the carbon backbone of higher-chlorinated ethenes is typically carried out in a stepwise manner, with different strains carrying specific reductive dehalogenases catalyzing individual dehalogenating reactions. Reductive dehalogenases are a class of corrinoid- and Fe-S cluster-containing, oxygen-sensitive enzymes that catalyze the reduction of a carbon-halogen bond (Wohlfarth & Diekert, 1997). Reductive dehalogenases consist of the catalytically active subunit RdhA and the putative membrane anchor subunit RdhB. Different RdhAs are specific to the dehalogenation of different organohalide substrates. For example, PCE reduction is mediated by PceA (Neumann et al., 1995), TCE and cis-1,2-dichloroethene (DCE) reduction by TceA (Magnuson et al., 2000), and DCE/vinyl chloride (VC) reduction by VcrA (Müller et al., 2004) or BvcA (Krajmalnik-Brown et al., 2004). Thus, a metabolically interacting community containing multiple D. mccartyi (Dhc) strains is often necessary for the complete dechlorination of PCE and therefore successful bioremediation. The substrate specificity of the majority of reductive dehalogenases identified through genomic and metagenomic sequencing is unknown. Physiological studies and comparative genomic analyses have revealed that Dhc are strictly anaerobic and highly niche-adapted toward reductive dehalogenation. This specialized lifestyle is reflected by the high numbers of putative reductive dehalogenase genes in the genomes of Dhc strains. The sequenced Dhc genomes share a contextually conserved core that is interrupted by two high plasticity regions (HPRs) near the replication origin (Ori), which contain the majority of the reductive dehalogenase (rdh) genes (Kube et al., 2005; Seshadri et al., 2005; McMurdie et al., 2009).

Despite highly similar 16S rRNA genes, the sequenced genomes share only three core rdh genes, contrasted with the high number of rdh genes per genome (36 in strain VS). Interestingly and importantly for this study, each Dhc strain thus carries a distinct set of rdh genes, giving each strain idiosyncratic ecophysiological and genomic characteristics. A possible reason for this uniqueness of Dhc strains may be found in the evidence for frequent horizontal transfer of rdhA. This evidence has taken the form of the HPRs identified in sequenced Dhc genomes (Kube et al., 2005; Seshadri et al., 2005; McMurdie et al., 2009), unusual codon usage patterns (McMurdie et al., 2007), discontinuous sequence identity downstream of tceA (Krajmalnik-Brown et al., 2007), and comparison of vcrA-containing genomic islands (McMurdie et al., 2011). In this study, we exploited this linkage of specific rdh genes among each other and to the Dhc organismal core to define Dhc strains.

D. mccartyi populations have been monitored in the past based on their reductive dehalogenase genes through quantitative PCR (qPCR; Holmes et al., 2006), clone libraries (Futamata et al., 2009), and DNA microarrays (Taş et al., 2009; Dugat-Bony et al., 2011, 2012). However, such approaches carry no information on linkage of important functional genes to an organism’s core and therefore lack the power to comprehensively define strains within a population. To find linkages between rdhA genes, we constructed and used a tiling DNA microarray targeting 293 rdhA of known and unknown substrate specificity. The tiling microarray method avoids false-positive gene detection to a greater extent than traditional microarray methods that rely on a smaller number of probes targeting each gene (Marshall et al., 2012). We used this rdhA-based approach to identify different strains within a population based on the covariance of rdhA genes using hierarchical clustering, analogous to finding patterns of gene coregulation from pure-culture microarray data (Eisen et al., 1998) or differential coverage binning of metagenomes (Albertsen et al., 2013). Our method is based on the principle that rdhA genes belonging to the same genome will increase or decrease coherently in abundance, as reflected in changes in microarray hybridization intensity, by approximately the same magnitude from one time point to the next. Based on such coclustering patterns of rdhA, we can hypothesize which groups of rdhA genes belong to a single strain. More time points and greater abundance shifts will increase the accuracy of the strain determination. We used a computational simulation to test 10 000 random abundance shift patterns for a set of five Dehalococcoides strains to determine the accuracy of this approach. We then applied this method to bacterial communities in several long-term reductively dechlorinating chemostats to define dehalogenating strains without their microbiological isolation and to observe whether and how the dehalogenating bacterial community composition changed under stable and perturbed conditions.
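The co-clustering principle described above can be illustrated with a small sketch. The study itself used the R functions dist and hclust (see Materials and methods); the Python code below is only a stand-in that applies Euclidean distances and complete-linkage agglomerative clustering to invented log intensity ratios. The gene names, the values, and the distance cutoff of 1.0 are assumptions chosen for illustration and are not data from the chemostats.

```python
import math

# Hypothetical log intensity ratios for five rdhA genes at four time points.
# Genes 1-2 and genes 3-4 shift together, as expected for shared genomes.
profiles = {
    "rdhA_1": [0.0, 1.0, 2.0, 1.5],
    "rdhA_2": [0.1, 1.1, 1.9, 1.4],
    "rdhA_3": [0.0, -1.0, -2.0, -2.5],
    "rdhA_4": [-0.1, -0.9, -2.1, -2.4],
    "rdhA_5": [0.0, 0.1, -0.1, 0.0],
}


def euclidean(a, b):
    """Euclidean distance between two abundance profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage_groups(data, cutoff):
    """Merge the closest clusters until the next merge would exceed cutoff.

    Cluster-to-cluster distance is the largest pairwise gene distance
    (complete linkage), so stopping here matches cutting the dendrogram
    at height `cutoff`.
    """
    clusters = [[name] for name in sorted(data)]

    def linkage(c1, c2):
        return max(euclidean(data[a], data[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Each group is a hypothesized (operational) strain.
groups = complete_linkage_groups(profiles, cutoff=1.0)
print(groups)
```

In this toy case the two pairs of covarying genes are grouped together and the gene that barely changes remains on its own, which is the behavior the method relies on.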

Materials and methods

Chemostat operation

Five chemostats were used for this study: a 2-L chemostat inoculated with the Victoria-Stanford culture (Rosner et al., 1997; VS2L), a 5-L chemostat inoculated with cells from VS2L after 354 days of operation (VS5L), a 5-L chemostat inoculated with the Evanite culture (Yu & Semprini, 2002, 2004; EV5L), a 2-L chemostat inoculated with cells from EV5L after 68 days of operation (EV2L), and a 2-L chemostat inoculated with the Point Mugu culture (Keeling, 1998; Yu & Semprini, 2004; PM2L). Chemostats were operated under mostly the same conditions as for PM2L described earlier (Berggren et al., 2013) with modifications detailed in Table 1. The hydraulic and mean cell residence time for all chemostats was around 50 days, representing very slow rates of growth thought to be found in natural groundwater aquifers. Two concentrations of formate electron donor were used: 45 mM/90 milli-electron equivalents (mEq; excess) and 25 mM/50 mEq (limiting, as 30 mM/60 mEq formate would be necessary for the reduction of 10 mM/60 mEq TCE).

Table 1. Summary of reductively dechlorinating chemostats used in this study

Name   Inoculum            Electron donor   Donor conc. (mM)   Electron acceptor   Acceptor conc. (mM)   Total days
EV5L   Evanite             Formate          45                 TCE                 10                    1731
EV2L   EV5L                Formate          45/25*             TCE                 10                    1799
VS2L   Victoria/Stanford   Formate          25                 TCE                 10                    1425
VS5L   VS2L                Formate          45                 TCE                 10                    1071
PM2L   Point Mugu          Lactate          4.3                PCE/sulfate         1.12/1.0†             1771

*Formate concentration was modified after 168 days of operation.
†Sulfate was added to the influent after 733 days of operation.

Chemical analysis

Analysis of chloroethenes PCE, TCE, cis-dichloroethene (cis-DCE), VC, ethene, and molecular hydrogen (H2) was carried out by headspace gas chromatography as previously described (Azizian et al., 2008, 2010).

Simulated computational chemostat

A simulated chemostat study was performed with a simulated mixture of five D. mccartyi strains (strain 195, BAV1, CBDB1, VS, and GT) changing abundance randomly and measured across six different time points. This simulation was implemented in R in the script ‘Dhc_genome_array_simulation.R’ available at https://github.com/ianpgm/RDH_microarray. More information about this simulation is found in the Supporting Information, Supplemental Methods S1.

DNA microarray design

RdhA amino acid sequences were collected based on a search for relevant gene annotations in the NCBI protein database and Integrated Microbial Genomes and Metagenomes (IMG/M) database in August 2011 (Markowitz et al., 2012). The 889 rdhA predicted to encode reductive dehalogenase subunit A were retrieved from the NCBI nucleotide database and IMG/M. To ensure multiple copies of highly identical or duplicate nucleic acid sequences did not unnecessarily take up space on the microarray, CD-HIT (Li & Godzik, 2006) was used to cluster sequences at 97% nucleic acid sequence identity to choose a representative sequence for each cluster of high identity. This resulted in a final number of 293 rdhA genes for the RDH Chip. Of these 293 genes, 112 were from Dhc, 20 from Dehalogenimonas, 18 from Desulfitobacterium (Dsb), five from Shewanella, four from Dehalobacter (Dhb), three from Anaeromyxobacter, three from Sulfurospirillum, 85 from metagenomes and other isolation-independent sequencing, and 43 from various other genomes. At the time of design, a significantly greater number of rdhA sequences were known from Dehalococcoides than from other genera. The complete list of genes targeted by the array can be found in Supporting Information, Table S1. Overlapping 60-mer probe sequences were designed from each rdhA at 2.0× coverage in the same way as previously described for the Hydrogenase Chip (Marshall et al., 2012). These probe sequences were used to design an 8 × 15 K format DNA microarray that was synthesized by Agilent Technologies in Santa Clara, CA.

DNA extraction, amplification, labeling, and hybridization

DNA extraction from chemostat samples, multiple-displacement amplification, Cy3-labeling, and hybridization to the RDH Chip were carried out as previously described (Marshall et al., 2012).

DNA microarray data analysis

Determination and normalization of spot intensities, calculation of bright probe fractions, and calculation of log intensity ratios were carried out as previously described (Marshall et al., 2012). Briefly summarized here, each spot on the array was said to be ‘bright’ or ‘dark’ depending on whether its normalized fluorescence intensity was more or less than three times the median spot intensity, respectively. A gene was said to be present if more than 90% of its target probes were bright or, in other words, if its bright probe fraction (BPF) was > 90%. Linear regression for the test hybridization with genomic DNA from D. mccartyi VS was carried out using the lm function in R version 2.15.2 (R Development Core Team, 2012).

Nucleotide sequences of detected rdhA were clustered using CD-HIT at 90% identity, and genes from each CD-HIT cluster with the highest log intensity ratio at any time point were further analyzed. These genes were placed into dendrograms by hierarchical clustering based on log intensity ratios using the R functions dist and hclust with default settings. rdhA were grouped into operational strains using the rect.hclust function. For the PM2L chemostat, the changing abundances of these putative strains were compared with the changing abundances of Dhc phylotypes identified based on hupL clone library abundance from earlier work (Berggren et al., 2013) using Pearson’s coefficient of correlation as implemented in the cor.test function in R. These putative strains were placed into dendrograms (hclust with default settings) based on presence/absence of different rdhA using Jaccard distances implemented in the vegdist function of the vegan package version 2.0-5 in R. The different chemostats were clustered using the same method, only using the ‘single’ clustering method rather than the default ‘complete’.
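A minimal sketch of the presence call and the presence/absence comparison described above, written in Python rather than the R used in the study. The threshold of three times the median spot intensity and the BPF cutoff of 90% come from the text; the toy intensities, the probe count per gene, the gene names, and the helper functions are hypothetical, and the median is assumed to be taken over all spots of one array.

```python
from statistics import median

BRIGHT_FACTOR = 3.0  # bright if intensity > 3 x median spot intensity (from the text)
BPF_CUTOFF = 0.90    # gene called present if BPF > 90% (from the text)


def toy_spots(bright_counts, probes_per_gene=10, bright=1000.0, dark=100.0):
    """Build hypothetical (gene, normalized intensity) pairs for one array."""
    spots = []
    for gene, n_bright in bright_counts.items():
        spots += [(gene, bright)] * n_bright
        spots += [(gene, dark)] * (probes_per_gene - n_bright)
    return spots


def present_genes(spots):
    """Return the set of genes whose bright probe fraction exceeds the cutoff."""
    threshold = BRIGHT_FACTOR * median([intensity for _, intensity in spots])
    total, bright = {}, {}
    for gene, intensity in spots:
        total[gene] = total.get(gene, 0) + 1
        bright[gene] = bright.get(gene, 0) + (intensity > threshold)
    return {g for g in total if bright[g] / total[g] > BPF_CUTOFF}


def jaccard_distance(a, b):
    """1 - |intersection| / |union| for two presence/absence gene sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


# Number of bright probes (out of 10) per gene in two invented samples.
sample_a = present_genes(toy_spots({"rdhA_1": 10, "rdhA_2": 10, "rdhA_3": 9,
                                    "rdhA_4": 0, "rdhA_5": 0, "rdhA_6": 0}))
sample_b = present_genes(toy_spots({"rdhA_1": 10, "rdhA_2": 0, "rdhA_3": 0,
                                    "rdhA_4": 10, "rdhA_5": 2, "rdhA_6": 0}))
distance = jaccard_distance(sample_a, sample_b)
print(sorted(sample_a), sorted(sample_b), round(distance, 3))
```

Note that rdhA_3 in the first sample has a BPF of exactly 90% and is therefore not called present, because the rule requires a BPF strictly greater than 90%.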

Results

RDH chip testing and validation

The detection limit and specificity of the tiling DNA microarray approach were assessed in earlier work on the Hydrogenase Chip (Marshall et al., 2012). However, applying this approach to rdhA genes presented a new challenge: While genes encoding hydrogenases are highly diverse, many different rdhA sequences exhibit high nucleotide identity relative to each other, increasing the risk of cross-hybridization and false-positive gene identification on a microarray. Before analyzing rdhA content of the reductively dehalogenating chemostats, we first determined the extent of this cross-hybridization. In a test hybridization, genomic DNA from D. mccartyi strain VS was hybridized to the RDH Chip. The RDH Chip contained probes targeting 43 reductive dehalogenase (rdhA) genes from sources other than the VS genome with relatively high nucleotide identity (74.8–96.8%) to genes in the VS genome (Table S1). Twelve of these were from the genomes of D. mccartyi GT, nine from D. mccartyi 195, nine from D. mccartyi BAV1, three from D. mccartyi CBDB1, two from D. mccartyi MB, seven from various rdhA clone libraries, and one from the metagenome of the Dhc-containing KB-1 enrichment culture. In Fig. 1, the BPF for each of these is plotted against its sequence identity to its closest relative in the VS genome. The regression calculated showed a linear correlation between BPF and sequence identity for these rdhA genes (R² = 0.79), although this relationship appears to become nonlinear in the 90–100% sequence identity range. This shows that target gene sequences that are not perfectly identical to the probe sequence on the array will still hybridize to the microarray. This demonstrates that the BPF cutoff of 90% that was found to avoid false-positive gene detection in previous work (Marshall et al., 2012) is a necessary trade-off between the need to prevent false-positive gene detection and the need to detect genes in environmental samples that differ slightly (by up to c. 10% in nucleotide identity) from genes in sequenced genomes and metagenomes. In light of this information, we chose to continue to use the BPF cutoff of 90%.

Fig. 1. Bright probe fraction (BPF) and sequence identity for non-VS rdhA observed for D. mccartyi strain VS genomic DNA hybridized to the RDH Chip. Straight line shows linear regression, R² = 0.79. Axes: sequence identity to VS rdhA (%) and bright probe fraction (%).

Computational simulation

To estimate the expected accuracy of strain assignment based on covariance of rdhA abundance shifts, we used a computational simulation. The rdhA content of five different Dhc strains with sequenced genomes (195, GT, CBDB1, BAV1, VS) was used to simulate randomly shifting strain abundances across six time points. These genomes together possess 82 rdhA after clustering at 90% identity; the rdhA genes shared between multiple genomes are summarized in Supporting Information, Fig. S1. rdhA abundances were calculated based on strain abundances using the coefficient of variation measured for the hybridization experiment with DNA from Dhc strain VS. A total of 10 000 computational trials were performed, with each trial testing different strain abundance shifts and normally distributed rdhA abundances resulting from those shifts. Results of these 10 000 trials are summarized in Fig. S2. Each trial correctly identified an average of 3.31 of the five strains, and among the correctly identified strains, a median of 68.3% of their rdhA genes were correctly assigned. Factors such as a greater number of time points, a smaller coefficient of variation, a smaller number of strains, and fewer rdhA genes shared between multiple strains would improve the method’s accuracy.

Chemostat performance

VS2L, VS5L, and EV5L were maintained under constant operating conditions, while PM2L and EV2L were subjected to metabolic perturbations to investigate how reductive dechlorination performance and the community of reductively dechlorinating bacteria would respond. The perturbations were introduced after 733 days of operation by modifying the media fed into PM2L to include 1 mM sulfate, and after 180 days of operation by modifying the influent formate concentration to EV2L from 45 to 25 mM. Table 1 provides a summary overview of the chemostats. In the three chemostats with unchanged influent composition (VS2L, VS5L, and EV5L; Fig. 2), chloroethenes were for the most part completely dechlorinated to ethene, with the exception of the electron-donor-limited VS2L that never achieved complete dechlorination of VC (Fig. 2). In the two chemostats (PM2L and EV2L) where chemical parameters were changed, both changes negatively impacted the completeness of dechlorination (Fig. 3). This became evident in EV2L as VC reduction stalled after the influent formate concentration was reduced from excess to stoichiometrically limiting, and in PM2L as VC reduction declined after a c. 300 day lag time following sulfate addition, with cis-DCE dechlorination performance also declining during the final c. 150 days of operation. Chemostats fed excess formate produced 2–10 mM acetate, while those fed limited formate produced < 0.025 mM acetate.
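The electron balance behind ‘excess’ and ‘stoichiometrically limiting’ formate can be checked with a few lines of arithmetic. The sketch assumes two electron equivalents per mole of formate and six per mole of TCE reduced to ethene (three dechlorination steps of two electrons each), which reproduces the mEq values given in Materials and methods; the function names are illustrative only.

```python
# Electron equivalents (eq) per mole; assumptions consistent with the mEq
# values quoted in the text (45 mM formate = 90 mEq, 10 mM TCE = 60 mEq).
FORMATE_EQ_PER_MOL = 2  # formate oxidized to CO2 releases 2 electrons
TCE_EQ_PER_MOL = 6      # TCE -> cis-DCE -> VC -> ethene, 2 electrons per step


def donor_supply_meq(formate_mm):
    """Electron equivalents supplied by the influent formate (mEq)."""
    return formate_mm * FORMATE_EQ_PER_MOL


def acceptor_demand_meq(tce_mm):
    """Electron equivalents needed to reduce the influent TCE to ethene (mEq)."""
    return tce_mm * TCE_EQ_PER_MOL


demand = acceptor_demand_meq(10)  # influent TCE concentration from Table 1
required_formate_mm = demand / FORMATE_EQ_PER_MOL

balance = {}
for formate_mm in (45, 25):  # the two influent formate levels used
    supply = donor_supply_meq(formate_mm)
    balance[formate_mm] = "excess" if supply >= demand else "limiting"
    print(f"{formate_mm} mM formate: {supply} mEq supplied, "
          f"{demand} mEq needed -> {balance[formate_mm]}")
print(f"Formate needed for complete reduction: {required_formate_mm} mM")
```

The calculation ignores any donor consumed for cell synthesis or by competing processes, so it gives only the minimum formate requirement for dechlorination.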

I.P.G. Marshall et al.

Defining strains of reductively dehalogenating bacteria

rdhA composition of the five chemostats was monitored using the RDH Chip for five to eight time points per chemostat operation. Sampling time points were chosen at reasonably even time intervals but also at times with unusual performance or prior to changes in influent composition. Eighty unique rdhA genes were found to have a bright probe fraction of > 90% in at least one of the 36 samples analyzed (Tables S2–S6), which were condensed to 72 representative genes following application of the 90% nucleotide clustering algorithm described in the methods section. The different chemostats were clustered based on their rdhA content (Fig. 4a), revealing that VS2L and PM2L share 77–86% of their rdhA, EV2L and EV5L share 74–91% of their rdhA, and VS5L shared only 55–65% of its rdhA with the other chemostats. The fact that rdhA genes shared between the two VS-inoculated chemostats were not predictable based on inocula alone suggests that some cross-contamination of the different chemostats may have occurred. The chemostats, however, were constructed and operated to avoid cross-contamination. The 72 identified rdhA genes were placed into three groups based on the known origin of the query genes: 67 of the identified genes were from Dhc genomes or uncultured Dhc, three were from Dsb genomes, and two were from Dhb genomes. Based on the hierarchical clustering of the log intensity ratio of the 72 identified rdhA, genes were grouped together into operational strains (Fig. S3). This means that those genes that increased or decreased in abundance at the same time over the entire experiment were linked and hypothesized to belong to a single strain of dechlorinating bacteria. For the PM2L reactor, the number of hypothetical strains was five (four Dhc and one Dsb strain) based on earlier work using hupL clone libraries (Berggren et al., 2013), and thus operational strains were defined by setting the number of clusters to five (i.e. setting the ‘k’ parameter in the rect.hclust function to k = 5). As this corresponded to a Euclidean distance of c. 3.5, this distance was used to determine the probable number of strains for the remaining chemostats for which hupL clone library estimation of strain numbers had not been performed. For these reactors, the number of hypothetical strains was set based on a distance cutoff of 3.5 (i.e. setting the ‘h’ parameter in the rect.hclust function to 3.5). To see whether the same strain was present in multiple chemostats, strains were clustered based on their rdhA content. Only one operational strain had identical rdhA contents in multiple chemostats: VS2L.1 and PM2L.2 (sharing exclusively Dhb/Dsb rdhA genes; Fig. 4b). All other strains were unique, although many shared a subset of their rdhA genes.

Discussion

The RDH Chip was shown to be an effective method for determining trends in abundance shifts of reductively dehalogenating bacteria in laboratory chemostats. Most of the rdhA abundance changes occurred in discernible groups, suggesting that each group belongs to a single Dhc or Dsb genome. With these results, we set out to determine what broad principles of microbial community


[Fig. 2 plot area: panels EV5L, VS2L, and VS5L; axes: log intensity ratio, chloroethenes/ethene (mM), and hydrogen (nM); legend: pceA, tceA, vcrA, TCE, cis-DCE, VC, ethene, H2]

Fig. 2. Reductive Dehalogenase Chip log intensity ratios and chloroethene/ethene concentrations for the three chemostats in this study maintained with unchanging influent parameters. For the microarray data, each colored line represents a different rdhA gene, with colors corresponding to hypothesized strains identified in Fig. S3. rdhA that encode enzymes with known substrate specificity are identified by shapes with black borders: pceA (AAO60101.1, circle), tceA (ABB89703.1, square), vcrA (AAQ94119.1, triangle pointing up), bvcA (AAT64888.1, triangle pointing down). Vertical dashed lines correspond to days that samples were taken for microarray analysis. For the chloroethene/ethene data, TCE measurements are denoted by blue dots, cis-DCE by orange dots, VC by green dots, ethene by red dots, and H2 by open circles.


[Figure plot area; recoverable axis label: Hydrogen (nM)]