Diversity and Abundance of Nitrate Assimilation Genes in the Northern ...

Microb Ecol (2008) 56:751–764 DOI 10.1007/s00248-008-9394-7

ORIGINAL ARTICLE

Diversity and Abundance of Nitrate Assimilation Genes in the Northern South China Sea

Haiyuan Cai & Nianzhi Jiao

Received: 12 March 2007 / Accepted: 3 April 2008 / Published online: 15 May 2008 © Springer Science + Business Media, LLC 2008

Abstract Marine heterotrophic microorganisms that assimilate nitrate play an important role in nitrogen and carbon cycling in the water column. The nasA gene, encoding the nitrate assimilation enzyme, was selected as a functional marker to examine the nitrate assimilation community in the South China Sea (SCS). PCR amplification, restriction fragment length polymorphism (RFLP) screening, and phylogenetic analysis of nasA gene sequences were performed to characterize in situ nitrate assimilatory bacteria. Furthermore, the effects of nutrients and other environmental factors on the genetic heterogeneity of nasA fragments from the SCS were evaluated at the surface in three stations, and at two other depths in one of these stations. The diversity indices and rarefaction curves indicated that the nasA gene was more diverse in offshore waters than in the Pearl River estuary. The phylotype rank abundance curve showed an abundant and unique RFLP pattern in all five libraries, indicating that a high diversity but low abundance of nasA existed in the study areas. Phylogenetic analysis of environmental nasA gene sequences further revealed that the nasA gene fragments came from several common aquatic microbial groups, including the Proteobacteria, Cytophaga–Flavobacteria (CF), and Cyanobacteria. In addition to the direct PCR/sequence analysis of environmental samples, we also cultured a number of nitrate assimilatory bacteria isolated from the field. Comparison of nasA genes from these isolates and from the field samples indicated the existence of horizontal nasA gene transfer. Application of real-time quantitative PCR to these nasA genes revealed a great variation in their abundance at different investigation sites and water depths.

H. Cai : N. Jiao (*) State Key Laboratory of Marine Environmental Science, Xiamen University, Xiamen 361005, People’s Republic of China e-mail: [email protected]

Introduction Traditionally, the role of bacteria in the nitrogen cycle has been considered to be releasing inorganic nitrogen through the decomposition of organic matter, thereby recycling nitrogen and other nutrients to the phytoplankton [33]. However, recent studies have indicated that heterotrophic bacteria use dissolved inorganic nitrogen (DIN), as well as organic matter, and even compete with the phytoplankton for DIN [20, 28, 29]. Heterotrophic bacteria can even be responsible for about 40% of the total nitrate consumption [21]. To detect nitrate assimilatory bacteria (NAB) and examine their community structure in marine environments, one of the structural genes for nitrate assimilation (nasA) has been used as a marker gene. There is a clear genetic distinction between nasA genes from heterotrophic bacteria and autotrophic Cyanobacteria, making this gene a good genetic marker for NAB [1]. Heterotrophic bacterial demand for DIN probably depends on the C/N ratio of the substrates used for growth [10], and so, with high supply of dissolved organic carbon (DOC) relative to DIN, heterotrophic bacteria need more nitrate for C/N balance [18, 32]. Other studies indicate that nitrate concentration can be a good predictor for the variation in nasA associated community structure [2], and the abundance of the Marinobacter sp. nasA gene, for example, is positively correlated with nitrate concentration [2]. However, quantification of marine NAB was only carried out using the nasA gene of Marinobacter sp. Other reported NABs, including


Pseudoalteromonas sp., Vibrio sp., Flavobacterium sp., and Halomonas sp., have not been investigated. In addition, functional genes are frequently subject to horizontal gene transfer among different bacterial groups in marine ecosystems [41]. Because few NAB cultures have been isolated from natural marine environments, it remains unknown whether nasA trees are congruent with phylogenies based on 16S rRNA genes. The major aim of the present study was to test, qualitatively and quantitatively, the hypothesis that differences in NO3– concentration and C/N ratio result in variation in the abundance of key NAB members and in the structure of the NAB-associated community in marine environments. We took the northern South China Sea (SCS) as our study field, as it has distinct natural gradients in nutrients and substrate C/N ratios from the highly eutrophic Pearl River estuary to the oligotrophic oceanic waters. We employed molecular techniques and the cultivation approach to address the relevant issues by (1) examining the community structure of nasA genes through polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) and sequence analysis along the salinity and trophic gradients of the northern SCS; (2) quantifying nasA genes at representative sites and water depths; (3) correlating PCR-RFLP determined community composition and nasA gene abundance with environmental variables; and (4) sequencing and analyzing nasA genes and 16S rRNA genes amplified from pure cultures isolated from the SCS for comparison.


Sample Collection Water samples were collected from 5-, 80-, and 200-m depths at station A1 (115.8° E, 20.1° N), and 5 m at stations A4 (115.2°E, 20.7°N) and A10 (113.8°E, 22.2°N) (Fig. 1) in June 2004. Seawater (5 L) was filtered onto 47-mm diameter 0.2-μm pore-size polysulfone filters (PALL, Ann Arbor, MI, USA) at a pressure 20 bp (fragment sizes smaller than 20 bp are difficult to detect) [8]. For adequate separation, the resulting fragments must be distributed throughout the size range. Enzymes which generate the same size band from various heterotrophic marine bacteria are rejected as this may hide the complexity of the nasA gene pool. Bacterial nasA gene sequences from GenBank databases were examined using the Mapdraw program (Lasergene software package) [5]. Restriction enzymes that theoretically should distinguish different species were identified. Theoretical digestions were simulated from 12 sequences of the nasA gene deposited in the GenBank database: Clostridium oceanicum (AF300591, accession number), Fischerella sp. (AF300592), Leptolyngbya boryana (AF300596), Trichodesmium sp. IMA101 (AF300599), Vibrio diazotrophicus (AF300600), Cytophaga sp. (AF300605), Marinobacter sp. (AF300607), Pseudoalteromonas sp. (AF300608), Vibrio sp. (AF300610), Marinobacter sp. ‘Sargasso Sea s450ao’ (AF503848), Alteromonas sp. ‘Sargasso Sea s450ah’ (AF503850), and Psychrobacter sp. ‘Barents Sea 19’ (AF503853). Eighty clones were chosen randomly from each clone library, and about 750- to 800-bp fragments of the nasA


gene were reamplified from these clones and digested with the restriction endonucleases HaeIII and TaqI. RFLP was performed as described previously [50] with minor modifications. Briefly, PCR products were precipitated and resuspended in distilled water to increase the DNA concentration. Restriction digests were performed in 10-μL volumes containing 1 U of the restriction enzyme (TaKaRa, Dalian, China), and samples were then analyzed on a 3% agarose gel. One clone of typical RFLP type was chosen, and sequencing was performed using the double-stranded dideoxy method on an ABI 3770 automated sequencer (PE Applied Biosystems, Foster City, CA, USA) with sequencing primer M13–47; sequences of approximately 750 to 800 bp were acquired.

Statistical Analysis of nasA Libraries To evaluate richness and evenness, the different patterns from the RFLP analysis of each sample were used to calculate diversity indices for the nasA libraries with the free software PAST (http://folk.uio.no/ohammer/past). The following measures were calculated: (1) phylotype richness (S), the total number of distinct RFLP patterns in the sample; (2) library coverage (C), a measure of captured diversity, calculated as C = 1 − n/N, where n is the number of clone types that are encountered only once in a clone library and N is the total number of clones analyzed [7]; and (3) rarefaction curves [23] and diversity indices (Evenness, Shannon-Weaver, Simpson) [37, 39, 40].

Phylogenetic Analysis Primary comparative analysis of the nasA sequences was carried out using nucleotide-nucleotide BLAST (blastn; http://www.ncbi.nlm.nih.gov/BLAST). The amino acid sequences, translated using the EditSeq program (Lasergene software package) [3], were aligned with their closest relatives and some representative sequences of bacterial clones using the ClustalX 1.8 software package [42].
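The library statistics defined under Statistical Analysis of nasA Libraries can be illustrated with a minimal sketch. The authors computed these values with PAST; the Python below is not their code, and the function names (`library_stats`, `rarefy`) and the example counts are hypothetical. It assumes Shannon-Weaver H = −Σ p ln p, Simpson reported as 1 − Σ p², evenness as e^H/S, and rarefaction by the standard hypergeometric expectation. The evenness definition is an assumption that appears consistent with Table 2 (for A1-5m, e^3.109/33 ≈ 0.679).

```python
import math


def library_stats(counts):
    """Summarize one clone library.

    `counts` holds the number of clones observed for each RFLP pattern.
    Index definitions are assumptions (see lead-in), not taken from PAST's source.
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)                      # N: clones analyzed
    richness = len(counts)                   # S: distinct RFLP patterns
    props = [c / total for c in counts]

    shannon = -sum(p * math.log(p) for p in props)   # H = -sum(p ln p)
    simpson = 1.0 - sum(p * p for p in props)        # 1 - D form
    evenness = math.exp(shannon) / richness          # e^H / S

    singletons = sum(1 for c in counts if c == 1)    # n: patterns seen once
    coverage = 1.0 - singletons / total              # C = 1 - n/N

    return {
        "N": total,
        "S": richness,
        "coverage": coverage,
        "shannon": shannon,
        "simpson": simpson,
        "evenness": evenness,
    }


def rarefy(counts, n):
    """Expected number of RFLP patterns in a random subsample of n clones.

    Uses the hypergeometric expectation:
    E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)).
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the number of clones")
    denom = math.comb(total, n)
    # math.comb returns 0 when n exceeds N - N_i, so that pattern is certain to appear.
    return sum(1.0 - math.comb(total - c, n) / denom for c in counts if c > 0)


# Hypothetical library: four RFLP patterns observed 4, 2, 1, and 1 times.
example = [4, 2, 1, 1]
stats = library_stats(example)
curve = [rarefy(example, n) for n in range(1, stats["N"] + 1)]
```

Given the number of clones per RFLP pattern, `library_stats` returns S, C, and the three indices, and evaluating `rarefy` for n = 1 to N traces a curve of the kind plotted in Fig. 3.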
Methanobacterium thermoformicicum formate dehydrogenase was used as a reference [1], excluding residues corresponding to the primer sequences. Distance, parsimony, and maximum likelihood (ML) methods were used to infer phylogenetic relationships. The Kimura two-parameter model was used for inferring evolutionary distances; trees were constructed using the neighbor-joining algorithm and the TREECON software package (version 1.3b) [44, 45]. PROTPARS (PHYLIP, version 3.64; http://evolution.genetics.washington.edu/phylip.html) software was used for parsimony; the PHYLIP programs SEQBOOT, PROTPARS, and CONSENSE were used to derive an MP tree. Maximum likelihood analysis was


performed with PROML in PHYLIP using the Jones–Taylor–Thornton amino acid replacement model. Bootstrap analysis of 1,000 replicates was carried out for neighbor-joining trees and parsimony trees, and analysis of 100 replicates was carried out for ML trees. After comparison of trees generated using different methods, a consensus tree was constructed by introducing multifurcations where the topology was not resolved.

Real-time PCR Real-time PCR analysis was performed using the Rotor-Gene 3000 system (Corbett Research Co.) in 200-μL optical reaction tubes (Axygene Co.). The reactions were performed with 10 μL of TaKaRa SYBR Premix Ex Taq (a 2× concentrated mixture of TaKaRa Ex Taq HS, dNTP mixture, SYBR green I, and optimized buffer) and 300 nM of forward and reverse primer. A DNA standard curve was constructed by amplifying a 10× dilution series of a recombinant plasmid carrying a cloned nasA sequence (this study). The concentration and purity of the plasmid DNA were measured using an mba2000 UV–VIS spectrometer (PerkinElmer Co.). Using the formula “μg DNA × (pmol/660 pg) × (10⁶ pg/μg) × (1/number of nucleotides) = pmol DNA”, the plasmid DNA concentration was calculated as 1.55 × 10⁹ copies per microliter. Quantitative PCR analysis used a 1-μL aliquot of the community DNA, corresponding to 50 mL of in situ community DNA sample. Quantitative PCR was performed in duplicate for each sample, and negative controls (no DNA template) were included in each experiment. The quantification of nasA was based on a mean slope value derived from the standard curves. The real-time PCR program consisted of one cycle at 94°C for 45 s; six cycles of 94°C for 20 s with an annealing step starting at 62°C and decreasing by 1°C per cycle toward 56°C; and 30 additional cycles at 56°C for 20 s and 72°C for 30 s, with fluorescence acquisition for 15 s at 82°C (a temperature above the melting point of the primer-dimer).

The parameter CT (threshold cycle) was automatically determined with the Rotor-Gene software version 6.0 (Corbett Research Co.).
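The copy-number arithmetic and standard-curve quantification described in this section can be sketched as follows. This is an illustrative example, not the authors' analysis code: the function names, the plasmid length, and the Ct values are hypothetical, and the standard curve is assumed to have the usual form Ct = slope × log10(copies) + intercept.

```python
import math

AVOGADRO = 6.022e23           # molecules per mole
PG_PER_PMOL_PER_BP = 660.0    # average mass of one dsDNA base pair, pg per pmol


def plasmid_copies(mass_ug, length_bp):
    """Copies of a dsDNA molecule in `mass_ug` micrograms.

    Follows the formula quoted in the text:
    ug x (pmol / 660 pg) x (1e6 pg / ug) x (1 / length) = pmol DNA.
    """
    pmol = mass_ug * (1.0 / PG_PER_PMOL_PER_BP) * 1e6 * (1.0 / length_bp)
    return pmol * 1e-12 * AVOGADRO


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical 10x dilution series and Ct values (not data from the study).
standards = [1e3, 1e4, 1e5, 1e6, 1e7]
cts = [38.0 - 3.32 * math.log10(c) for c in standards]
slope, intercept = fit_standard_curve(standards, cts)
sample_copies = copies_from_ct(21.4, slope, intercept)
```

With a fitted slope and intercept, `copies_from_ct` converts a sample Ct to template copies per reaction; converting that to copies per volume of seawater then depends on the volume each reaction represents (50 mL per 1-μL aliquot in this study).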

The quantitative PCR program was followed by a melting curve analysis to determine the melting point of the double-stranded DNA (dsDNA) products. All positive real-time PCR products were further analyzed by agarose gel (1.5%) electrophoresis to confirm their molecular sizes.

Flow Cytometric Analysis Heterotrophic bacteria were analyzed using an Epics Altra II flow cytometer (Beckman Coulter, USA) equipped with a 15-mW 488-nm air-cooled argon-ion laser and a standard filter set-up. Procedures were as described by Jiao et al. [14]. SYBR green I (Molecular Probes) was employed as the nucleic acid stain [25]. Bacterial identification was based on plots of red fluorescence (FL3) vs. green fluorescence (FL1). Nutrients, temperature, salinity, and chlorophyll-a were analyzed according to the standard procedures described in the SCOR protocols [36].

Nucleotide Sequence Accession Number The sequences reported in this paper have been deposited in the GenBank database. Their accession numbers and the accession numbers of the reference sequences are shown in Figs. 4, 5, and 6.

Results Environmental Conditions at the Study Sites Remarkable differences were observed in the environmental variables among the three stations (Fig. 1) and with water depth (Table 1). A10 was more influenced by coastal freshwater than the other two stations, as shown by its much lower salinity, corresponding to its closer location to the shore. Biologically available nutrients in A10 were more abundant than in A4 or A1, consistent with the highest primary production, indicated by chlorophyll a, occurring in A10 (3.28 mg/m3). Similar to the chlorophyll a

Table 1 Environmental variables at the three stations

Station  Depth (m)  Temp (°C)  Salinity (‰)  NO3– (μM)  NO2– (μM)  PO43– (μM)  Chl a (mg/m3)  Bac* (10⁶ cells/mL)
A1       5          29.9       34.5          –          –          0.05        0.12           0.24
A1       80         22.1       34.7          0.06       –          0.10        0.39           0.38
A1       200        14.3       34.6          15.66      –          1.16        0.01           0.11
A4       5          29.4       33.9          –          0.12       –           0.12           0.43
A10      5          27.1       13.8          79.31      5.54       0.63        3.28           1.03

– Below detection limit
* Data from flow cytometry

concentrations, total bacterial abundances showed a similar relationship, with the highest abundance at A10. At station A1, the nutrient concentrations increased with depth. The chlorophyll maximum appeared in the 80-m layer (0.39 mg/m3) at a temperature of 22.1°C, which is suitable for bacterial production, as indicated by the highest total bacterial abundance among the different layers (Table 1).

Enzyme Choice for nasA RFLP All four enzymes discriminated the 12 reference sequences from GenBank. However, of the four enzymes, only HaeIII (data not shown) could recognize all the sequences. The sequences that could not be recognized by TaqI, HinfIII, or AluI accounted for 8–15% of the total sequences. TaqI recognized most of the sequences (all except 8%) and produced mostly fragments >50 bp, and so both HaeIII and TaqI were finally selected for RFLP analysis.

nasA Gene RFLP Analysis DNA samples from the three stations were examined for nasA genes using PCR with nasA-specific primers. The expected amplicon size (~800 bp) was observed in all samples. A total of 345 out of the 400 clones from the different gene libraries showed the correct fragment size. The inserts were digested with HaeIII and TaqI and screened by RFLP analysis. Clone nomenclature used a hyphenated system. The first element represents the station from which the DNA originated (A1, A4, or A10), the second represents the depth layer, and the third represents the clone’s number (for example, A1-80m-4 represents the fourth clone from the 80-m depth of station A1). RFLP pattern designations were assigned to the first clone from each station found with that representative pattern. We identified 24 different restriction patterns for 72 screened clones from the A1-200m depth gene library, and 23 different restriction patterns for 69 screened clones from the A10 gene library. A larger number of different patterns (Fig. 2) were detected in the gene libraries derived from A4, A1-5m, and A1-80m, although a smaller number of clones was screened by RFLP analysis. We obtained 33 different patterns for 66 A1-5m surface clones, 35 different patterns for 70 A1-80m clones, and 35 different patterns for 68 A4 clones. Thus, A10 and A1-200m had relatively less nasA diversity than A4, A1-5m, and A1-80m. Table 2 shows the diversity indices that were used to compare the gene libraries. The Shannon–Weaver, Simpson’s, and evenness values indicated that the diversity of nasA sequences from A4, A1-5m, and A1-80m differed greatly from that of sequences obtained from A10 and A1-200m. Rarefaction analysis (Fig. 3) confirmed the results obtained


with the diversity indices. In no instance did the rarefaction curves of stations A4, A1-5m, and A1-80m reach a clear saturation, indicating that further sampling of these clone libraries would have revealed additional diversity. The high levels of diversity of the A4, A1-5m, and A1-80m clone libraries were also reflected in the larger numbers of different RFLP types. An underestimation of species diversity at A4 and at the A1 surface and 80-m depths is therefore expected, as the coverage of these libraries was estimated to be 60%, 67%, and 63%, respectively (Table 2). Histograms of the pattern frequencies are presented in Fig. 2. All the clone libraries investigated contained several RFLP patterns that occurred repeatedly and a predominance of single, unique RFLP patterns (the right side of each panel in Fig. 2A). Several RFLP patterns were only present in one of the five established libraries. A1-200m and A10 were clearly dominated by several pattern types, whereas A1-5m, A1-80m, and A4 had several more abundant members (Fig. 2B). Clones with RFLP types that occurred more than once were selected from the five nasA gene libraries and then sequenced. Additional clones sharing an RFLP pattern were sequenced to confirm that clones with the same RFLP pattern had identical sequences.

Phylogenetic Analysis based on nasA Sequence Data Comparisons with the National Center for Biotechnology Information (NCBI) database by BLAST searches showed that all sequences were clearly homologous with known nasA sequences. A total of 50 nasA clones from the different stations were analyzed. The nucleotide sequence similarities between the nasA clones and those in the database ranged from 81 to 98%, and the deduced amino acid sequence similarities ranged from 89 to 100% (data not shown). A phylogenetic tree was constructed from the sequences obtained from selected clones and reference genes (Fig. 4). The formate dehydrogenase of M. thermoformicicum was used as an outgroup for phylogenetic distance analysis of the nasA sequences because it was the most distantly related, confirmed nasA sequence exhibiting homology at the DNA and amino acid levels. However, the reference genes were mostly isolated from uncultured marine bacterial clones in the NCBI database. Therefore, more nasA-positive marine bacteria need to be isolated and their nasA and 16S rRNA genes sequenced to give a clearer picture of the sequences in the clone library to which the bacteria belong. Trees showed three major clusters of nasA sequences with clones from marine samples clustering in three distinct subclusters (Fig. 4). One major cluster consisted of a subcluster of closely related nasA sequences of Synechococcus, the clones mostly from surface and 80-m depth layers in

Figure 2 Histogram of RFLP pattern frequency (A) and corresponding dominated patterns (B) in the five clone libraries. [Axes: RFLP pattern vs. frequency (number of clones); recovered panel labels: A1-80m, A1-200m, A4, A10]

station A1. In the second major cluster, the remaining nasA clones formed a cluster of mostly related nasA sequences of Cytophaga, mainly from stations A4 and A1. A third major cluster consisted of very distantly related nasA sequences from three environments. The five groups most commonly identified by RFLP were unknown groups related to Alteromonas, Halomonas, Marinobacter, Marinomonas, Psychrobacter, and Vibrio in the third major cluster (Fig. 4), as confirmed by RFLP and subsequent sequence analysis. The majority of the nasA sequences belonged to the gamma-Proteobacteria, with several exceptions. For example, clones A10-22 and A1-80m-53 both had only 76% nucleotide similarity to the sequence of Roseobacter denitrificans OCh 114, with amino acid similarities of 97% and 80%, respectively. Clone A1-80m-32 had only 70% nucleotide and 68% amino acid similarity to the sequence of Rhodobacter capsulatus, and clone A10-17 had only 69% nucleotide and 70% amino acid similarity to the sequence of Mesorhizobium loti. The low nucleotide and amino acid similarity values clearly demonstrated


Table 2 Diversity indices obtained for nasA libraries from SCS samples

                 A1-5m    A1-80m   A1-200m  A4       A10
S                33       38       24       36       23
C (%)            67       63       76       60       77
Shannon-Weaver   3.109    3.183    2.418    3.078    2.358
Simpson          0.9307   0.9278   0.8461   0.9144   0.8292
Evenness         0.6788   0.6349   0.4679   0.6031   0.4598

Figure 3 Rarefaction curves of observed diversity of nasA RFLP types in the nasA gene libraries. (A) Rarefaction curves obtained at station A1 at different depths (surface, 80 m, 200 m); (B) rarefaction curves obtained at different stations. [Axes: number of rarefied clones vs. expected number of sequence types]

an unexpectedly high diversity of the nasA gene in the alpha-Proteobacteria.

Phylogenetic Analysis of nasA-positive Bacterial Isolates based on nasA Gene and 16S rRNA Sequences To compare with the sequences obtained from the clone libraries, we isolated bacteria from samples of the three stations using NDGA and NFGA medium. Of the total of 120 strains isolated in situ using NDGA media, 45 strains were able to utilize NO3− as a sole N source (NFGE medium plus 10 mM NO3−) and were also PCR positive for nasA, which was consistent with the result of Allen et al. [1]. Among the nitrate-assimilating (nasA-positive) isolates whose 16S rRNA genes had been sequenced, the majority

were Pseudoalteromonas sp., Vibrio sp., Psychrobacter sp., and Halomonas sp. (Fig. 6), and the isolates that had the same nasA gene and 16S rRNA gene were omitted. Pseudoalteromonas sp. was distributed at the three stations, but not detected in the 200-m layer of station A1, which is consistent with the fact that only four clones were detected in the 200-m layer of the station A1 clone library; Vibrio sp. was mainly isolated from surface water of A4 and A10, and accounted for 10% of the A10 in the clone library; Psychrobacter sp. was mainly discovered in station A4, and also detected in A10 and A1; whereas the uncultured clones whose nasA sequences were similar to Psychrobacter sp. only accounted for 7% in the A4 clone library. Halomonas sp. was isolated only in the 80-m and 200-m layer in station A1; and finally, the clones whose nasA sequences were similar to Halomonas sp. only accounted for 3% of the A1-80m clone library. With the closest relative nasA sequences included, a neighbor-joining phylogenetic tree was inferred based on the assimilatory nitrate reductase gene (nasA) sequences of the isolates (Fig. 5). The tree obtained could be divided into six groups based on the location of nasA. The incongruity of 16S rRNA and nasA gene sequences became obvious when we compared phylogenetic affiliations based on the 16S rRNA genes (Fig. 6) with those based on the nasA gene of the isolates (Fig. 5). For example, the two isolates in Group I Pseudoalteromonas and the six isolates in Group II Pseudoalteromonas formed two separate monophyletic groups in the nasA analysis, although the similarity values of their 16S rRNA gene sequences with already known sequences (Pseudoalteromonas sp.) were remarkably high, ranging from 98 to 99%, and formed a monophyletic group. 
Another example was Group I and II Vibrio, where two known nasA sequences from public database sequences were included in Group I Vibrio, whereas the Group II Vibrio included six isolates whose 16S rRNA gene sequences had remarkable similarity values (ranging from 98% to 99%) with Vibrio sp., which were supported by phylogenetic analysis. In the Halomonas and Flavobacterium group, three isolates, which belonged to Halomonas sp., and one isolate, which belonged to Flavobacterium sp., formed a monophyletic group with strong bootstrap support (Fig. 5). However, in the Psychrobacter group, nasA trees

Figure 4 (a) Phylogenetic relationships of nasA gene sequences. (b) Alteromonas Group. A consensus tree was constructed by distance (neighbor-joining), maximum parsimony, and maximum likelihood methods by introducing multifurcations (dashed lines) where tree topology was not consistently resolved. Bootstrap values were generated from 1,000 replicates of neighbor joining and parsimony analysis and 100 replicates of maximum likelihood; bootstrap values >90%, ●; bootstrap values 50 to 90%, ○; bootstrap values of