R e vie w - CiteSeerX

34 downloads 0 Views 5MB Size Report
label viruses to enhance detection in cell cul- .... since the amplification of the virus in cell culture facilitates ..... novel polyomavirus associated with Merkel cell.
For reprint orders, please contact: [email protected]

Patrick Tang† & Charles Chiu Author for correspondence: British Columbia Centre for Disease Control, 655 West 12th Avenue, Vancouver, BC, V5Z 4R4, Canada n Tel.: +1 604 707 2616 n Fax: +1 604 707 2675 n [email protected]

Modern laboratory techniques for the detection of novel human viruses are greatly needed as physicians and epidemiologists increasingly deal with infectious diseases caused by new or previously unrecognized pathogens. There are many clinical syndromes in which viruses are suspected to play a role, but for which traditional microbiology techniques routinely fail in uncovering the etiologic agent. In addition, new viruses continue to challenge the human population owing to the encroachment of human settlements into animal and livestock habitats, globalization, climate change, growing numbers of immunocompromised people and bioterrorism. Metagenomics-based tools, such as microarrays and high-throughput sequencing are ideal for responding to these challenges. Pan-viral microarrays, containing representative sequences from all known viruses, have been used to detect novel and distantly-related variants of known viruses. Sequencing-based methods have also been successfully employed to detect novel viruses and have the potential to detect the full spectrum of viruses, including those present in low numbers.

Collectively, we face a future where emerging viral pathogens will become more prevalent, fueled by advances in medical therapies, changes in the environment, globalization and better disease surveillance. Modern immunosuppressive therapies, targeting neoplastic and certain chronic diseases, have created new opportunities for viruses that are normally nonpathogenic. As the barriers between animals and humans are removed by the expansion of the human population, and global warming extends the range of certain vectors and pathogens, new viruses will be regularly introduced into the human population. Compounding these forces is globalization, which facilitates the rapid spread of viruses from one continent to another. Recent examples include the West Nile Virus, which has emerged in North America [1] ; the SARS coronavirus, which triggered a worldwide epidemic [2–4] ; and the swine-origin influenza A/H1N1 virus, which has become the latest pandemic virus [5–9] . In addition, there exist many common clinical syndromes, such as respiratory, gastrointestinal and encephalitic infections, in which many of the causative agents are believed to be viruses but extensive conventional diagnostic tests have been unable to identify specific pathogens [10–15] . Finally, the specter of bioterrorism is ubiquitous, as the knowledge and tools for creating novel, more effective biological agents become more widespread [16–19] . Enhancements in our disease surveillance networks and techniques have allowed us to rapidly identify more instances of emerging infectious 10.2217/FMB.09.120 © 2010 Future Medicine Ltd

diseases [20–24] . However, the traditional paradigm of finding the unknown causes of these diseases relies upon diagnostic tests for known agents. This makes it difficult and inefficient, and sometimes impossible, to identify unexpected or novel viral pathogens using conventional methods. In contrast, more comprehensive metagenomics techniques, such as microarrays and high-throughput sequencing, are ideally suited for the task of systematic virus discovery. Although conventional methods for virus detection, such as virus culture, electron microscopy, serology and PCR, have been used successfully for identifying new viruses, these methods each have limitations for systematic virus discovery. Many viruses cannot be amplified in cell culture, or will not exhibit characteristic cytopathic effects during their growth [25] . Electron microscopy is a relatively insensitive method for virus detection and provides only morphologic clues regarding the identity of the virus [26] . Sera from previously infected hosts can be used to label viruses to enhance detection in cell culture or electron microscopy, but sera containing high titers of specific viral antibodies can be difficult to obtain [26,27] . PCR targeting conserved genetic regions can be used to detect variants of known viruses, but may not be able to detect more divergent or completely novel viruses for which there exists no a priori sequence data [28] . These conventional tests can be combined and run together, or in series, to detect a wider range of pathogens, but this approach can be costly, inefficient and time consuming [29,30] . Future Microbiol. (2010) 5(2), 177–189

Review

Future Microbiology

Metagenomics for the discovery of novel human viruses

Keywords high-throughput sequencing metagenomics n microarrays n novel viruses n virus n virus detection n n

part of

1746-0913

177

Review

Tang & Chiu

Metagenomics strategies to virus discovery typically employ an algorithmic approach (Figure 1) . Singleplex PCR assays run in parallel, or multiplex PCR assays, can be used to immediately screen for the most common viral pathogens associated with the infectious disease being studied [31,32] . The resulting PCR products from the various pathogens are resolved using gel electrophoresis, differentially labeled probes or microarrays to identify the virus present in the sample [33–35] . Samples that test negative for the common viruses are then tested by a broadrange virus detection assay, such as a pan-viral microarray, which is capable of detecting viruses

that are formally represented on the microarray, as well as novel variants of these viruses [36–38] . Any viruses detected by these methods can then be fully sequenced to detect potential variants of known viral species [39] . If viruses are not detected by PCR or microarray, samples are then subjected to high-throughput sequencing to detect novel viruses that do not have significant sequence homology to any known viruses [40,41] . Prevalence and association studies can be performed to link novel viruses identified by microarray or high-throughput sequencing with disease [42] . Based on the current costs for PCR, microarray ana­lysis and high-throughput

Clinical specimen

-

Single or multiplex PCR

+

Screen for sequence variants and perform epidemiological surveillance

-

Pan-viral microarray

+

A G C G T TA -

High-throughput sequencing

+

Whole-genome sequence recovery of novel virus

-

Data analysis and archival

Screening and association studies

Figure 1. Algorithmic approach to novel virus discovery. Clinical specimens that are suspected to contain viruses, but which test negative for pathogens by conventional microbiology tests, are screened for common viruses through PCR. If common viruses are present, these are sequenced to determine whether they are novel variants and to facilitate molecular epidemiology investigations. Specimens which are negative for common viruses are hybridized to a pan-viral microarray to detect a broader range of known and unknown viruses. If viruses cannot be found through microarray techniques, then the sample is subjected to high-throughput sequencing to search for viral sequences. Any viral signatures detected by microarray or high-throughput sequencing will lead to attempts to recover the entire virus genome sequence in order to better characterize the novel virus.

178

Future Microbiol. (2010) 5(2)

future science group

Metagenomics for the discovery of novel human viruses

sequencing, this algorithmic approach to virus detection and discovery allows for the systematic detection of a potentially unlimited range of viruses in the most cost-effective, yet comprehensive, manner. As the relative costs of these tests change and new technologies are made available, the algorithm will naturally evolve to reflect these changes. New viral diseases

The first clue to a new viral disease depends upon an astute clinician recognizing an unusual pattern of signs and symptoms that do not match those of known infectious diseases [43] . Alternatively, there may be a patient with a previously recognized clinical syndrome for which laboratory tests are negative for the known etiologic agents. Clusters of a new infectious disease may be recognized by epidemiologists using traditional or novel surveillance tools, such as analyzing worldwide news feeds, internet search-engine trends and open-source real-time surveillance systems [20–24] . Active laboratory surveillance of human, animal and environmental samples may detect new or unusual viruses, which will lead to additional studies to determine their role in human health [44–47] . From specimens submitted for diagnostic testing, the clinical microbiology laboratory may identify unusual laboratory findings that indicate the presence of a novel virus. There are also large collections of banked samples in most clinical laboratories for which conventional tests were unable to identify a pathogen, but which potentially harbor novel viruses. In the search for new viral diseases, the safer hunting grounds include many acute infectious diseases of unknown etiology [43] . The probability of finding new viruses in diseases, such as acute gastroenteritis, respiratory tract infections, meningitis and encephalitis, is high [10–15] . Many viruses are known to cause these syndromes, and the remaining fraction of cases of unknown etiology may also be largely attributed to as yet undiscovered viruses. As our techniques for virus discovery improve, we will enhance our ability to target viruses in the more perilous hunting grounds of idiopathic, neoplastic, autoimmune, inflammatory, endocrine and other chronic diseases, where the associations between pathogen and disease are more difficult to establish [43,48–52] . Environmental and host genetic factors may also play a significant role in the pathogenesis of these diseases, and complex studies may be required to ascertain whether virus infection is the primary cause or whether the etiology of the disease is multifactorial [53] . future science group

Review

Other efforts to identify new viral diseases target animals as sources of new and emerging human viruses [45,46] . Monitoring for novel viruses at the animal–human interface could provide an early warning system for new infectious diseases before they spread further into the human population. It is likely that a large pool of novel human viruses still exists owing to both viral evolution and undiscovered viruses in animals, thus supporting the role for active virus ­discovery programs [46,54] . The scarcity of effective antiviral therapies has been one major argument against the utility of efforts to discover new viral agents. However, it is clear that to begin development of new antiviral therapies, the viral etiologies of many idiopathic diseases must first be uncovered and the epidemiology of these new viruses understood [55] . Furthermore, even in the absence of effective antiviral drugs, it may be possible to discontinue unnecessary antibiotics and avoid additional diagnostic tests once the etiologic agent is identified, thus reducing adverse effects from unneeded drugs and reducing overall healthcare costs [56,57] . The identification of a novel virus also allows for the development of new viral detection and serologic assays to better understand the epidemiology and pathogenesis of the new infectious disease [58–60] . Finally, the expectations of clinicians and patients are greater as the tolerance for diagnostic uncertainty decreases in an age where information is both widely available and immediately accessible [61] . Traditional strategies

Although new metagenomics techniques promise to dramatically increase our rate of virus discovery, traditional laboratory techniques in virology have served well for virus discovery, even in modern times. The cornerstone of virus discovery has been the use of cell culture techniques, since the amplification of the virus in cell culture facilitates downstream ana­lysis with other laboratory methods, including electron microscopy, serology and PCR [62] . These other methods often complement the use of virus culture, providing additional evidence that the novel virus is the etiologic agent rather than a tissue culture contaminant or a transient virus unrelated to the disease [27,63,64] . The electron microscope has made important contributions to virus discovery, from historically being the primary method of virus detection in clinical samples, such as in the discovery of Norwalk virus [65,66] , to a rapid tool for identifying new viruses in cell culture, such as in the initial identification of www.futuremedicine.com

179

Review

Tang & Chiu

SARS coronavirus, Melaka and Saffold viruses from infected cell cultures displaying cytopathic properties [4,67,68] . With morphologic information from electron microscopy, it is then possible to narrow the identity of the virus into select taxonomic groups, thus allowing for the use of PCR targeting conserved genetic regions within these groups of viruses [69,70] . Serologic methods have been combined with electron microscopy, as was the case in the discovery of hepatitis B virus, where serum from infected patients was used to label virus particles for immuno-electron microscopy [71,72] . Immune serum has also been used to screen peptide expression libraries generated from samples infected with an unknown virus; the classic application of this method was the discovery of hepatitis C virus from patients with non-A, non-B hepatitis [73] . Nevertheless, the systematic use of these traditional methods for virus discovery is problematic, as they are often labor intensive and lack sufficient analytical sensitivity compared with modern molecular detection techniques. Viral nucleic acid extraction

Perhaps the most crucial step in the use of molecular techniques for virus discovery is the efficient extraction of viral nucleic acid from the clinical specimen [74–76] . Both the absolute and relative concentrations of viral relative to nonviral nucleic acids determine the success of subsequent molecular assays. For clinical specimens that are solid or viscous in nature, homogenization or enzymatic digestion of the specimen is necessary to liberate virus particles for extraction [77–79] . Various filtration and differential centrifugation techniques can be used to further concentrate virus particles [29,80,81] . The genomes of intact virus particles are protected by nucleocapsids, and endonuclease treatment can be used to remove unprotected RNA and DNA also present in the sample [82] . All of these steps increase the absolute and relative concentration of viral nucleic acids prior to nucleic acid extraction. Many manual and automated methods are available for nucleic acid extraction and the selection of the best method will need to balance extraction efficiency and throughput [83,84] . Unfortunately, viruses which exist in episomal forms or that have integrated into the host genome will not be readily amenable to these purification strategies. Additional methods to reduce the molecular complexity of such nucleic acid samples include selection for polyadenylated RNA [85] . The concentration of viral nucleic acids and reduction of sample complexity serve to increase the sensitivity of nucleic 180

Future Microbiol. (2010) 5(2)

acid amplification tests, subtraction methods, microarrays and high‑throughput sequencing for novel virus discovery. Nucleic acid amplification tests

Nucleic acid amplification tests, such as PCR, have traditionally been used for the detection of known viruses. The use of PCR requires a priori knowledge of the virus genome and, thus, precludes its use for the discovery of viruses whose genomes are highly divergent from that of known viruses [86] . However, novel variants of known viral species can be discovered through PCR detection followed by genome sequencing. New types of coronaviruses, herpesviruses, enteroviruses, parechoviruses and rhinoviruses have been identified in this way [87–92] . Designing degenerate or specific PCR primers targeting conserved targets within each viral taxon enhances the ability to detect more members of that group of viruses [93] . For example, the consensus-degenerate hybrid oligonucleotide primers (CODEHOP) methodology employs degenerate primers that target stable amino acid sequence motifs, while taking in account ­redundancies in codon sequences [28,94] . In the algorithmic approach to virus discovery (Figure  1) , PCR is useful for detecting the most common viral causes of a particular clinical syndrome. Samples already containing a common viral pathogen known to cause that syndrome are less likely to also yield additional novel viruses and, in most cases, can be excluded from further ana­lysis by pan-viral microarray or high-throughput sequencing [39,41] . In particular, multiplex PCR assays are a cost-effective and efficient way to detect common viral pathogens [31] . Several methods can be used to resolve the amplicons produced by multiplex PCR, including electrophoretic separation, multiple fluorescent probes, bead-based probes, mass ­spectroscopy and microarray [33–35,95–97] . Differential display & subtraction techniques

Methods to study differential gene expression are also applicable for the discovery of new viral pathogens [86,98–100] . In general, these methods require infected and uninfected samples, such that common genetic sequences between the two sets of samples are ignored or removed. For optimal efficiency, the two sets of samples must be highly related, such as being the same tissue type or derived from the same individual, and essentially identical, except for the presence or absence of the unknown virus [101] . In samples future science group

Metagenomics for the discovery of novel human viruses

where the relative abundance of viral nucleic acid is high, differential display methods may be able to reveal differentially expressed bands specific to the virus [102] . These methods first involve the use of nonspecific amplification techniques to create a representative library of each sample. The libraries are then subjected to gel electrophoresis and bands unique to the infected samples are sequenced to identify the virus. Subtractive hybridization or representational difference ana­ lysis (RDA) are alternative strategies that can be used for viruses that are less abundant in the sample [98,99] . For both of these strategies, uninfected driver DNA (or cDNA) is added in excess to a limited amount of infected tester DNA. In subtractive hybridization, single or multiple rounds of subtraction are performed to remove DNA species common to both the driver and tester sets. The remaining single-stranded DNA species unique to the infected samples can be cloned and sequenced to determine their identity. In RDA, the subtraction is combined with PCR amplification, such that only unique sequences in the tester sample are preferentially amplified. This process can be repeated to further enrich for differentially expressed PCR products, which can then be cloned and analyzed. RDA has been used to discover human herpesvirus 8 from Kaposi’s sarcoma lesions in HIV-infected patients [103] . Overall, these methods are laborious and relatively complex to perform, while lacking the sensitivity of high-throughput sequencing and computational subtraction methods. Pan-viral microarrays

Microarrays can be used for the identification of a defined set of known viruses using specific or random PCR primers, the resequencing of known viruses to identify sequence variants, or for the discovery of novel viruses using pan-viral microarrays [10,36,37,104–106] . Pan-viral microarrays attempt to represent all known viruses using tens of thousands of oligonucleotide probes. Known viruses can be detected on the microarray, as well as novel viruses with at least some degree of relatedness to known viruses [37] . The ability of the microarray to identify divergent viruses is based on a balance between the stringency of the hybridization conditions and the signal of the virus relative to background noise on the array. Although pan-viral microarrays can test for all viruses simultaneously, the requirement for nonspecific amplification by random priming reduces their analytical sensitivity, as compared with assays that detect a single or few agents at one time [39] . future science group

Review

Two of the more widely used pan-viral microarrays for viral discovery and diagnostics are the Virochip and the GreeneChip [36–38] . Both designs employ 70-mer oligonucleotide probes that target conserved genetic regions within each taxonomic group of viruses, for which sequence data is available. The increased length of the 70-mer oligonucleotide probes makes them more tolerant of mismatches and, thus, more suitable for the detection of novel viruses [36] . A large number of probes are required to provide redundancy and to generate nuances in hybridization patterns that differentiate between known and unknown viruses. The GreeneChip takes advantage of the Protein Families database of alignments in GenBank to generate viral probes, whereas the Virochip utilizes evolutionarily conserved sequences in the viral genomes represented in GenBank [36,107] . In addition, the third and fourth (current) iterations of the Virochip also employ oligonucleotide probes that are specific to individual virus species and subtypes, thus allowing for the enhanced identification of viruses at the species and subspecies levels, as well as increasing its power to differentiate novel from known viruses [108,109] . As new viruses are sequenced and added to the database, the probes on the microarray are regularly updated to reflect the new taxonomic landscape. After the extraction of nucleic acids, reverse transcription and random amplification are used to amplify the genetic material in the clinical sample. For pan-viral microarrays, random PCR amplification is commonly used, although linear amplification strategies (e.g., PhiХ29 strand-displacement or T7-based amplification) may enable quantitative viral detection [110,111] . The use of random amplification creates a problem for the detection of rare viruses when the amount of nonviral nucleic acids is high, such as the detection of viruses in tissue samples. To address this limitation, numerous preanalytical strategies have been employed to increase the virus concentration prior to random amplification. These include physical and enzymatic methods to separate viruses from human cells and bacteria, as outlined in the previous section. In diseases and samples where the concentration of viruses is high relative to other cells, such as respiratory secretions in acute respiratory infections or stool in acute gastroenteritis, the clinical sensitivity of the viral microarrays is comparable to that of specific PCR assays for the same viruses [112] . The use of other target amplification strategies for broad-spectrum microbial detection is covered in greater detail in an article by Leski et al. in this issue [113] . www.futuremedicine.com

181

Review

Tang & Chiu

The large number of probes, along with the occurrence of both predicted and unpredicted cross-hybridization, necessitates the use of computer algorithms and statistical methods to identify virus hybridization patterns on the microarray. One computational strategy is E-Predict, an algorithm that compares observed hybridization patterns on the Virochip to a database of theoretical virus hybridization patterns [114] . Theoretical hybridization patterns are generated through in silico determination of the free energy of hybridization of known viruses to each probe on the microarray. The statistical significance of each virus prediction can be calculated from the theoretical hybridization patterns, and modified in a Bayesian fashion from past cumulative microarray data. To enhance the sensitivity of detection, the statistical significance of the observed intensity of each oligonucleotide probe can also be calculated in reference to a set of control and historical samples. The ranking and taxonomic relationships of the most statistically significant oligonucleotides can then be used to determine the identity of the virus [57] . Similar algorithms also exist for determining the significance of pathogen detection microarray data, such as PhyloDetect [115] and DetectiV [116] . The Virochip has been used in the identification of the SARS coronavirus and for recovery of the first kilobase of genomic sequence [37,117] . Subsequently, it has detected many novel viruses including: n A murine retrovirus associated with prostate cancer [58] A new clade of rhinoviruses [112]

n

An avian bornavirus causing proventricular dilatation disease in birds [109]

n

A coronavirus in a beluga whale [118]

n

A novel human cardiovirus [108] Both the Virochip and Greenechip have been demonstrated to identify a wide range of viral respiratory pathogens [10,112] , including cases of bronchiolitis and pneumonia in critically ill patients after extensive conventional laboratory testing failed to yield a diagnosis [56,57] . n

Sequence-independent amplification

Sequence-independent amplification methods enable the unbiased detection of highly divergent or novel viruses that cannot be identified by other techniques. The general algorithm begins with nucleic acid amplification using various methods, including randomly primed PCR or restriction endonuclease digestion of the sample DNA 182

Future Microbiol. (2010) 5(2)

and ligation of adaptor sequences, followed by PCR amplification. RNA can be analyzed with a reverse transcription step prior to amplification. After sequence-independent amplification, subcloning and sequencing of the amplified fragments, homology searches in GenBank are conducted to identify sequences that possess similarities to known viruses. Fragments that do not match any known sequence, or that are distantly related to known viral sequences, are of particular interest as they may correspond to novel viruses. Traditionally, sequence-independent amplification methods are best applied to clinical samples, such as body fluids (e.g., respiratory secretions, stool, blood, cerebrospinal fluid and urine). These samples can be highly purified for viruses by filtration, ultracentrifugation, or endonuclease treatment, whereby free nucleic acids are degraded but viral nucleic acids remain protected within the proteinaceous capsid. Examples of sequence-independent amplification methods that have been used for virus discovery include sequence-independent single primer amplification and the adaptation of these techniques for high-throughput sequencing. Sequence-independent amplification methods have been successful in detecting many novel viruses over the past decade, including parvoviruses in human sera [119] , human metapneumoviruses, bocaviruses and polyomaviruses in respiratory secretions [120–123] , and human adenoviruses, parechoviruses, bocaviruses, ­cosaviruses and kobuviruses in stool [124–128] . High-throughput sequencing

The development of high-throughput secondgeneration sequencing technology has greatly impacted the field of virus discovery by enabling ‘deep’ sequencing of clinical samples with the production of hundreds of thousands to millions of sequence reads per run [40,129–131] . This is a sophisticated yet ‘brute force’ method to detect sequences corresponding to novel viruses that may be present at exceedingly low titers in clinical specimens, or may be too divergent from known viruses to be detected by PCR or microarray techniques [40,132] . High-throughput sequencing is a rapidly evolving field, and several new third-generation sequencing systems are already in development [129,133] . The two platforms most commonly used for virus discovery applications include the pyrosequencing system of 454 Life Sciences/Roche (CT, USA) and the ‘sequencing-by-synthesis’ system of Solexa/Illumina (CA, USA) [134,135] . For the detection of novel viruses, critical parameters of any high-throughput sequencing platform include the average read length future science group

Metagenomics for the discovery of novel human viruses

and total number of sequence reads generated. The sequences must be long enough to permit unique identification of sequences corresponding to novel viral genomes, and there must be enough reads generated for adequate sequencing depth. Although high-throughput sequencing methods provide unparalleled depth of sequencing, strategies to reduce host background sequences and purify viral nucleic acid sequences remain essential for detection sensitivity. One example is the recent discovery of a novel polyomavirus associated with Merkel cell carcinoma by massively parallel pyrosequencing [132] . Despite selection for polyadenylated RNA, and ana­lysis of over 400,000 sequences generated from tissue biopsy specimens, only two sequences were found to correspond to the novel polyomavirus. The Roche Genome Sequencer FLX™ can currently generate up to 1 million sequence reads per run with an average read length ranging from 200 to 450 bp, while the Illumina Genome Analyzer II can generate up to 80 million sequence reads per run, with fixed paired-end read lengths of 100–140 bp (paired-end reads are sequences generated from the opposite ends of the same individual DNA molecule). Samples can also be barcoded so that multiple samples can be analyzed per run with proportionally fewer sequences generated per specimen [136,137] . The ana­lysis of high-throughput sequencing data can be computationally challenging. A customized bioinformatics pipeline used in our laboratory for virus discovery from high-throughput sequencing reads is shown in Figure 2  [131,135] . Raw sequence reads are trimmed to remove low-complexity and duplicate reads and classified by sample barcode. Overlapping reads are computationally assembled and the resulting unique contiguous sequences (contigs) are subjected to similarity searches to nonredundant organism sequence databases. Sequence reads corresponding to human, bacterial and other nonviral organisms are computationally subtracted from the dataset. The remaining sequence reads are then compared with viral databases, using programs that examine homology, at first at the nucleotide and then at the amino acid levels (BLASTn, BLASTx and tBLASTx). Detected viral sequences corresponding to known viruses are taxonomically classified, and attempts are made to assemble viral contigs or genomes from sequences corresponding to these novel viruses. Finally, sequences that do not align to any viral or nonviral database are examined using future science group

Review

a remote homology detection program (such as PHI‑BLAST) for the presence of extremely divergent viruses [138] . Recently, the use of high-throughput sequencing technology has led to the identification of several novel viruses – an arenavirus related to lymphocytic choriomeningitis virus in transplant recipients [40] , a hemorrhagic fever virus from South Africa [134] , a human kobuvirus [135] and the aforementioned Merkel cell polyomavirus [132] . Given the high costs and lengthy times for sample preparation and data ana­lysis, unbiased high-throughput sequencing, at this point, is primarily used for clinical samples that test negative for known viral agents by all other methods. In the future, with rapid advances in technology, decreased sequencing costs and sample multiplexing, high-throughput sequencing may be a powerful tool, not only for viral discovery, but also for viral diagnostics in the clinical as well as research settings. No hits (unknown sequence reads) Assemble novel viral contigs or genomes

Classify detected viral sequences

Input raw sequence reads

High-throughput sequencing pipeline for virus discovery

Search viral database for sequence similarities

Remove human, bacterial and other nonviral reads

Trim low-complexity and duplicate reads; classify by barcode

Assemble overlapping reads

Search nonredundant database for sequence similarities

Figure 2. High-throughput sequencing pipeline for virus discovery. From the raw sequence data, low complexity and duplicate reads are first removed and any sample barcoding is resolved. De novo assembly of overlapping reads is performed, followed by subtraction of all nonviral sequences. The remaining sequences are aligned to existing viral sequence databases to classify and assemble novel viral sequences. Other methods are used to interpret the ‘no hits’ sequences, which have no similarity to either viral or nonviral sequences.

www.futuremedicine.com

183

Review

Tang & Chiu

Evidence of causation

The discovery of a novel virus in a clinical sample from an individual with an acute or chronic disease does not imply causation or even bona fide association of the virus with the disease. Much of the difficulty in assigning an etiologic role to newly discovered viruses comes from the fact that viruses routinely fail Koch’s

postulates for causality [49] . Many viruses cannot grow in culture or else have a restricted host range, thereby making it impossible or unethical to attempt experimental human infection and re-isolation of the virus. Alternatives to Koch’s postulates include serological ana­lysis, in which the identification of specific antibodies to the novel virus may imply a causal relationship, or

Executive summary New viral diseases Changes in the environment, human demographics, globalization and immunosuppressive therapies allow for the introduction of new viruses into the human population. n Many chronic diseases may have an unrecognized viral etiology. n Many human viruses remain undiscovered. n

Traditional strategies Electron microscopy, virus culture and serology have been used to discover many viruses, even in modern times. However, these methods are often laborious and lack analytical sensitivity in comparison with modern molecular detection techniques, such as nucleic acid amplification tests.

n n

Viral nucleic acid extraction Efficient concentration and recovery of viral nucleic acids from clinical specimens is the most crucial step in all molecular detection assays.

n

Nucleic acid amplification tests New members of the known virus families have been discovered through PCR tests targeting conserved regions in their genomes. The algorithmic approach to virus discovery involves the use of PCR to rule out the most common viral etiologic agents, followed sequentially by pan-viral microarray ana­lysis and high-throughput sequencing.

n n

Differential display & subtraction techniques These methods required two sets of related samples, one potentially infected by the virus, and one that is uninfected. However, they can be more laborious than the use of microarrays and lack analytical sensitivity relative to high-throughput sequencing.

n n

Pan-viral microarrays Pan-viral microarrays typically use long oligonucleotides that are more tolerant of mismatches and, thus, more suited to detecting novel viruses. n Pan-viral microarrays employ a large number of probes, representing all known plant, animal and human viruses. n Probes are designed to target both conserved genetic elements and species-specific regions in viral genomes. n Computational algorithms are required to interpret the hybridization patterns on these arrays. n

Sequence-independent amplification Sequence-independent amplification methods allow for the detection of novel viruses that are too divergent to be detected by other methods. n These methods use random priming or the ligation of adaptor sequences to the sample DNA (or cDNA) in order to amplify the nucleic acids in an unbiased manner. n The resulting library is then sequenced to determine the presence of new viruses. n

High-throughput sequencing High-throughput sequencing is a sophisticated, yet ‘brute force’ method to detect novel viral sequences in clinical samples. Hundreds of thousands to millions of sequences must be computationally analyzed to determine the presence of novel viruses, requiring significant computational processing power and storage. n These methods are still relatively costly and, thus, are only used further downstream in the algorithmic approach to virus discovery. n n

Evidence of causation Fulfillment of Koch’s postulates is often difficult for novel viruses and culture-independent methods are necessary to provide evidence of causality. n However, the amplification of the virus, either through culture or generation of an infectious clone, remains the most essential tool for proving disease causation. n

Future perspective Systematic approaches to virus discovery will continue to evolve as rapid advances in molecular detection platforms and metagenomics techniques continue. n These methods will also allow us to characterize and better understand the human virome and its role in human health and disease. n

184

Future Microbiol. (2010) 5(2)

future science group

Metagenomics for the discovery of novel human viruses

epidemiological criteria, such as whether the nucleic acid sequence corresponding to the virus is present in most cases of an infectious disease, or whether virus genome copy number correlates with severity of disease or pathology [64,139,140] . For ubiquitous viruses (e.g., Epstein–Barr virus, human herpesvirus 6 and torque teno virus ) or diseases in which host genetics and environmental factors play a significant role in the development of disease, proof of viral causation may be exceedingly difficult to obtain [141–143] . The ability to amplify a novel virus, either through cell culture directly from clinical samples or construction of an infectious clone, can be helpful not only in assessing causality but also in the detailed characterization of the virus. Cell culture experiments can be performed to assess host range and tissue and cell tropism. The mechanism of pathogenesis can be ascertained through studying the host–virus interaction at the molecular level. Concentrated titers of the virus can be used in ultrastructural studies or to produce antigens to develop assays for determining seroprevalence of the novel virus, either via the classical method of virus neutralization or via enzyme immunoassays. By providing a wealth of molecular and epidemiological data, these culture-dependent studies typically produce the most unambiguous evidence supporting the etiologic role of the novel virus in the infectious disease [144] . Future perspective

As the technology for metagenomic ana­lysis continues to advance, so will our approach to virus detection and discovery. Automation and miniaturization of nucleic acid amplification and detection platforms, incorporation of microfluidics, and development of more sensitive hybridization methods all hold the promise of moving

advanced molecular technology into the realm of point-of-care testing by clinicians [145–147] . The rapidly evolving techniques of next-generation sequencing will soon provide more affordable, higher throughput, single-molecule sequencing, such that sequence-based detection and diagnosis may eventually supplant nucleic acid amplification and microarray hybridization techniques  [148] . Along with advances in analytical techniques, new methods for preanalytical sample treatment, such as dielectrophoresis for the separation of viral particles, and more efficient methods to concentrate nucleic acids from samples will be crucial tools to improve the rates of virus detection and discovery [149,150] . Our enhanced ability to identify and characterize more viruses, many of which still remain unknown, will allow us to appreciate the full extent of their interactions with humans and other living organisms [46,151] . Akin to efforts to characterize the ocean viromes and the human microbiome, similar virus surveys must also be done to determine the human virome [29] . Although not formally represented in the tree of life, viruses should be considered an essential part of the human metagenome, analogous to our growing recognition of the role of the human microbiome in health and disease [53] . Financial & competing interests disclosure

Charles Chiu is the Director of the UCSF-Abbott Viral Diagnostics and Discovery Centre. The Centre is funded by a research award from Abbott Diagnostics, Inc. (IL, USA). The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed. No writing assistance was utilized in the production of this manuscript.

4.

Peiris JS, Lai ST, Poon LL et al.: Coronavirus as a possible cause of severe acute respiratory syndrome. Lancet 361(9366), 1319–1325 (2003).

Lanciotti RS, Roehrig JT, Deubel V et al.: Origin of the West Nile virus responsible for an outbreak of encephalitis in the northeastern United States. Science 286(5448), 2333–2337 (1999).

5.

CDC: CfDCaP Swine influenza A (H1N1) infection in two children – Southern California, March–April 2009. MMWR Morb. Mortal. Wkly Rep. 58(15), 400–402 (2009).

2.

Drosten C, Gunther S, Preiser W et al.: Identification of a novel coronavirus in patients with severe acute respiratory syndrome. N. Engl. J. Med. 348(20), 1967–1976 (2003).

6.

3.

Ksiazek TG, Erdman D, Goldsmith CS et al.: A novel coronavirus associated with severe acute respiratory syndrome. N. Engl. J. Med. 348(20), 1953–1966 (2003).

7.

Bibliography Papers of special note have been highlighted as: n of interest nn of considerable interest 1.

future science group

Review

8.

Perez-Padilla R, de la Rosa-Zamboni D, Ponce de Leon S et al.: Pneumonia and respiratory failure from swine-origin influenza A (H1N1) in Mexico. N. Engl. J. Med. 361(7), 680–689 (2009).

9.

Smith GJ, Vijaykrishna D, Bahl J et al.: Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature 459(7250), 1122–1125 (2009).

Dawood FS, Jain S, Finelli L et al.: Emergence of a novel swine-origin influenza A (H1N1) virus in humans. N. Engl. J. Med. 360(25), 2605–2615 (2009).

10. Chiu CY, Urisman A, Greenhow TL et al.:

Fraser C, Donnelly CA, Cauchemez S et al.: Pandemic potential of a strain of influenza A (H1N1): early findings. Science 324(5934), 1557–1561 (2009).

11. Denno DM, Stapp JR, Boster DR et al.:

www.futuremedicine.com

Utility of DNA microarrays for detection of viruses in acute respiratory tract infections in children. J. Pediatr. 153(1), 76–83 (2008). Etiology of diarrhea in pediatric outpatient settings. Pediatr. Infect. Dis. J. 24(2), 142–148 (2005).

185

Review

Tang & Chiu

12. Glaser CA, Gilliam S, Schnurr D et al.:

In search of encephalitis etiologies: diagnostic challenges in the California Encephalitis Project, 1998–2000. Clin. Infect. Dis. 36(6), 731–742 (2003).

24. Wilson K, Brownstein JS: Early detection of

isolation of viruses. In: Clinical Virology Manual. Specter S, Hodinka RL, Young SA (Eds). ASM Press, Washington, DC, USA, 27–42 (2000).

15. Svraka S, van der Veer B, Duizer E,

Dekkers J, Koopmans M, Vennema H: Novel approach for detection of enteric viruses to enable syndrome surveillance of acute viral gastroenteritis. J. Clin. Microbiol. 47(6), 1674–1679 (2009). 16. Cello J, Paul AV, Wimmer E: Chemical

synthesis of poliovirus cDNA: generation of infectious virus in the absence of natural template. Science 297(5583), 1016–1018 (2002). 17. Fraser CM, Dando MR: Genomics and future

microscopy: past, present and future. Biol. Cell 100(8), 491–501 (2008). n

engineering and biological weapons. New technologies, desires and threats from biological research. EMBO Rep. 4(57), S57–S60 (2003). 19. Tumpey TM, Basler CF, Aguilar PV et al.:

Characterization of the reconstructed 1918 Spanish influenza pandemic virus. Science 310(5745), 77–80 (2005). n

Reverse genetics used to resurrect the extinct 1918 pandemic influenza virus.

20. Espino JU, Wagner M, Szczepaniak C et al.:

Removing a barrier to computer-based outbreak and disease surveillance – the RODS Open Source Project. MMWR Morb. Mortal. Wkly Rep. 53(Suppl.), 32–39 (2004). 21. Freifeld CC, Mandl KD, Reis BY,

Brownstein JS: HealthMap: global infectious disease monitoring through automated classification and visualization of internet media reports. J. Am. Med. Inform. Assoc. 15(2), 150–157 (2008). 22. Ginsberg J, Mohebbi MH, Patel RS,

Brammer L, Smolinski MS, Brilliant L: Detecting influenza epidemics using search engine query data. Nature 457(7232), 1012–1014 (2009). 23. Madoff LC: ProMED-mail: an early warning

system for emerging diseases. Clin. Infect. Dis. 39(2), 227–232 (2004).

186

discovery and sequence recovery using DNA microarrays. PLoS Biol. 1(2), e2 (2003). n

Panmicrobial oligonucleotide array for diagnosis of infectious diseases. Emerg. Infect. Dis. 13(1), 73–81 (2007). 39. Quan PL, Briese T, Palacios G, Ian Lipkin W:

diagnostic virology. Ultrastruct. Pathol. 11(5–6), 681–685 (1987). 28. Rose TM: CODEHOP-mediated PCR – a

Rapid sequence-based diagnosis of viral infection. Antiviral Res. 79(1), 1–5 (2008). 40. Palacios G, Druce J, Du L et al.: A new

powerful technique for the identification and characterization of viral genomes. Virol. J. 2, 20 (2005). 29. Anderson NG, Gerin JL, Anderson NL:

Global screening for human viral pathogens. Emerg. Infect. Dis. 9(7), 768–774 (2003).

arenavirus in a cluster of fatal transplantassociated diseases. N. Engl. J. Med. 358(10), 991–998 (2008). 41. Lipkin WI: Pathogen discovery. PLoS Pathog.

4(4), e1000002 (2008). 42. Allander T, Jartti T, Gupta S et al.: Human

bocavirus and acute wheezing in children. Clin. Infect. Dis. 44(7), 904–910 (2007).

30. Dong J, Olano JP, McBride JW, Walker DH:

Emerging pathogens: challenges and successes of molecular diagnostics. J. Mol. Diagn. 10(3), 185–197 (2008). n

Comprehensive review of emerging and re-emerging pathogens and the role of molecular diagnostics.

43. Fredricks DN: Diseases of unknown etiology:

the role of infectious agents. In: Infectious Diseases. Cohen J, Powderly WG (Eds). Elsevier, London, UK (2004). n

31. Mahony JB, Blackhouse G, Babwah J et al.:

Cost analysis of multiplex PCR testing for diagnosing respiratory virus infections. J. Clin. Microbiol. 1, 1 (2009). 32. Rohayem J, Berger S, Juretzek T et al.:

A simple and rapid single-step multiplex RT-PCR to detect Norovirus, Astrovirus and Adenovirus in clinical stool samples. J. Virol. Methods 118(1), 49–59 (2004). 33. Roh KH, Kim J, Nam MH et al.:

Comparison of the Seeplex reverse transcription PCR assay with the R-mix viral culture and immunofluorescence techniques for detection of eight respiratory viruses. Ann. Clin. Lab. Sci. 38(1), 41–46 (2008). 34. Molenkamp R, van der Ham A, Schinkel J,

Beld M: Simultaneous detection of five different DNA targets by real-time Taqman PCR using the Roche LightCycler480: application in viral molecular diagnostics. J. Virol. Methods 141(2), 205–211 (2007). 35. Raymond F, Carbonneau J, Boucher N et al.:

Comparison of automated microarray detection with real-time PCR assays for detection of respiratory viruses in specimens obtained from children. J. Clin. Microbiol. 47(3), 743–750 (2009). Future Microbiol. (2010) 5(2)

Use of a microarray to detect and recover a sequence from a novel virus.

38. Palacios G, Quan PL, Jabado OJ et al.:

Detailed review of the role of electron microscopy in virus detection.

27. Doane FW: Immunoelectron microscopy in

biological weapons: the need for preventive action by the biomedical community. Nat. Genet. 29(3), 253–256 (2001). 18. van Aken J, Hammond E: Genetic

37. Wang D, Urisman A, Liu YT et al.: Viral

26. Roingeard P: Viral detection by electron

14. Quan PL, Palacios G, Jabado OJ et al.:

Detection of respiratory viruses and subtype identification of influenza A viruses by GreeneChipResp oligonucleotide microarray. J. Clin. Microbiol. 45(8), 2359–2364 (2007).

Microarray-based detection and genotyping of viral pathogens. Proc. Natl Acad. Sci. USA 99(24), 15687–15692 (2002).

25. Landry ML, Hsiung GD: Primary

13. Glaser CA, Honarmand S, Anderson LJ et al.:

Beyond viruses: clinical profiles and etiologies associated with encephalitis. Clin. Infect. Dis. 43(12), 1565–1577 (2006).

36. Wang D, Coscoy L, Zylberberg M et al.:

disease outbreaks using the internet. CMAJ 180(8), 829–831 (2009).

Comprehensive review of new infectious agents and their role in disease of unknown etiology.

44. Culley AI, Lang AS, Suttle CA: Metagenomic

ana­lysis of coastal RNA virus communities. Science 312(5781), 1795–1798 (2006). 45. Heeney JL: Zoonotic viral diseases and the

frontier of early diagnosis, control and prevention. J. Intern. Med. 260(5), 399–408 (2006). 46. Wolfe ND, Dunavan CP, Diamond J: Origins

of major human infectious diseases. Nature 447(7142), 279–283 (2007). n

Outlines the need for active surveillance against new zoonotic diseases.

47. Smith AW, Iversen PL, Skilling DE,

Stein DA, Bok K, Matson DO: Vesivirus viremia and seroprevalence in humans. J. Med. Virol. 78(5), 693–701 (2006). 48. Talbot SJ, Crawford DH: Viruses and

tumours – an update. Eur. J. Cancer 40(13), 1998–2005 (2004). 49. Fredricks DN, Relman DA: Infectious agents

and the etiology of chronic idiopathic diseases. Curr. Clin. Top. Infect. Dis. 18, 180–200 (1998). 50. Grigoriadis N, Hadjigeorgiou GM:

Virus-mediated autoimmunity in multiple sclerosis. J. Autoimmune Dis. 3, 1 (2006).

future science group

Metagenomics for the discovery of novel human viruses

51. Nejentsev S, Walker N, Riches D, Egholm M,

65. Kapikian AZ, Wyatt RG, Dolin R,

Todd JA: Rare variants of IFIH1, a gene implicated in antiviral responses, protect against Type 1 diabetes. Science 324(5925), 387–389 (2009).

Thornhill TS, Kalica AR, Chanock RM: Visualization by immune electron microscopy of a 27‑nm particle associated with acute infectious nonbacterial gastroenteritis. J. Virol. 10(5), 1075–1081 (1972).

52. Haverkos HW: Viruses, chemicals and

co-carcinogenesis. Oncogene 23(38), 6492–6499 (2004). 53. Virgin HW, Wherry EJ, Ahmed R:

Redefining chronic viral infection. Cell 138(1), 30–50 (2009). nn

Comprehensive review of the human virome and its role in human disease.

A previously unknown reovirus of bat origin is associated with an acute respiratory disease in humans. Proc. Natl Acad. Sci. USA 104(27), 11424–11429 (2007). 68. Jones MS, Lukashov VV, Ganac RD,

Schnurr DP: Discovery of a novel human picornavirus in a stool sample from a pediatric patient presenting with fever of unknown origin. J. Clin. Microbiol. 45(7), 2144–2150 (2007).

55. Morris KV, Rossi JJ: Lentiviral-mediated

delivery of siRNAs for antiviral therapy. Gene Ther. 13(6), 553–558 (2006). 56. Chiu CY, Alizadeh AA, Rouskin S et al.:

Diagnosis of a critical respiratory illness caused by human metapneumovirus by use of a pan-virus microarray. J. Clin. Microbiol. 45(7), 2340–2343 (2007). 57. Chiu CY, Rouskin S, Koshy A et al.:

Microarray detection of human parainfluenzavirus 4 infection associated with respiratory failure in an immunocompetent adult. Clin. Infect. Dis. 43(8), e71–e76 (2006).

69. Nichol ST, Spiropoulou CF, Morzunov S et al.:

Genetic identification of a hantavirus associated with an outbreak of acute respiratory illness. Science 262(5135), 914–917 (1993). 70. Chua KB, Goh KJ, Wong KT et al.: Fatal

encephalitis due to Nipah virus among pig-farmers in Malaysia. Lancet 354(9186), 1257–1259 (1999). 71. Dane DS, Cameron CH, Briggs M: Virus-like

particles in serum of patients with Australiaantigen-associated hepatitis. Lancet 1(7649), 695–698 (1970).

58. Urisman A, Molinaro RJ, Fischer N et al.:

Identification of a novel gretrovirus in prostate tumors of patients homozygous for R462Q RNASEL variant. PLoS Pathog. 2(3), e25 (2006). 59. Kim S, Kim N, Dong B et al.: Integration site

preference of xenotropic murine leukemia virus-related virus, a new human retrovirus associated with prostate cancer. J. Virol. 82(20), 9964–9977 (2008). 60. Fischer N, Hellwinkel O, Schulz C et al.:

Prevalence of human g-retrovirus XMRV in sporadic prostate cancer. J. Clin. Virol. 43(3), 277–283 (2008). 61. Ahmad F, Hudak PL, Bercovitz K,

Hollenberg E, Levinson W: Are physicians ready for patients with internet-based health information? J. Med. Internet Res. 8(3), e22 (2006). 62. Cann AJ: Virus Culture: A Practical Approach.

Oxford University Press, Oxford, UK (1999). 63. Sloots TP, Whiley DM, Lambert SB,

Nissen MD: Emerging respiratory agents: new viruses for old diseases? J. Clin. Virol. 42(3), 233–243 (2008). 64. Inglis TJ: Principia aetiologica: taking

causality beyond Koch’s postulates. J. Med. Microbiol. 56(Pt 11), 1419–1422 (2007).

future science group

RNA extraction from tissue and swab material. Methods Mol. Biol. 436, 13–18 (2008). 79. Loens K, Ursi D, Ieven M et al.: Detection of

Mycoplasma pneumoniae in spiked clinical samples by nucleic acid sequence-based amplification. J. Clin. Microbiol. 40(4), 1339–1345 (2002). 80. Zhang T, Breitbart M, Lee WH et al.: RNA

viral community in human feces: prevalence of plant pathogenic viruses. PLoS Biol. 4(1), e3 (2006).

67. Chua KB, Crameri G, Hyatt A et al.:

54. Woolhouse ME, Howey R, Gaunt E, Reilly L,

Chase-Topping M, Savill N: Temporal trends in the discovery of human viruses. Proc. Biol. Sci. 275(1647), 2111–2115 (2008).

78. Spackman E, Suarez DL: Avian influenza virus

66. Caul EO, Appleton H: The electron

microscopical and physical characteristics of small round human fecal viruses: an interim scheme for classification. J. Med. Virol. 9(4), 257–265 (1982).

72. Blumberg BS, Alter HJ, Visnich S: A ‘new’

antigen in leukemia sera. JAMA 191, 541–546 (1965). 73. Choo QL, Kuo G, Weiner AJ, Overby LR,

Bradley DW, Houghton M: Isolation of a cDNA clone derived from a blood-borne non-A, non-B viral hepatitis genome. Science 244(4902), 359–362 (1989). 74. Delwart EL: Viral metagenomics. Rev. Med.

Virol. 17(2), 115–131 (2007). nn

Comprehensive review of sequencing-based methods for virus discovery.

75. Petrich A, Mahony J, Chong S et al.:

Multicenter comparison of nucleic acid extraction methods for detection of severe acute respiratory syndrome coronavirus RNA in stool specimens. J. Clin. Microbiol. 44(8), 2681–2688 (2006). 76. Wiedbrauk DL, Werner JC, Drevon AM:

Inhibition of PCR by aqueous and vitreous fluids. J. Clin. Microbiol. 33(10), 2643–2646 (1995). 77. Xiang X, Qiu D, Hegele RD, Tan WC:

Comparison of different methods of total RNA extraction for viral detection in sputum. J. Virol. Methods 94(1–2), 129–135 (2001).

www.futuremedicine.com

Review

81. Thurber RV, Haynes M, Breitbart M,

Wegley L, Rohwer F: Laboratory procedures to generate viral metagenomes. Nat. Protoc. 4(4), 470–483 (2009). n

Description of contemporary methods to process samples for viral metagenomic studies.

82. Allander T, Emerson SU, Engle RE,

Purcell RH, Bukh J: A virus discovery method incorporating DNase treatment and its application to the identification of two bovine parvovirus species. Proc. Natl Acad. Sci. USA 98(20), 11609–11614 (2001). 83. Xu M, Chan Y, Fischer SH, Remaley AT:

Automated procedure for improving the RNA isolation step in viral load testing for human immunodeficiency virus. J. Clin. Microbiol. 42(1), 439–440 (2004). 84. Knepp JH, Geahr MA, Forman MS,

Valsamakis A: Comparison of automated and manual nucleic acid extraction methods for detection of enterovirus RNA. J. Clin. Microbiol. 41(8), 3532–3536 (2003). 85. Hu Y, Hirshfield I: Rapid approach to

identify an unrecognized viral agent. J. Virol. Methods 127(1), 80–86 (2005). 86. Kellam P: Molecular identification of novel

viruses. Trends Microbiol. 6(4), 160–165 (1998). 87. Poon LL, Chu DK, Chan KH et al.:

Identification of a novel coronavirus in bats. J. Virol. 79(4), 2001–2009 (2005). 88. Oberste MS, Maher K, Nix WA et al.:

Molecular identification of 13 new enterovirus types, EV79–88, EV97, and EV100–101, members of the species Human Enterovirus B. Virus Res. 128(1–2), 34–42 (2007). 89. Junttila N, Leveque N, Kabue JP et al.: New

enteroviruses, EV-93 and EV-94, associated with acute flaccid paralysis in the Democratic Republic of the Congo. J. Med. Virol. 79(4), 393–400 (2007). 90. Rose TM, Strand KB, Schultz ER et al.:

Identification of two homologs of the Kaposi’s sarcoma-associated herpesvirus (human herpesvirus 8) in retroperitoneal fibromatosis of different macaque species. J. Virol. 71(5), 4138–4144 (1997).

187

Review

Tang & Chiu

91. Drexler JF, Grywna K, Stocker A et al.:

103. Chang Y, Cesarman E, Pessin MS et al.:

Novel human parechovirus from Brazil. Emerg. Infect. Dis. 15(2), 310–313 (2009). 92. Lamson D, Renwick N, Kapoor V et al.:

MassTag polymerase-chain-reaction detection of respiratory pathogens, including a new rhinovirus genotype, that caused influenzalike illness in New York State during 2004–2005. J. Infect. Dis. 194(10), 1398–1402 (2006).

Tracking the evolution of the SARS coronavirus using high-throughput, high-density resequencing arrays. Genome Res. 14(3), 398–405 (2004). Using a resequencing microarray as a multiple respiratory pathogen detection assay. J. Clin. Microbiol. 45(2), 443–452 (2007).

96. Dominguez SR, Briese T, Palacios G et al.:

Multiplex MassTag-PCR for respiratory pathogens in pediatric nasopharyngeal washes negative by conventional diagnostic testing shows a high prevalence of viruses belonging to a newly recognized rhinovirus clade. J. Clin. Virol. 43(2), 219–222 (2008).

spectrum respiratory tract pathogen identification using resequencing DNA microarrays. Genome Res. 16(4), 527–535 (2006).

98. Stein J, Liang P: Differential display

Comprehensive viral oligonucleotide probe design using conserved protein regions. Nucleic Acids Res. 36(1), e3 (2008).

Comparison of differential display techniques.

Identification of cardioviruses related to Theiler’s murine encephalomyelitis virus in human infections. Proc. Natl Acad. Sci. USA 105(37), 14124–14129 (2008). 109. Kistler AL, Gancz A, Clubb S et al.: Recovery

of divergent avian bornaviruses from cases of proventricular dilatation disease: identification of a candidate etiologic agent. Virol. J. 5, 88 (2008). 110. Nygaard V, Hovig E: Options available for

profiling small samples: a review of sample amplification technology when combined with microarray profiling. Nucleic Acids Res. 34(3), 996–1014 (2006). 111. Breitbart M, Rohwer F: Method for

discovering novel DNA viruses in blood using viral particle selection and shotgun sequencing. Biotechniques 39(5), 729–736 (2005). 112. Kistler A, Avila PC, Rouskin S et al.:

Mushahwar IK: Amplification and subtraction methods and their application to the discovery of novel human viruses. J. Med. Virol. 53(1), 96–103 (1997).

Pan-viral screening of respiratory tract infections in adults with and without asthma reveals unexpected human coronavirus and human rhinovirus diversity. J. Infect. Dis. 196(6), 817–825 (2007).

100. Sagerstrom CG, Sun BI, Sive HL: Subtractive

113. Leski TA, Malanoski AP, Stenger DA, Lin B:

99. Muerhoff AS, Leary TP, Desai SM,

Target amplification for broad spectrum microbial diagnostics and detection. Future Microbiol. 5(2), 191–203 (2010).

cloning: past, present, and future. Annu. Rev. Biochem. 66, 751–783 (1997). 101. Ambrose HE, Clewley JP: Virus discovery by

sequence-independent genome amplification. Rev. Med. Virol. 16(6), 365–383 (2006).

114. Urisman A, Fischer KF, Chiu CY et al.:

E-Predict: a computational strategy for species identification based on observed DNA microarray hybridization patterns. Genome Biol. 6(9), R78 (2005).

102. Lu Y, Wang SY, Lotz JM: The use of

differential display to isolate viral genomic sequence for rapid development of PCR-based detection methods. A test case using Taura syndrome virus. J. Virol. Methods 121(1), 107–114 (2004).

188

Characterization of a novel coronavirus associated with severe acute respiratory syndrome. Science 300(5624), 1394–1399 (2003). 118. Mihindukulasuriya KA, Wu G, St Leger J,

Nordhausen RW, Wang D: Identification of a novel coronavirus from a beluga whale by using a panviral microarray. J. Virol. 82(10), 5084–5088 (2008). 119. Jones MS, Kapoor A, Lukashov VV,

Simmonds P, Hecht F, Delwart E: New DNA viruses identified in patients with acute viral infection syndrome. J. Virol. 79(13), 8230–8236 (2005).

108. Chiu CY, Greninger AL, Kanada K et al.:

technology: a general guide. Cell. Mol. Life Sci. 59(8), 1235–1240 (2002). n

117. Rota PA, Oberste MS, Monroe SS et al.:

107. Jabado OJ, Liu Y, Conlan S et al.:

97. Sampath R, Russell KL, Massire C et al.:

Global surveillance of emerging influenza virus genotypes by mass spectrometry. PLoS One 2(5), e489 (2007).

King DP, Britton P: DetectiV: visualization, normalization and significance testing for pathogen-detection microarray data. Genome Biol. 8(9), R190 (2007).

106. Lin B, Wang Z, Vora GJ et al.: Broad-

95. Mahony J, Chong S, Merante F et al.:

Development of a respiratory virus panel test for detection of twenty human respiratory viruses by use of multiplex PCR and a fluid microbead-based assay. J. Clin. Microbiol. 45(9), 2965–2970 (2007).

116. Watson M, Dukes J, Abu-Median AB,

105. Lin B, Blaney KM, Malanoski AP et al.:

94. Boyce R, Chilana P, Rose TM: iCODEHOP:

a new interactive program for designing consensus-degenerate hybrid oligonucleotide primers from multiply aligned protein sequences. Nucleic Acids Res. 37(Web Server issue), W222–W228 (2009).

Schlapbach R: PhyloDetect: a likelihoodbased strategy for detecting microorganisms with diagnostic microarrays. Bioinformatics 24(16), i83–i89 (2008).

104. Wong CW, Albert TJ, Vega VB et al.:

93. Jabado OJ, Palacios G, Kapoor V et al.:

Greene SCPrimer: a rapid comprehensive tool for designing degenerate primers from multiple sequence alignments. Nucleic Acids Res. 34(22), 6605–6611 (2006).

115. Rehrauer H, Schonmann S, Eberl L,

Identification of herpesvirus-like DNA sequences in AIDS-associated Kaposi’s sarcoma. Science 266(5192), 1865–1869 (1994).

n

Description of a computational algorithm for automated microarray hybridization pattern detection.

Future Microbiol. (2010) 5(2)

n

Demonstration of the use of sequencingindependent methods to efficiently identify a novel virus.

120. van den Hoogen BG, de Jong JC, Groen J

et al.: A newly discovered human pneumovirus isolated from young children with respiratory tract disease. Nat. Med. 7(6), 719–724 (2001). 121. Allander T, Tammi MT, Eriksson M,

Bjerkner A, Tiveljung-Lindell A, Andersson B: Cloning of a human parvovirus by molecular screening of respiratory tract samples. Proc. Natl Acad. Sci. USA 102(36), 12891–12896 (2005). 122. Allander T, Andreasson K, Gupta S et al.:

Identification of a third human polyomavirus. J. Virol. 81(8), 4130–4136 (2007). 123. Gaynor AM, Nissen MD, Whiley DM et al.:

Identification of a novel polyomavirus from patients with acute respiratory tract infections. PLoS Pathog. 3(5), e64 (2007). 124. Jones MS 2nd, Harrach B, Ganac RD et al.:

New adenovirus species found in a patient presenting with gastroenteritis. J. Virol. 81(11), 5978–5984 (2007). 125. Kapoor A, Slikas E, Simmonds P et al.:

A newly identified bocavirus species in human stool. J. Infect. Dis. 199(2), 196–200 (2009). 126. Kapoor A, Victoria J, Simmonds P et al.: A

highly prevalent and genetically diversified Picornaviridae genus in South Asian children. Proc. Natl Acad. Sci. USA 105(51), 20482–20487 (2008).

future science group

Metagenomics for the discovery of novel human viruses

127. Holtz LR, Finkbeiner SR, Zhao G et al.:

Klassevirus 1, a previously undescribed member of the family Picornaviridae, is globally widespread. Virol. J. 6, 86 (2009). 128. Li L, Victoria J, Kapoor A et al.: Genomic

characterization of novel human parechovirus type. Emerg. Infect. Dis. 15(2), 288–291 (2009). 129. Ansorge WJ: Next-generation DNA

sequencing techniques. N. Biotechnol. 25(4), 195–203 (2009). nn

Review of second- and third-generation DNA sequencing platforms.

130. Nakamura S, Yang CS, Sakon N et al.: Direct

metagenomic detection of viral pathogens in nasal and fecal specimens using an unbiased high-throughput sequencing approach. PLoS ONE 4(1), e4219 (2009). 131. Sorber K, Chiu C, Webster D et al.: The long

march: a sample preparation technique that enhances contig length and coverage by high-throughput short-read sequencing. PLoS ONE 3(10), e3495 (2008). 132. Feng H, Shuda M, Chang Y, Moore PS:

Clonal integration of a polyomavirus in human Merkel cell carcinoma. Science 319(5866), 1096–1100 (2008). 133. Pettersson E, Lundeberg J, Ahmadian A:

Generations of sequencing technologies. Genomics 93(2), 105–111 (2009). 134. Briese T, Paweska JT, McMullan LK et al.:

Genetic detection and characterization of Lujo virus, a new hemorrhagic feverassociated arenavirus from southern Africa. PLoS Pathog. 5(5), e1000455 (2009). 135. Greninger AL, Runckel C, Chiu CY et al.:

The complete genome of klassevirus – a novel picornavirus in pediatric stool. Virol. J. 6, 82 (2009).

future science group

136. Erlich Y, Chang K, Gordon A et al.:

147. Bergeron MG: Revolutionizing the practice of

DNA Sudoku – harnessing high-throughput sequencing for multiplexed specimen ana­lysis. Genome Res. 19(7), 1243–1253 (2009). 137. Craig DW, Pearson JV, Szelinger S et al.:

Identification of genetic variants using bar-coded multiplexed sequencing. Nat. Methods 5(10), 887–893 (2008).

medicine through rapid (< 1h) DNA-based diagnostics. Clin. Invest. Med. 31(5), E265–E271 (2008). 148. Harris TD, Buzby PR, Babcock H et al.:

Single-molecule DNA sequencing of a viral genome. Science 320(5872), 106–109 (2008). n

138. Zhang Z, Schaffer AA, Miller W et al.:

Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res. 26(17), 3986–3990 (1998).

Soh HT: Multitarget dielectrophoresis activated cell sorter. Anal. Chem. 80(22), 8656–8661 (2008). 150. Marziali A, Pel J, Bizzotto D, Whitehead LA:

Novel electrophoresis mechanism based on synchronous alternating drag perturbation. Electrophoresis 26(1), 82–90 (2005).

140. Rivers TM: Viruses and Koch’s postulates.

J. Bacteriol. 33(1), 1–12 (1937). 141. Salvetti M, Giovannoni G, Aloisi F:

151. Relman DA: The search for unrecognized

Epstein–Barr virus and multiple sclerosis. Curr. Opin. Neurol. 22(3), 201–206 (2009). 142. Gewurz BE, Marty FM, Baden LR, Katz JT:

Human herpesvirus 6 encephalitis. Curr. Infect. Dis. Rep. 10(4), 292–299 (2008).

pathogens. Science 284(5418), 1308–1310 (1999).

Affiliations n

143. Okamoto H: History of discoveries and

pathogenicity of TT viruses. Curr. Top. Microbiol. Immunol. 331, 1–20 (2009). 144. Dong B, Kim S, Hong S et al.: An infectious

retrovirus susceptible to an IFN antiviral pathway from human prostate tumors. Proc. Natl Acad. Sci. USA 104(5), 1655–1660 (2007). 145. Holland CA, Kiechle FL: Point-of-care

molecular diagnostic systems – past, present and future. Curr. Opin. Microbiol. 8(5), 504–509 (2005). 146. Jain KK: Nanotechnology in clinical

laboratory diagnostics. Clin. Chim. Acta 358(1–2), 37–54 (2005).

www.futuremedicine.com

Demonstration of single-molecule DNA sequencing of a virus.

149. Kim U, Qian J, Kenrick SA, Daugherty PS,

139. Falkow S: Molecular Koch’s postulates

applied to microbial pathogenicity. Rev. Infect. Dis. 10(Suppl. 2), S274–S276 (1988).

Review

n

Patrick Tang British Columbia Centre for Disease Control, Department of Pathology & Laboratory Medicine, University of British Columbia, 655 West 12th Avenue, Vancouver, BC, V5Z 4R4, Canada Tel.: +1 604 707 2616 Fax: +1 604 707 2675 [email protected] Charles Chiu Departments of Laboratory Medicine & Medicine (Infectious Diseases), University of California, San Francisco, 185 Berry Street, Box 0134, San Francisco, CA 94143-0134, USA Tel.: +1 415 514 8219 Fax: +1 415 514 8242 [email protected]

189