Journal of Fish Biology (2010) 77, 2093–2122 doi:10.1111/j.1095-8649.2010.02821.x, available online at wileyonlinelibrary.com

Comparing the performance of multiple mitochondrial genes in the analysis of Australian freshwater fishes

T. J. Page* and J. M. Hughes

Australian Rivers Institute, Griffith University, Nathan, Queensland 4111, Australia

(Received 23 April 2010, Accepted 25 September 2010)

In this study, four mitochondrial genes (cytochrome oxidase I, ATPase, cytochrome b and control region) were amplified from most of the fish species found in the fresh waters of south-eastern Queensland, Australia. The performance of these different gene regions was compared in terms of their ability to cluster fish families together in a neighbour-joining tree, both individually by gene and in all combinations. The relative divergence rates of each of these genes were also calculated. The three coding genes (cytochrome oxidase I, ATPase and cytochrome b) recovered a similar number of families and had broadly similar divergence rates. ATPase diverged a little more quickly than cytochrome oxidase I, and cytochrome b slightly more slowly than cytochrome oxidase I. All two-gene combinations recovered the same number of families. Results from the control region were much more variable, and, although generally possessing more diversity than the other regions, were sometimes less variable.

© 2010 The Authors. Journal of Fish Biology © 2010 The Fisheries Society of the British Isles

Key words: ATP; COI; control region; cytochrome b; DNA barcoding; Queensland.

*Author to whom correspondence should be addressed. Tel.: +61 7 37357418; email: [email protected]

INTRODUCTION

There is a long tradition of using genetic information in fisheries science and management dating back to the 1950s, not only in the investigation of stock structures (Kochzius, 2009) but even in the detection of cheating in fishing competitions (Primmer et al., 2000). The use of molecular data has continued unabated as it has become cheaper and highly automated, and these data are now widely used as a part of research in fish ecology, systematics and conservation (Hauser & Seeb, 2008). Although long employed in this role (Hamilton & Wheeler, 2008), a great deal of attention has recently been focused on using molecular data for the identification of fish species (Ward et al., 2009). While this may not seem a major issue for most adult specimens, the accurate identification of larvae, eggs, fillets, fin clips and unfamiliar exotic invasive species (Mather & Arthington, 1991) can indeed be challenging, even for an experienced ichthyologist (Kochzius, 2009; Teletchea, 2009). Yet, the ability to assign an individual to a species is vital for ecological research (Turner, 1999), biosecurity (Ferri et al., 2009) and conservation, given that environmental protection is rarely conferred upon taxonomic units other than species (Turner, 1999). Therefore, if a study was planned to document the fish biodiversity of a geographic area, what kind of molecular data should be used, given that genetic methods, like fishes themselves, are highly diverse (Hauser & Seeb, 2008)? Microsatellites have been widely used in the definition of stock structures, but usually need to be developed separately for each species and so are not widely transferable between all fish species in an area (Kochzius, 2009). On the other hand, mitochondrial gene sequencing has proven very popular as these sequences are easily obtained across many species (Galtier et al., 2009) and possess enough information to differentiate between species and divergent populations within a species (evolutionarily significant units, ESU) (Vrijenhoek, 1998). There are numerous mitochondrial (mt) genes and regions, so which should be chosen? Cytochrome b (cytb) has traditionally been the most commonly used mitochondrial gene in fish studies, particularly for phylogenetics and phylogeography (Meyer, 1994; Teletchea, 2009), and thus it (and the nuclear gene rhodopsin) is the target region for the large-scale European fish identification project FishTrace (www.fishtrace.org) (Sevilla et al., 2007). More recently, the 5′-end of the cytochrome c oxidase subunit 1 (COI or COX1) has come to the fore (Galtier et al., 2009), riding the wave of enthusiasm for ‘DNA barcoding’ (Hebert et al., 2003; Ward et al., 2009). The basic tenet of barcoding is the selection of a single gene fragment to identify all described animal species and to aid in the discovery of new species (Hebert et al., 2003).
There has been much debate, some of it rancorous, over the strengths and weaknesses of barcoding in particular and of mitochondrial genes in general (Rubinoff & Holland, 2005; Frezal & Leblois, 2008; Galtier et al., 2009). Despite this, many regional fish barcoding projects are progressing, producing large amounts of potentially useful COI data (Ward et al., 2005; Hubert et al., 2008; Ariagna et al., 2010), especially given that fishes are an integral part of the Barcode of Life Data Systems (BOLD; www.boldsystems.org; Ratnasingham & Hebert, 2007) in the Fish Barcode of Life Initiative (FISH-BOL; www.fish-bol.org; Ward et al., 2009). In addition to the above two mitochondrial genes, there are numerous others. The question remains, which one to use? Meyer (1994) suggested keeping an open mind in considering potential markers. The choice may well be influenced by the extent of existing data for the taxa of interest, as online databases (e.g. GenBank, BOLD) hold a great number of publicly available DNA sequences. There is no point in reinventing the wheel if someone has already sequenced the same species from the same area. Existing and new sequence data, however, can only be integrated if the same gene fragment has been used. The large number of fish phylogeography and phylogeny projects over the years has resulted in a great deal of potentially usable data (Hauser & Seeb, 2008). In the case of rare and endangered species, it may not be possible to obtain permission to resample and resequence many more individuals, and hence a researcher may be forced to use whatever gene was used in previous studies. For example, if the IUCN red listed Oxleyan pygmy perch Nannoperca oxleyana Whitley was of interest, the mitochondrial control region would need to be sequenced, as Knight et al. (2009) did so as to align with existing data (Hughes et al., 1999), whether or not the control region was judged to be the best fragment for this purpose.
Alternatively, if interest was in some Australian eleotrid species, gene choice may need to be tailored based on the species of interest, e.g. for Mogurnda spp., either ATPase (Hurwood & Hughes, 1998) or NADH4 (M. Adams, pers. comm.); for most Hypseleotris species, cytb (Thacker et al., 2007) but for the empire gudgeon Hypseleotris compressa (Krefft), ATPase (McGlashan & Hughes, 2001). This sort of confusion might argue for the use of a single mitochondrial fragment for fishes, as suggested by DNA barcoding (Ward et al., 2009). Nevertheless, the question remains as to whether it really matters, as all mitochondrial genes share the same history (Ballard & Whitlock, 2004). Would the various fragments do the same jobs equally well? Some fragments, such as the control region and ATPase, are thought to diverge more quickly than COI or cytb (Meyer, 1994). Might some genes be better for deeper relationships than others (Miya & Nishida, 2000)? Could the divergence between species from one gene be used to predict the possible divergence for another gene from a different species, thus making the comparison of published data sets from sympatric species more relevant? In an attempt to answer some of these questions, this study considered the genetic diversity of the freshwater fish fauna of south-eastern Queensland, Australia (Fig. 1) by comparing the performance of a number of different mitochondrial genes in terms of phylogenetic information content and relative divergence. A better understanding of the differing divergence rates of various markers will mean that studies on different species and genes can more easily be integrated for a total evidence approach. There have been fewer barcoding studies on freshwater fishes than on marine species (Hubert et al., 2008; Ward et al., 2009), and yet freshwater fishes show a higher level of differentiation between populations because of the constraints of their landscape (Ward et al., 1994).
This isolation can eventually lead to speciation, and thus the potential for unappreciated levels of biodiversity in the form of cryptic species is high for freshwater fishes (Ward et al., 2009), as has been shown to be the case in Australia (Page et al., 2004; Hammer et al., 2007). Freshwater fishes are currently under threat from large-scale dam projects, invasive species, pollution and human population growth (Vrijenhoek, 1998; Leveque et al., 2008), all of which are very evident in south-eastern Queensland (Page et al., 2004; Olden et al., 2008), and thus a thorough knowledge of the region’s existing biodiversity in the form of a genetic database is important (Ferri et al., 2009; Ward et al., 2009).

MATERIALS AND METHODS

ONLINE DATABASE SEARCHES

Online databases were searched to aid in the selection of appropriate mitochondrial gene regions to be targeted. ISI Web of Knowledge Current Contents Connect (www.isiwebofknowledge.com) was searched for Journal of Fish Biology papers from 2000 to 2010 inclusive on 19 March 2010 with the topic search terms ‘fish’ and ‘phylogeograph*’ or ‘fish’ and ‘phylogen*’. Only papers including mitochondrial sequence data were retained [e.g. no nuclear-only data and restriction fragment length polymorphism (RFLP)]. The European Molecular Biology Laboratory (EMBL) online sequence database was searched on 24 March 2010 using the SRS query facility (http://srs.ebi.ac.uk) by selecting all nucleotide sequence databases and sub-sections (which includes GenBank and the DNA Data Bank of Japan) for ray-finned fishes (search term ‘Actinopterygii’) and with the following four separate groups of search terms (| = or): (1) ‘COI | CO1 | COX1 | cytochrome oxidase 1 | cytochrome oxidase I | cytochrome c oxidase subunit I | cytochrome c oxidase subunit 1’, (2) ‘CytB | cytochrome B’, (3) ‘ATP 6 | ATP6 | ATPase 6 | ATPase subunit 6 | ATP synthase 6’ (and the same searches with ‘8’ in place of ‘6’) and (4) ‘D-loop | control region’. The Barcode of Life Data Systems (BOLD) online database (Ratnasingham & Hebert, 2007) was searched on the same day for all COI records for Actinopterygii.

Fig. 1. Map of south-east Queensland, Australia, showing boundaries of river basins. [Map labels: Tin Can Bay, Mary, Noosa, Maroochy, Mooloolah, Glasshouse Mountains, Bribie Island, Caboolture, Brisbane, Moreton Island, Pine, Stradbroke Island, Logan-Albert, Gold Coast; latitude ticks 25° to 28°, longitude ticks 152° and 153°; scale bar 0 to 50 km.]
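The four groups of search terms described above can be written down as plain data, which makes it easier to see exactly which synonyms were combined. The following is only a sketch: the names `SEARCH_GROUPS`, `ATP6_TERMS` and `build_query` are hypothetical, the ‘8’ variants are merged into the same list purely for brevity, and nothing here contacts the SRS query facility.

```python
# Hypothetical encoding of the four search-term groups listed in the text.
# It only assembles the query strings; no database is queried.
ATP6_TERMS = ["ATP 6", "ATP6", "ATPase 6", "ATPase subunit 6", "ATP synthase 6"]

SEARCH_GROUPS = {
    "COI": ["COI", "CO1", "COX1", "cytochrome oxidase 1", "cytochrome oxidase I",
            "cytochrome c oxidase subunit I", "cytochrome c oxidase subunit 1"],
    "cytb": ["CytB", "cytochrome B"],
    # The ATP searches were repeated with '8' in place of '6' (assumed merged here).
    "ATP": ATP6_TERMS + [term.replace("6", "8") for term in ATP6_TERMS],
    "CR": ["D-loop", "control region"],
}


def build_query(terms):
    """Join alternative terms with the '|' (or) operator used in the text."""
    return " | ".join(terms)


if __name__ == "__main__":
    for region, terms in SEARCH_GROUPS.items():
        print(region, "->", build_query(terms))
```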

SAMPLING STRATEGY

The aim was to include sequences from as many species of fishes as possible, likely to be encountered in the fresh waters of south-eastern Queensland, Australia (Fig. 1), either from specimens sampled for this project or from the many fish mitochondrial genomes now freely available (e.g. http://mitofish.ori.u-tokyo.ac.jp; Table I). This would allow direct comparisons between gene regions from the same individual specimens. Included were 22 of the 26 freshwater species native to south-eastern Queensland (Pusey et al., 2004), nine amphidromous or catadromous species native to the area that are often sampled in fresh waters (Pusey et al., 2004; FishBase: www.fishbase.org), one Australian species translocated into the area and eight non-Australian exotic species introduced into the area (Pusey et al., 2004). Fishes were captured with a seine, dip-net or baited box trap, and identified in the field for fin-clipped individuals and in the laboratory for whole fishes, and were preserved when necessary in 95% ethanol or liquid nitrogen. Multiple representatives of the Australian smelt Retropinna semoni (Weber) (Hammer et al., 2007) and ornate rainbowfish Rhadinocentrus ornatus Regan (Page et al., 2004) were included, as these are likely to harbour cryptic species. Species not found

Ceratodontidae Neoceratodus forsteri *

Hypoatherina tsurugae Centropomidae Lates calcarifer*

Craterocephalus stercusmuscarum* Craterocephalus stramineus†

Atherinidae Craterocephalus marjorae*

Anguilla reinhardtii *

Anguilla australis*

Anguillidae Anguilla anguilla

Taxon

GenBank – Queensland, Australia

GenBank – Singapore Aquaculture

Upper Canungra Creek, Gold Coast, Queensland, Australia Caboolture Creek, Queensland, Australia Nicholson River at Adel’s Grove, Queensland, Australia GenBank – north-west Pacific

GenBank – Burnett River, Queensland, Australia Brisbane River at Fernvale, Queensland, Australia

GenBank, France

Collection site

HM006997

HM006955

AF302933

DQ010541

AF302933

DQ010541

AP004420

HM006996

HM006954

AP004420

HM007034

AP007234

AP007233

cytb

AP007234

AP007233

CR Minegishi et al. (2005) Minegishi et al. (2005) This study

Reference

AF302933

DQ010541

AP004420

HM007037

HM007036

AF302933

DQ010541

AP004420

Brinkmann et al. (2004)

Lin et al. (2006)

Miya et al. (2003)

HM006921 This study

HM006920 This study

HM0069952 HM0070356 HM006919 This study

HM006994

HM0069521

HM006953

AP007234

AP007233

ATP

AP007234

AP007233

COI

Table I. Fish specimens, locations and sequence information

N/A

N/A

N/A

GU-KR140

GU-KR063

GU-AW452

GU-AW325

N/A

N/A

Specimen number


Paratilapia polleni

Oreochromis ‘sp. TP’‡C

Oreochromis mossambicus‡

Hypselecara temporalis Oreochromis ‘sp. KM’B

Cichlidae Geophagus sp.‡

Ambassis ‘sp. North-west’†A

Ambassis marianus*

Chandidae Ambassis agassizi *

Taxon

Lake Samsonvale (Pine), Queensland, Australia. (introduction) Brisbane River, Queensland, Australia. (introduction) GenBank – origin Madagascar

Blackrock Creek (Brisbane River), Queensland, Australia (introduction) GenBank – origin South America GenBank – origin Africa

Keyhole Lagoon, Stradbroke Island, Queensland, Australia Albert River, Queensland, Australia Cooper Creek at Merken Waterhole, Queensland, Australia

Collection site

HM006993

HM006951

AP009508

HM006976

AP009508

HM007017

AP009126

AP009126

HM006977

AP009506

AP009506

HM006999

HM006992

HM0069501

HM006957

HM0069912

ATP

HM0069491

COI

Table I. Continued

AP009508

HM006939

HM0070586

AP009508

HM0069408

AP009126

AP009506

HM006923

HM007059

AP009126

AP009506

HM007039

Azuma et al. (2008)

This study

Azuma et al. (2008) Mabuchi et al. (2007) This study

This study

This study

HM0070336

This study

Reference

This study HM006918

HM006917

CR

HM007032

HM007031

cytb

N/A

GU-AW421

GU-F.21

N/A

N/A

GU-AW303

GU-2269

GU-2330

GU-2252

Specimen number


GenBank – north-east Asia

GenBank – origin Eurasia GenBank – Eurasia GenBank – Asia

Carassius cuvieri

Cyprinus carpio‡ Xenocypris argentea Xenocypris davidi

GenBank – origin Asia

Barambah Creek (Burnett River), Queensland, Australia

Nematalosa erebi *

Cyprinidae Carassius auratus‡

GenBank – western Pacific

GenBank – north-eastern North America GenBank – North and Central America GenBank – West Africa

Collection site

Nematalosa japonica

Ethmalosa fimbriata

Clupeidae Dorosoma cepedianum Dorosoma petenense

Taxon

X61010 AP009059 NC013072

AB045144

AB111951

HM006973

AP009142

X61010 AP009059 NC013072

AB045144

AB111951

HM007014

AP009142

AP009138

AP009136

AP009136 AP009138

DQ536426

ATP

DQ536426

COI

Table I. Continued

X61010 AP009059 NC013072

AB045144

AB111951

HM007055

AP009142

AP009138

AP009136

DQ536426

cytb

X61010 AP009059 NC013072

AB045144

AB111951

AP009142

AP009138

AP009136

CR

M. Murakami, Y. Takase & H. Fujtani (unpubl. data) M. Murakami (unpubl. data) Chang et al. (1994) Saitoh et al. (2006) S. Liu, C. You & Y. Chen (unpubl. data)

Broughton & Reneau (2006) Lavoue et al. (2007) Lavoue et al. (2007) Lavoue et al. (2007) This study

Reference

N/A N/A N/A

N/A

N/A

GU-AW304

N/A

N/A

N/A

N/A

Specimen number


Mogurnda mogurnda†

Hypseleotris klunzingeri * Hypseleotris ‘sp. Midgley’s’*D Mogurnda adspersa*

Hypseleotris galii *

Hypseleotris compressa*

Gobiomorphus australis*

Eleotridae Eleotris acanthopoma† Gobiomorphus coxii *

Taxon

HM0070004 HM0070025

HM007003

HM0069501 HM0069601 HM0069611

HM007010

HM0070095

HM0069681

HM006969

HM007045

HM0070045

HM0069631

HM007051

HM007050

HM007044

HM0069621

HM007043

HM007042

HM007040

HM007041

HM0070013

HM006959

Allyn River (Hunter River), New South Wales, Australia Blue Lake Creek, Stradbroke Island, Queensland, Australia Alligator Creek, Fraser Island, Queensland, Australia Blue Lake Creek, Stradbroke Island, Queensland, Australia Tinana Creek, Mary, Queensland, Australia Manilla River, Namoi, New South Wales, Australia 18 Mile Swamp, Stradbroke Island, Queensland, Australia Copperfield Creek (Daly River), North Territory Australia

cytb AP004455

ATP AP004455

AP004455

COI

GenBank – western Pacific

Collection site

Table I. Continued

HM006934

HM006933

HM006929

HM006928

HM006927

HM006926

HM006924

HM006925

AP004455

CR

This study

This study

This study

This study

This study

This study

This study

Miya et al. (2003) This study

Reference

GU-Mog94

GU-2095

GU-2110B

GU-2068

GU-2033

GU-2050

GU-2209

GU-F.01

N/A

Specimen number


Galaxiella nigrostriata†

Galaxiidae Galaxias maculatus*

Philypnodon macrostomus*

Philypnodon grandiceps*

Taxon

GenBank – Australasia, South America GenBank – south-western Australia

Coomera River, Gold Coast, Queensland, Australia Stoney Creek (Brisbane River), Queensland, Australia

Collection site

AP006853

AP004104 AP006853

AP004104 AP006853

AP004104

HM007061

HM0070193

HM006979

HM007060

HM0070182

HM0069781

cytb

ATP

COI

Table I. Continued

AP006853

AP004104

HM006941

CR

Ishiguro et al. (2003) M. Miya, T. P. Satoh, Y. Yamanoue, K. Mabuchi, S. M. Shirai, N. Yagashita, K. Nakayama, H. Takeshima, N. J. Suzuki, J. G. Inoue, N. B. Ishiguro, Y. Azuma, A. Kawaguchi, T. Mukai, H. Sakurai, H. Endo & M. Nishida (unpubl. data)

This study

This study

Reference

N/A

N/A

GU-KR144

GU-2427

Specimen number

Mugil cephalus*

Mugilidae Mugil cephalus*

R. ornatus ‘SER’*E

Rhadinocentrus ornatus ‘CEQ’*E R. ornatus ‘SEQ’*E

Megalopidae Megalops atlanticus Megalops cyprinoides* Melanotaeniidae Melanotaenia duboulayi * Melanotaenia lacustris Melanotaenia fluviatilis† Melanotaenia splendida†

Taxon

This study This study

HM0070265 HM007068

HM0069871 HM0070272 HM007069

North Pine River, Queensland, Australia GenBank – Indo-Pacific

HM006932 This study

HM0070084 HM007049

HM006967

NC003170

NC003170

NC003170

HM007052

HM0070285 HM0070706

HM0069701 HM007011

HM006988

HM006986

HM006931 This study

HM007048

HM007007

HM006966

AP004419

AP004419

NC003170

AP004419

GU-AW418

GU-2194

GU-NR3

GU-2285

GU-KR026

GU-CSIRO2

N/A

GU-R.087

N/A N/A

Specimen number

Miya et al. (2001)§ N/A

This study

This study

Miya et al. (2003)

HM006930 This study

AP004419

Inoue et al. (2004) Inoue et al. (2004)

Reference

HM0070062 HM007047

AB051110

CR

HM006965

AP004808 AB051110

cytb

Coondoo Creek, Mary, Queensland, Australia GenBank – Papua New Guinea Mildura Weir Pool, Murray, Victoria, Australia Catfish Creek (Calliope River), Queensland, Australia Rocky Creek, Fraser Island, Queensland, Australia Little Canalpin Creek, Stradbroke Island, Queensland, Australia Searys Creek, Tin Can Bay, Queensland, Australia

AP004808 AB051110

ATP

AP004808 AB051110

COI

GenBank – Atlantic GenBank – Indo-Pacific

Collection site

Table I. Continued


Tandanus tandanus*

Porochilus rendalhi *

Plotosidae Neosilurus hyrtlii *

Percichthyidae Nannoperca australis† Nannoperca oxleyana*

Scleropages leichardti

Osteoglossum bicirrhosum Scleropages formosus

Osteoglossidae Arapaima gigas

Taxon

Obi Obi Creek, Mary, Queensland, Australia Keyhole Lagoon, Stradbroke Island, Queensland, Australia Blue Lake Creek, Stradbroke Island, Queensland, Australia

Goulburn River, Victoria, Australia 18 Mile Swamp, Stradbroke Island, Queensland, Australia

GenBank – origin South America GenBank – Singapore fish farm Lake Atkinson (Brisbane River), Queensland, Australia. (introduction)

GenBank – Brazil

Collection site

DQ023143

AB043025

EF523611

CR

HM007072 HM006948

HM0070204 HM0070303

HM0069801

HM006990

HM007062 HM006942

HM007015

HM007056 HM006937

HM007054 HM006936

HM007053 HM006935

HM007071

DQ023143

AB043025

EF523611

cytb

HM006974

HM007013

HM006972

HM0070292

HM0069891

HM007012

DQ023143

DQ023143

HM006971

AB043025

EF523611

ATP

AB043025

EF523611

COI

Table I. Continued Specimen number

This study

This study

This study

This study

This study

This study

Yue et al. (2006)

GU-2114

GU-2120

GU-AW562

GU-2145

GU-PP51

GU-AW401

N/A

Hrbek & Farias N/A (2008) Inoue et al. (2001) N/A

Reference


Retropinna semoni ‘SEQ’*F

Retropinna semoni ‘SEC’†F Retropinna semoni ‘SEQ’*F

Pseudomugil signifer* Retropinnidae Retropinna retropinna

Xiphophorus maculatus‡ Pseudomugilidae Pseudomugil mellis*

Xiphophorus hellerii ‡

Gambusia holbrooki ‡

Poeciliidae Gambusia affinis

Taxon

Macleay River, New South Wales, Australia Stoney Creek (Brisbane River), Queensland, Australia Barambah Creek (Burnett River), Queensland, Australia

GenBank – New Zealand

Noosa River, Queensland, Australia Tweed River, New South Wales, Australia

GenBank – origin North and Central America Apple Creek (Burrum River), Queensland, Australia (introduction) GenBank – origin North and Central America GenBank – origin North and Central America

Collection site AP004422

cytb

Setiamarga et al. (2008)

Bai et al. (2009)

AP004108

HM007025

HM007024

HM006984

GU-AW281

HM0070667 HM006946 This study

GU-KR145

N/A

GU-KR001

Ishiguro et al. (2003) HM006945 This study

AP004108

GU-2111

GU-2340

N/A

N/A

GU-2312

N/A

Specimen number

HM0070677 HM006947 This study

HM0070233 HM0070657

AP004108

HM006985

HM006983

AP004108

HM006944 This study

AP005982

NC013089

HM0069821 HM0070225 HM007064

AP005982

NC013089

Miya et al. (2003)

Reference

HM006922 This study

AP004422

CR

HM006943 This study

AP005982

NC013089

HM0069984 HM007038

AP004422

ATP

HM0069811 HM0070214 HM007063

AP005982

NC013089

HM006956

AP004422

COI

Table I. Continued


Barambah Creek (Burnett River), Queensland, Australia GenBank – western Pacific

North Maroochy River, Queensland, Australia

Collection site

AP011064

HM006964

HM006975

COI

AP011064

HM007005

HM007016

ATP

AP011064

HM007046

HM007057

cytb

AP011064

HM0069388

CR

Yagishita et al. (2009)

This study

This study

Reference

N/A

GU-AW302

GU-AW263

Specimen number

A, after Allen et al. (2002); B, after Mabuchi et al. (2007); C, informal name. Possible hybrid, see Mather & Arthington (1991); D, after Allen et al. (2002); E, possible cryptic species, see Page et al. (2004); F, possible cryptic species, see Hammer et al. (2007); SEQ, south-east Queensland. Cytochrome c oxidase subunit 1 (COI) primer combinations for this study’s sequences: 1, FishF1–FishR2; otherwise FishF2–FishR1. ATPase (ATP) primer combinations for this study’s sequences: 2, ATP82L8331–COIII2H9236; 3, Lys.31F–CO3.62R; 4, Lys.22F–CO3.62R; 5, ATP82L8331– HCH; otherwise Lys.22F–CO3.23R. Cytochrome b (cytb) primer combinations for this study’s sequences: 6, HYPSLA–RF.Thr48; 7, HYPSLA–Ret.Thr31; otherwise HYPSLA–HYPSHD. Control region (CR) primer combinations for this study’s sequences: 8, Pro-L–CRMT16498H; otherwise CRL19–CRMT16498H. *Species native to south-east Queensland. †Native to Australia but not in south-east Queensland. ‡Exotic species introduced to south-east Queensland. §Specimen originally identified as Crenimugil crenilabis and revised to Mugil cephalus in Setiamarga et al. (2008). Native to Australia and translocated to south-east Queensland; otherwise exotic and not in south-east Queensland (Pusey et al., 2004).

Rhynchopelates oxyrhynchus

Terapontidae Leiopotherapon unicolor*

Scorpaenidae Notesthes robusta*

Taxon

Table I. Continued

in south-eastern Queensland, but hailing from the same genera or families as those that are, were also included for the sequence divergence analyses, making a total of 74 specimens with sequences (42 from this project and 32 downloaded mitochondrial genomes).

MTDNA TECHNIQUES

Genomic DNA was extracted using a modified version of a standard cetyltrimethylammonium bromide (CTAB)-phenol–chloroform extraction (Doyle & Doyle, 1987). On the basis of the online searches, four mitochondrial gene regions were selected: (1) cytochrome c oxidase subunit 1 (COI), because of its adoption as the DNA barcode fragment (Ward et al., 2009) and the taxonomic breadth of sequences available, (2) cytb because of the large number of fish sequences available and fish papers that have used it, (3) ATPase subunit 6 (ATP), because, although not as widely used worldwide, it has been extensively used for Australian freshwater fish studies (McGlashan & Hughes, 2001; Wong et al., 2004) and (4) control region (CR; also known as D-loop), for similar reasons as cytb. Only sequence fragments of the three coding regions >500 base pairs (bp) were targeted (Ratnasingham & Hebert, 2007). All primer sequences are presented in Table II and the primer combinations which were used for each species are given in Table I. Polymerase chain reaction (PCR) cycling conditions were 3 min at 94° C, followed by 40 cycles of 30 s at 94° C, 30 s at 48° C (50° C for CR primers and HYPSLA–HYPSHD; 52° C for COI combinations), 45 s at 72° C (60 s for COI; 90 s for all cytb and ATP combinations except HYPSLA–HYPSHD and any combination using ATP82L8331) and then a final extension of 7 min at 72° C. PCR products were purified using exonuclease I and shrimp alkaline phosphatase. Sequencing reactions were done with BigDye v.3.1 Terminator mix (Applied Biosystems Inc.; www.appliedbiosystems.com) and the relevant forward primer and were cleaned up with ethanol precipitation as per the manufacturer’s instructions. Sequences were produced on an Applied Biosystems 3130xl Genetic Analyser at the DNA Sequencing Facility at Griffith University. Sequences were edited and aligned using Sequencher 4.1.2 (Gene Codes Corp.; www.genecodes.com).
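Because the cycling conditions differ by region and primer pair, they can be easier to follow as a small lookup. The sketch below encodes one reading of the conditions given above, in which the exceptions to the 90 s extension (HYPSLA–HYPSHD and any combination using ATP82L8331) fall back to the 45 s default. The function names `pcr_profile` and `hold_time_s` are hypothetical, and the time estimate ignores ramping between temperatures.

```python
def pcr_profile(region, forward, reverse):
    """Return annealing temperature and extension time for one primer pair.

    Assumed reading of the text: 48 C / 45 s defaults, with the stated
    exceptions applied by region and primer name.
    """
    pair = {forward, reverse}
    hyps_pair = {"HYPSLA", "HYPSHD"}

    anneal_c = 48
    if region == "CR" or pair == hyps_pair:
        anneal_c = 50
    elif region == "COI":
        anneal_c = 52

    extend_s = 45
    if region == "COI":
        extend_s = 60
    elif region in ("cytb", "ATP") and pair != hyps_pair and "ATP82L8331" not in pair:
        extend_s = 90

    return {
        "initial_denature_s": 3 * 60,  # 3 min at 94 C
        "cycles": 40,
        "denature_s": 30,              # 30 s at 94 C
        "anneal_s": 30,
        "anneal_c": anneal_c,
        "extend_s": extend_s,          # at 72 C
        "final_extend_s": 7 * 60,      # 7 min at 72 C
    }


def hold_time_s(profile):
    """Total programmed hold time in seconds (temperature ramps not included)."""
    per_cycle = profile["denature_s"] + profile["anneal_s"] + profile["extend_s"]
    return (profile["initial_denature_s"]
            + profile["cycles"] * per_cycle
            + profile["final_extend_s"])


if __name__ == "__main__":
    coi = pcr_profile("COI", "FishF2", "FishR1")
    print(coi["anneal_c"], coi["extend_s"], hold_time_s(coi) / 60, "min")
```

Under these assumptions a COI reaction has 90 min of programmed hold time, and the long-extension ATP and cytb combinations have 110 min.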

TREE ANALYSES

Because of the very different natures of the protein-coding fragments (COI, cytb and ATP), which are easily amplified and aligned, and the non-coding CR, the three coding fragments (COI, cytb and ATP) were analysed together, and CR was used separately, only for the divergence analyses. Using the general methods of DNA barcoding (Ward, 2009), neighbour-joining trees were assembled in MEGA version 4 (Tamura et al., 2007) using the Kimura two parameter model (K2P) and bootstrapped 1000 times for all specimens that produced sequences for all three protein-coding genes. Bootstraps are only displayed at the family level and below because this simple tree-building methodology is not phylogenetic but rather a phenetic clustering technique (Hamilton & Wheeler, 2008), and thus sequence saturation at deep systematic levels is highly probable (Hajibabaei et al., 2007). Although simplistic, a neighbour-joining and bootstrapping combination has proven effective (Munch et al., 2008; Ross et al., 2008), at least for species-level assignments. DNA barcoding analyses on COI of Australian fishes have shown that species cluster ‘invariably’ within a genus and ‘generally’ within a family (Ward et al., 2005).

DIVERGENCE ANALYSES

Sequence divergences were calculated using K2P distances in MEGA4 as used by Ward (2009) and BOLD. A Mantel’s test was performed in Primer version 5.2.8 (Primer-E Ltd; www.primer-e.com) to test the correlation of each of the three coding gene distance matrices with each other (1000 permutations of the Spearman rank correlation method in the Relate option). For the CR data set, sequences could only be aligned within a genus and so Mantel’s tests were performed within a genus between CR distances and relevant distances from the three coding genes. Predictive analytics software (PASW) Statistics 18 (SPSS Inc.; www.spss.com) was used to generate descriptive statistics (minimum, maximum and s.e.)
between all species within © 2010 The Authors Journal of Fish Biology © 2010 The Fisheries Society of the British Isles, Journal of Fish Biology 2010, 77, 2093–2122

ATP82L8331 CO3.23R CO3.62R COIII2H9236 HCH Lys.22F Lys.31F FishF1 FishF2 FishR1 FishR2 CRL19 CRMT16498H Pro-L HYPSHD HYPSLA Ret.Thr31 RF.Thr48

ATP

cytb

CR

COI

Primer name

Region Forward Reverse Reverse Reverse Reverse Forward Forward Forward Forward Reverse Reverse Forward Reverse Forward Reverse Forward Reverse Reverse

Direction AAAGCRTYRGCCTTTTAAGC GGCTTGGGTCAACTATGTGGT TTATTAGAAGGGCGGCAACTG GTTAGTGGTCAKGGGCTTGGRTC TACTATGTGAAATGCGTGTG AAAGCGTTAGCCTTTTAAGC GCCTTTTAAGCTAAAGATTGG TCAACCAACCACAAAGACATTGGCAC TCGACTAATCATAAAGATATCGGCAC TAGACTTCTGGGTGGCCAAAGAATCA ACTTCAGGGTGACCGAAGAATCAGAA ACCACTAGCACCCAAAGCTA CCTGAAGTAGGAACCAGATG CTACCTCCAACTCCCAAAGC GGGTTGTTGGAGCCAGTTTCGT GTGGCTTGAAAAACCACCGTT CTCCAACCTCCGACTTACAAG GCAGTAGGAGGGAATTTAACCTTCG

Primer sequence

Primer reference S. McCafferty (unpubl. data) P. Unmack (unpubl. data) P. Unmack (unpubl. data) S. McCafferty (unpubl. data) McGlashan & Hughes (2001) P. Unmack (unpubl. data) P. Unmack (unpubl. data) Ward et al. (2005) Ward et al. (2005) Ward et al. (2005) Ward et al. (2005) Bernatchez & Danzmann (1993) Meyer et al. (1990) Palumbi et al. (1991) Thacker et al. (2007) Thacker et al. (2007) P. Unmack (unpubl. data) Unmack & Dowling (2010)

Table II. Mitochondrial primer sequences and sources (see Table I for species details)

C O M PA R I N G F R E S H WAT E R F I S H M I T O C H O N D R I A L G E N E S

2107

© 2010 The Authors Journal of Fish Biology © 2010 The Fisheries Society of the British Isles, Journal of Fish Biology 2010, 77, 2093–2122

2108

T. J . PA G E A N D J . M . H U G H E S

families (including all genera), between different genera within families and within genera for each gene region (within genera only for CR). Linear regressions were performed in PASW to assess the differing relative levels of divergence of the three coding genes between species within families. To avoid non-independence of comparisons in the linear regressions, each specimen was only used in a single pair-wise comparison within a family and was chosen randomly without replacement using the PopTools version 3.1 [Commonwealth Scientific and Industrial Research Organisation (CSIRO)] corrected random function in Excel 3.1 (Microsoft Corporation; www.microsoft.com).
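As an illustration only, the two computational steps described above (K2P distances, and random pairing so that each specimen enters a single pair-wise comparison) can be sketched in Python. This is not the MEGA4 or PopTools implementation used in the study: the function names `k2p_distance` and `random_disjoint_pairs` are hypothetical, the sequences are invented toy data, sites with gaps or ambiguity codes are assumed to be skipped pair-wise, and the toy family is assumed to hold one specimen per species.

```python
import math
import random

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: gaps and ambiguity codes are skipped
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites    # proportion of transitional differences
    q = transversions / sites  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def random_disjoint_pairs(specimens, rng):
    """Pair specimens at random so that each is used at most once."""
    shuffled = list(specimens)
    rng.shuffle(shuffled)
    # Consecutive elements form a pair; with an odd count the last is unused.
    return [(shuffled[i], shuffled[i + 1])
            for i in range(0, len(shuffled) - 1, 2)]


# Toy example: invented 20 bp fragments for four hypothetical species
# of one family (not data from this study).
toy_family = {
    "species_A": "ATGGCACTATTCCTAGTCGG",
    "species_B": "ATGGCGCTATTCCTAGTTGG",
    "species_C": "ATGGCACTCTTCCTGGTCGG",
    "species_D": "ATAGCACTATTTCTAGTCGC",
}
rng = random.Random(1)  # fixed seed so the random pairing is repeatable
for left, right in random_disjoint_pairs(sorted(toy_family), rng):
    d = k2p_distance(toy_family[left], toy_family[right])
    print(left, right, round(d, 4))
```

Running the pairing once per family and per gene on the same pairs would give one independent K2P value per specimen pair, which is the kind of input the linear regressions require.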

RESULTS

ONLINE DATABASE SEARCHES

By far, the two most commonly used mitochondrial gene regions for Journal of Fish Biology phylogeography or phylogeny papers published from 2000 to 2010 (through to volume 76 issue 2) have been cytb (49 papers) and CR (42), followed distantly by ATP (seven), with COI in seventh place with four papers (see Fig. 2 for all results). This does not include papers specifically written as DNA barcoding-only papers that do not include phylogeography or phylogeny in their topics. Another analysis of fish papers also found cytb the most commonly used and COI seventh (Teletchea, 2009). The EMBL search found that cytb and COI had similar numbers of sequences publicly available for ray-finned fishes (57 513 and 57 155, respectively), while there were 27 082 CR and 5463 ATP sequences. This is only a raw sequence count and does not deal with the relative taxonomic coverage of these sequences nor will they all be overlapping fragments (as the barcode sequences will be).

[Fig. 2 graphic not reproduced: bar chart of the number of phylogeography or phylogeny papers in JFB 2000–2010 (y-axis) for each mitochondrial gene region used (x-axis, in order: Cytb, CR, ATP6–8, ND1, 16S, 12S, COI, Genome, ND2, ND5–6, ND3, ND4–4L, COII).]

Fig. 2. Mitochondrial gene regions used in fish phylogeography or phylogeny papers published between 2000 and 2010 in the Journal of Fish Biology.


The BOLD search found 65 502 COI sequences (14 552 publicly available) from 10 802 species.

SEQUENCE PRODUCTION

All 42 of the specimens from this study were sequenced for COI and cytb and all but two for ATP (see Table I for details and GenBank accession numbers). CR proved more challenging with 32 of 42 specimens sequenced. All protein-coding sequences were translated to amino acids in MEGA4 (vertebrate mitochondrial genetic code) with no stop codons or indels present. COI sequences (602 bp; codon start position 2) correspond to positions 5568 to 6169 of the Lake Kutubu rainbowfish Melanotaenia lacustris Munro mtDNA genome (accession number AP004419; Miya et al., 2003) and positions 102 to 703 of the DNA barcode for fishes (Ward & Holmes, 2007). ATP sequences (539 bp; codon start position 1) are positions 8101 to 8639 and cytb (504 bp; codon start position 1) positions 14 445 to 14 948 of the M. lacustris genome. CR sequences were of varied lengths (327 to 409 bp) and roughly correspond to positions 15 677 to 16 021 of M. lacustris. All morphological identifications were double checked by comparing all resulting sequences to the BOLD online database and GenBank (BLASTn search at: blast.ncbi.nlm.nih.gov), as well as to unpublished sequences from the present and other laboratories. All identifications were confirmed, except an Oreochromis specimen (specimen number GU-AW421), originally identified as the Mozambique tilapia Oreochromis mossambicus (Peters), but which may be a hybrid with another species such as the blue tilapia Oreochromis aureus (Steindachner) or the Nile tilapia Oreochromis niloticus (L.) (Mather & Arthington, 1991), and a Geophagus specimen (number GU-AW303) which could not be assigned to any species [perhaps the pearl cichlid Geophagus brasiliensis (Quoy & Gaimard)].

TREE RESULTS

A data set of the protein-coding genes of 72 taxa was assembled in MEGA4 from all specimens with COI, ATP and cytb sequences (40 from this study and 32 from downloaded mitochondrial genomes). Neighbour-joining trees were produced for all three genes together (Fig. 3), all genes separately and all two-gene combinations (see Appendix SI for all trees). Although DNA barcoding sensu stricto is primarily concerned with the identification at the species level (Hubert et al., 2008), mitochondrial sequences clustered using K2P distances can betray a systematic signal at the generic, and even family, levels (Frezal & Leblois, 2008), which can be useful in preliminary identifications, particularly for partial specimens such as fin clips, before assigning to species. The performance of each gene on its own (and all two-gene combinations) to cluster together family units (per Pusey et al., 2004) was compared to the three-gene data set using monophyly and bootstrap support for each family (Table III). It is akin to imagining that any one species per family was an unidentified specimen and then determining where it would end up on a tree. Of the 18 families present in this three-gene data set, nearly all (16) were recovered with strong support, one with moderate support (Chandidae, but strongly supported

[Fig. 3 graphic not reproduced: neighbour-joining tree with bootstrap support values and a 0·02 scale bar; node values cannot be recovered from the extracted text. Taxon labels shown, grouped by family:]

Cyprinidae: Carassius auratus, Carassius cuvieri, Cyprinus carpio, Xenocypris argentea, Xenocypris davidi
Anguillidae: Anguilla reinhardtii, Anguilla anguilla, Anguilla australis
Ceratodontidae: Neoceratodus forsteri
Osteoglossidae: Arapaima gigas, Osteoglossum bicirrhosum, Scleropages formosus, Scleropages leichardti
Megalopidae: Megalops atlanticus, Megalops cyprinoides
Plotosidae: Neosilurus hyrtlii, Tandanus tandanus, Porochilus rendahli
Eleotridae: Mogurnda adspersa, Mogurnda mogurnda, Philypnodon grandiceps, Philypnodon macrostomus, Gobiomorphus australis, Gobiomorphus coxii, Eleotris acanthopoma, Hypseleotris sp. Midgley's, Hypseleotris compressa, Hypseleotris galii
Clupeidae: Dorosoma cepedianum, Dorosoma petenense, Ethmalosa fimbriata, Nematalosa erebi, Nematalosa japonica
Galaxiidae: Galaxias maculatus, Galaxiella nigrostriata
Retropinnidae: Retropinna retropinna, Retropinna semoni SEC1, Retropinna semoni SEQ2, Retropinna semoni SEQ1
Centropomidae: Lates calcarifer
Percichthyidae: Nannoperca australis, Nannoperca oxleyana
Terapontidae: Leiopotherapon unicolor, Rhynchopelates oxyrhynchus
Poeciliidae: Gambusia affinis, Gambusia holbrooki, Xiphophorus hellerii, Xiphophorus maculatus
Chandidae: Ambassis marianus, Ambassis agassizi, Ambassis sp. NW2
Mugilidae: Mugil cephalus, Mugil cephalus Australia
Scorpaenidae: Notesthes robusta
Cichlidae: Geophagus sp., Hypselecara temporalis, Paratilapia polleni, Oreochromis sp TP3, Oreochromis sp KM4
Atherinidae: Craterocephalus stercusmuscarum, Craterocephalus stramineus, Craterocephalus marjorae, Hypoatherina tsurugae
Pseudomugilidae: Pseudomugil mellis, Pseudomugil signifer
Melanotaeniidae: Rhadinocentrus ornatus CEQ5, Rhadinocentrus ornatus SER5, Rhadinocentrus ornatus SEQ5, Melanotaenia lacustris, Melanotaenia splendida, Melanotaenia duboulayi, Melanotaenia fluviatilis

Fig. 3. Neighbour-joining tree produced from combined data set of COI, ATP and cytb genes (see Table I), showing families and bootstrap support values. All drawings are reproduced with permission from Pusey et al. (2004), except Galaxiidae (with permission from McDowall, 1990), Cichlidae, Cyprinidae and Poeciliidae adapted from www.FishBase.org by R. Cada. 1, possible cryptic species, see Hammer et al. (2007); 2, after Allen et al. (2002); 3, informal name. Possible hybrid, see Mather & Arthington (1991); 4, after Mabuchi et al. (2007); 5, possible cryptic species, see Page et al. (2004).


Anguillidae Atherinidae Chandidae Cichlidae Clupeidae Cyprinidae Eleotridae Galaxiidae Megalopidae Melanotaeniidae Mugilidae Osteoglossidae Percichthyidae Plotosidae Poeciliidae Pseudomugilidae Retropinnidae Terapontidae Total families supported S = strong (80–100%) M = moderate (50–79%) W = weak (