How to do BPA, really


Journal of Biogeography, 28, 345–358

Daniel R. Brooks¹, Marco G. P. van Veller² and Deborah A. McLennan¹
¹Department of Zoology, University of Toronto, M5S 3G5 Toronto, Ontario, Canada
²Section Theoretical Biology and Phylogenetics, Institute of Evolutionary and Ecological Sciences, Leiden University, PO Box 9516, 2300 RA Leiden, The Netherlands

Abstract

Aim Recent comparisons of different approaches to historical biogeography have suffered in part because Brooks Parsimony Analysis (BPA) has been characterized as a one-step process following protocols proposed in 1981. Subsequent modifications have resulted in a two-step methodology. This contribution presents the mechanics and applications of those modifications.

Methods The first step, or Primary BPA, which is similar to the original BPA but with modifications proposed by Wiley (1986, 1988a, b), is used to assess whether or not there is support for a single general area cladogram. The second step, Secondary BPA, proposed by Brooks (1990), depicts exceptions to the general area cladogram explicitly by duplicating areas having a reticulate history.

Results The analytical basis of area duplication in secondary BPA is explained more fully than in previous accounts, and the manner in which secondary BPA explicitly depicts falsification of the null hypothesis of simple vicariance is presented for four general cases.

Main conclusions BPA, as fully implemented, is capable of accounting for the complexity of speciation, dispersal and extinction events in a historical biogeographic context without removing or modifying input data from basic phylogenies, so long as at least three clades are analysed simultaneously to provide a distinction between general and special distribution elements.

Keywords Historical biogeography, Hypothetico-deductive approaches, Brooks Parsimony Analysis, A posteriori methods, Parsimony, Logical consistency, Phylogenetic systematics, Null hypothesis.

Correspondence: Daniel R. Brooks, Department of Zoology, University of Toronto, M5S 3G5 Toronto, Ontario, Canada. E-mail: [email protected]
© 2001 Blackwell Science Ltd

INTRODUCTION

Recent discussions of different approaches to studies of historical biogeography (Morrone & Carpenter, 1994; Morrone & Crisci, 1995; Hovenkamp, 1997; Page & Charleston, 1998) have been confused and confusing. One reason for this has been the failure to recognize some approaches as inductive/verificationist and others as hypothetico-deductive/falsificationist (Van Veller & Brooks submitted). Another reason has been under- or mis-representation of the mechanics and therefore the general properties of approaches derived from phylogenetic systematic principles, in particular the approach Wiley (1986, 1988a, b) called Brooks Parsimony Analysis (BPA). Notably, all of the studies purporting to compare BPA with other methods (e.g. Morrone & Carpenter, 1994; Morrone & Crisci, 1995; Page & Charleston, 1998) have used the original formulation of BPA by Brooks (1981), ignoring modifications of BPA implemented by Cressey et al. (1983), O'Grady & Deets (1987), Wiley (1986, 1988a, b) and Brooks (1990). We believe that at least part of the blame for this confusion stems from incomplete discussions of BPA, especially secondary BPA (Brooks, 1990; Brooks & McLennan, 1991). In this paper, we hope to rectify those problems, mostly to encourage more historical biogeographers to use hypothetico-deductive rather than inductive approaches, but also to make certain that future comparisons of BPA with other methods are carried out properly, and to encourage the development of one or more computer algorithms for secondary BPA.

FINDING GENERAL PATTERNS: PRIMARY BPA

Brooks (1981, 1985) presented BPA as a direct application of phylogenetic systematic methodology. Consider the case of the clade of five species depicted in Fig. 1, each of which lives in a particular area (Table 1). The phylogenetic relationships of the five species (Fig. 1) can be treated as if they were a completely polarized multistate transformation series, coded in a manner that indicates both the identity and common ancestry of each species (Fig. 2). These codes can be represented in an additive binary matrix (Table 2). The phylogenetic relationships of the study clade are now represented by the binary codes. Now we replace the species names in Table 2 with their geographical distributions (Table 3). Next, we construct a taxon area cladogram (Enghoff, 1996; Van Veller et al., 1999) based on the phylogenetic relationships of the species (Fig. 3). This produces a picture of the historical involvement of areas in the evolution of the species. Van Veller & Brooks (in press) referred to BPA performed under the assumption that each area has a single history as Primary BPA. Even such a simple pattern of occurrence could be the result of a variety of evolutionary processes. If our goal is a

Figure 2 Phylogenetic tree for species 1–5, with internal branches numbered for matrix representation.

Table 2 Additive binary codes for species 1–5 and their phylogenetic relationships

Species   Binary code
1         100001001
2         010001001
3         001000011
4         000100111
5         000010111

Table 3 Areas in which species 1–5 occur coded for primary BPA

Areas     Binary code
A         100001001
B         010001001
C         001000011
D         000100111
E         000010111
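The additive binary coding behind Tables 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the child-to-parent map `PARENT` is inferred from the codes in Table 2 (internal branches 6 to 9, with 9 as the root), and the function name `additive_binary_code` is hypothetical rather than part of any published BPA software.

```python
# Sketch of additive binary coding (assumed tree, inferred from Table 2).
# Terminals are species 1-5; internal branches are 6-9, with 9 as the root.
PARENT = {1: 6, 2: 6, 4: 7, 5: 7, 3: 8, 7: 8, 6: 9, 8: 9}
N_COLUMNS = 9  # one column per species or internal branch


def additive_binary_code(species, parent=PARENT, n_columns=N_COLUMNS):
    """Return '1' for the species and each of its ancestors, '0' elsewhere."""
    present = set()
    node = species
    while node is not None:
        present.add(node)
        node = parent.get(node)  # becomes None once the root has been added
    return "".join("1" if col in present else "0" for col in range(1, n_columns + 1))


# Table 1: each area holds exactly one species, so the area inherits its code.
AREA_OF = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}
area_matrix = {AREA_OF[s]: additive_binary_code(s) for s in sorted(AREA_OF)}

for area, code in area_matrix.items():
    print(area, code)
```

Because every area here holds a single species, replacing species names by area names is a simple relabelling; areas holding several species need the combination step discussed later in the paper.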

Figure 1 Phylogenetic tree for species 1–5.

Table 1 Occurrences of species 1–5 in areas A–E

Areas     Species
A         1
B         2
C         3
D         4
E         5

Figure 3 Area cladogram for areas A–E based on phylogenetic relationships of species 1–5.

scientific explanation then, once again following phylogenetic systematic principles, we must construct general hypotheses and subject them to empirical tests (the Principle of Falsification: Popper, 1960, 1968a, b, 1972, 1976, 1992; Wiley, 1975; Platnick & Gaffney, 1977, 1978a, b; Gaffney, 1979; Eldredge & Cracraft, 1980; Wiley, 1981; Kluge, 1997, 1998a). The minimum number of apomorphies needed to resolve a three-taxon statement is two: a first apomorphy to support the monophyly of the three-taxon group and a second to indicate sister group relationships between two of the three members of the clade. Resolving and corroborating that three-taxon statement requires at least a third apomorphy supporting the second. Likewise, the minimum number of clades needed to establish a general pattern of historical biogeographic distributions, which can then be subjected to empirical test and possible falsification, is three (Brooks & McLennan, 1991). General patterns constitute instances in which members of different clades have a common history of speciation with respect to the areas in which they occur. The null hypothesis, therefore, is that each area has a single history with respect to all the species that inhabit it. For BPA, the general pattern functions as the null hypothesis in analogy with Hennig's Auxiliary Principle (Hennig, 1966), which states that in the absence of other evidence, similarity is homologous similarity, or common patterns indicate common history of association. Such general patterns can be produced by several modes of speciation, but only vicariant speciation is always expected to produce general patterns (Wiley, 1981; Wiley & Mayden, 1985; Brooks & McLennan, 1991), so in practice common patterns are presumed to be the result of vicariant speciation unless special circumstances are specified.

Table 4 lists the occurrences of species representing two hypothetical clades in five areas. Figure 4 depicts the phylogenetic trees for the clades containing species 1–5 and species 10–14. Table 5 lists the binary codes for members of each clade for each area. Figure 5 portrays the area cladogram that results from primary BPA of this data matrix. This represents an example of two co-occurring clades whose members have speciated in the same manner with respect to the geographical distributions of sister species.

Table 4 Occurrences of species representing two hypothetical clades (1–5 and 10–14) in areas A–E

Areas     Clade 1   Clade 2
A         1         10
B         2         11
C         3         12
D         4         13
E         5         14

Figure 4 Phylogenetic trees for species 1–5 and 10–14, with internal branches numbered for matrix representation.

Table 5 Matrix listing binary codes for members of clades 1–5 and 10–14 inhabiting areas A–E

Areas     Binary code
A         100001001100001001
B         010001001010001001
C         001000011001000011
D         000100111000100111
E         000010111000010111

Figure 5 Area cladogram for areas A–E based on phylogenetic relationships of species 1–5 and 10–14.

In the next example, two clades of species are not equally represented throughout the areas under investigation. Specifically, the members of one clade occur in five areas (A–E) and the members of the other clade occur in only four areas (A–D). Figure 6 depicts the phylogenetic trees for the clades containing species 1–5 and species 10–13. Table 6 lists the binary codes for the members of clades 1–5 and 10–13 in each area. Areas lacking members of one clade are coded as missing data ('?') as suggested by Wiley (1988a, b), so as not to bias the analysis with a priori assertions of either extinction or dispersal. The area cladogram produced by BPA of this data matrix is shown in Fig. 7. Note that there is no hypothesis concerning either the presence or absence of species 10–13 in area E. Unless we are simply going to ignore data, some explanation must be offered a posteriori. Here, we might propose either that no member of clade 10–13 ever inhabited area E or that some member of the clade reached it and then became extinct. The first proposal implies that area E is not part of the general pattern, whereas the second proposal suggests that area E is part of the general pattern.

Figure 6 Phylogenetic trees for species 1–5 and 10–13, with internal branches numbered for matrix representation.

Table 6 Matrix listing areas A–E, the species that inhabit them (1–5 and 10–13), and the binary codes representing those species and their phylogenetic relationships

Areas     Taxa      Binary code*
A         1, 10     1000000011000001
B         2, 11     0100000110100011
C         3, 12     0010001110010111
D         4, 13     0001011110001111
E         5         000011111???????

*? = taxa missing from an area.

Figure 7 Area cladogram for areas A–E based on phylogenetic relationships of species 1–5 and 10–13.

We now add a third clade (species 17–21, Fig. 8). In this case, there is a member of the clade in all five areas. A new matrix incorporating information from all three clades is shown in Table 7. The new area cladogram (Fig. 9) now indicates that the absence of a member of clade 10–13 in area E is the unusual observation needing special explanation. In other words, the extinction of a member of clade 10–13 is a more parsimonious explanation than parallel episodes of speciation by dispersal producing species 5 and species 21.

Figure 8 Phylogenetic tree for species 17–21, with internal branches numbered for matrix representation.

FINDING AND REPRESENTING FALSIFICATIONS: SECONDARY BPA

BPA is based on the assumption that species are ontological individuals (Ghiselin, 1974; Hull, 1976, 1978, 1980; Wiley, 1978, 1980a, b, 1989; Mishler & Donoghue, 1982; Cracraft, 1983a, 1987; Donoghue, 1985; McKitrick & Zink, 1988; 1989a, b; Löther, 1990; Frost & Kluge, 1994; O'Hara, 1994; Mayden & Wood, 1995; Mayden, 1997, 1999; Brooks & McLennan, 1999); therefore they and the

Table 7 Matrix listing areas A–E, the species that inhabit them (1–5 and 10–13), and the binary codes representing those species and their phylogenetic relationships

Areas     Taxa      Binary code*
A         1, 10     1000000011000001100000001
B         2, 11     0100000110100011010000011
C         3, 12     0010001110010111001000111
D         4, 13     0001011110001111000101111
E         5         000011111???????000011111

*? = taxa missing from an area.
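As a rough illustration of how a multi-clade matrix with missing data such as Table 7 might be assembled, the Python sketch below concatenates one block of characters per clade and writes '?' for any clade absent from an area. Several details are assumptions rather than statements from the paper: the child-to-parent maps are inferred from the binary codes, the internal branch labels are hypothetical, and the placement of species 17–21 in areas A–E is deduced from the third block of codes because the Taxa column does not list them.

```python
# Sketch: build an area-by-character matrix from several clades, coding a
# clade that is absent from an area as missing data ('?').
# Tree shapes and internal labels are assumptions inferred from the codes.

# Each clade: (column labels in matrix order, child -> parent map).
CLADES = [
    (range(1, 10), {5: 6, 4: 6, 6: 7, 3: 7, 7: 8, 2: 8, 8: 9, 1: 9}),
    (range(10, 17), {13: 14, 12: 14, 14: 15, 11: 15, 15: 16, 10: 16}),
    (range(17, 26), {21: 22, 20: 22, 22: 23, 19: 23, 23: 24, 18: 24, 24: 25, 17: 25}),
]

# Assumed occurrences; species 17-21 are placed by inference from the codes.
OCCURRENCES = {
    "A": {1, 10, 17},
    "B": {2, 11, 18},
    "C": {3, 12, 19},
    "D": {4, 13, 20},
    "E": {5, 21},
}


def lineage(species, parent):
    """Return the species together with all of its ancestors."""
    nodes = set()
    node = species
    while node is not None:
        nodes.add(node)
        node = parent.get(node)
    return nodes


def area_code(species_in_area):
    """Concatenate one block of characters per clade for a single area."""
    blocks = []
    for columns, parent in CLADES:
        members = [s for s in species_in_area if s in columns]
        if not members:
            # No member of this clade in the area: missing data, not absence.
            blocks.append("?" * len(columns))
            continue
        present = set().union(*(lineage(s, parent) for s in members))
        blocks.append("".join("1" if c in present else "0" for c in columns))
    return "".join(blocks)


matrix = {area: area_code(spp) for area, spp in OCCURRENCES.items()}
for area, code in matrix.items():
    print(area, code)
```

Coding the empty block as '?' rather than as a row of zeros is the design choice that matters here: zeros would assert absence a priori, whereas '?' leaves extinction versus non-occurrence to be decided a posteriori.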

Figure 9 General area cladogram for areas A–E based on phylogenetic relationships of species 1–5, 10–13 and 17–21.

speciation events that form them are evolutionarily independent variables. This analogue of Kluge's Auxiliary Principle (always assume independence of data a priori: Wiley et al., in press) permits individual species or clades to falsify the null hypothesis. The distribution of any particular species in any particular clade therefore is a potential falsifier of the null hypothesis. No matter how many species in how many clades demonstrate a general pattern, any single species in any clade can falsify the null hypothesis. For this reason, we must use all available data (the Principle of Total Evidence: Kluge, 1989b), and cannot remove some information a priori as suggested by, e.g. Kluge (1988), Nelson & Platnick (1981), Nelson & Ladiges (1991a, b, 1996), and Hovenkamp (1997).

The three-clade example above demonstrates an additional important point. Episodes of extinction do not falsify the null hypothesis of simple vicariance, as the absence of evidence can neither corroborate nor falsify a hypothesis. Component Analysis takes advantage of this by postulating arbitrary numbers of lineage duplications and extinctions to protect the a priori hypothesis from falsification. True falsification of the null hypothesis of simple vicariance comes in the form of episodes of pre- and post-speciation dispersal (range expansion following speciation, speciation by dispersal, or non-response to a vicariant event). Examples appear in biogeographic analyses as 'redundant species', in which more than one member of a given clade occurs in the same area, or members of different clades indicate different area relationships for the same area, and 'widespread species', in which the same species occurs in more than one area. Such falsifiers appear as homoplasy in primary BPA, produced as a result of combining codes for each species occurring in each area, or inclusive ORing (Cressey et al., 1983; O'Grady & Deets, 1987).

Primary BPA can indicate whether or not the data support a general pattern of area relationships, and whether or not there are exceptions (falsifiers) to that pattern. But it does not permit us to represent those falsifications in an effective manner. For that, we need to go beyond the constraints of inclusive ORing. Problems caused by inclusive ORing arise because primary BPA constrains explanations to an a priori assumption that areas cannot have reticulated histories. And yet, the hypothesis that we wish to test and potentially falsify is that each area has a single history with respect to all species being analysed. Brooks (1990; see also Brooks & McLennan, 1991) resolved this methodological and conceptual problem by proposing that we represent each instance of falsification with an extra representation of the area having the reticulate history. Van Veller & Brooks (in press) refer to this as secondary BPA. The function of primary BPA is thus to find whether there is a general pattern; the function of secondary BPA is to represent clearly all exceptions to that general pattern. These exceptions fall into four general categories, and we next provide an exemplar of each.

'REDUNDANT TAXA' I: SINGLE CLADES WITH SPECIES INDICATING DIFFERENT AREA RELATIONSHIPS

Figure 10 depicts the phylogenetic tree for hypothetical species 1–6. The matrix depicting the geographical distributions of species 1–6 among areas A–D is shown in Table 8. The area cladogram resulting from primary BPA (Fig. 11) has a consistency index of 100%. Area B, however, contains three members of the clade which are not sister species, so the consistency index does not actually indicate complete support for a simple vicariance hypothesis. Treating area B as three areas, B1 for species 2, B2 for species 5 and B3 for species 6, produces a new data matrix (Table 9). Secondary BPA produces the area cladogram depicted in Fig. 12, in which area B1 occurs in a relatively basal position, while areas B2 + B3 are sister areas in a relatively derived position. The relatively derived species 5 and 6 may occur in area B because their ancestor (species 7), which links areas B2 + B3, dispersed from area D into area B, and subsequently speciated, producing species 5 and 6. We can thus consider areas B2 + B3 as a single area, simplifying the analysis. In this manner secondary BPA corrects for too many assumed reticulated histories (see also below).

Figure 10 Phylogenetic tree for species 1–6, with internal branches numbered for matrix representation.

Table 8 Matrix listing the geographical distribution of species 1–6 among areas A–D, along with the binary codes representing the phylogenetic relationships among species 1–6

Areas     Taxa      Binary code
A         1         10000000001
B         2, 5, 6   01001111111
C         3         00100000111
D         4         00010001111

Figure 11 Area cladogram for areas A–D, based on phylogenetic relationships of species 1–6.

Table 9 Matrix listing the geographical distribution of species 1–6 among areas A–D, with area B treated as three separate areas, along with the binary codes representing the phylogenetic relationships among species 1–6

Areas     Taxa      Binary code
A         1         10000000001
B1        2         01000000011
C         3         00100000111
D         4         00010001111
B2        5         00001011111
B3        6         00000111111

Figure 12 Area cladogram for areas A–D, based on phylogenetic relationships of species 1–6, and treating area B as three separate areas.

'REDUNDANT TAXA' II: MULTIPLE CLADES INDICATING DIFFERENT RELATIONSHIPS AMONG THE SAME AREAS

Because the above are analyses of single clades, we do not know which of the duplicated areas are part of the general pattern and which are the special cases. In the next example, we consider four clades each comprising four terminal species, each species living in a single area. Figure 13 depicts the phylogenetic trees for hypothetical species 1–4, 8–11, 15–18 and 22–25. The matrix depicting the geographical distributions of species 1–4, 8–11, 15–18 and 22–25 among areas A–D is shown in Table 10. The general area cladogram constructed by primary BPA (Fig. 14) is congruent with the area relationships supported by species 1–4, 8–11 and 15–18, but differs from that supported by species 22–25, denoted by homoplasious occurrence of ancestral species 26 and 27. This causes us to reject the null hypothesis that all observed distributions are the result of a single history of vicariance for each area.

This example is simple enough that most readers will see that it is area A that has a reticulate history ('A' for species 1–4, 8–11 and 15–18 is not the same as 'A' for species 22–25). How do we arrive at that conclusion analytically? There are 15 possible combinations of area duplications (ABCD, ABC, ABD, ACD, BCD, AB, AC, AD, BC, BD, CD, A, B, C and D) that can be applied to the general pattern from the primary BPA, onto which we can then optimize the data from the phylogenies (Table 10). Whenever the duplicates of the same area are connected to the same node, we combine the areas. In this case, the 15 combinations produce four different area cladograms (Figs 15–18). The first is the same area cladogram as the primary BPA, which we reject because it contains homoplasy indicating reticulate histories for at least one area but does not depict any reticulations (Fig. 15, for duplication combinations BCD, BC, BD, CD, B, C and D). The remaining three area cladograms eliminate the dispersal-based homoplasy. These include (1) an area cladogram in which the biogeography of species 22–25 is viewed as completely independent of the other clades (Fig. 16, for duplication combination ABCD, which is also the solution favoured by Component 1.5; Page, 1990); (2) an area cladogram in which areas D and A are duplicated (Fig. 17, for duplication combinations ACD, ABD and AD); and (3) an area cladogram in which area A is duplicated (Fig. 18, for duplication combinations ABC, AB, AC and A). Figure 16 requires four area duplications, Fig. 17 requires two area duplications and Fig. 18 requires only a single area duplication.

Figure 13 Area cladograms for areas A–D based on phylogenetic relationships of species 1–4, 8–11, 15–18 and 22–25.

Table 10 Matrix listing the geographical distribution of species 1–6, 8–11, and 22–25 among areas A–D, along with the binary codes representing the phylogenetic relationships among species 1–6, 8–11, and 22–25

Areas     Taxa            Binary code
A         1, 8, 15, 25    1000001100000110000010001111
B         2, 9, 16, 22    0100011010001101000111000001
C         3, 10, 17, 23   0010111001011100101110100011
D         4, 11, 18, 24   0001111000111100011110010111

Figure 14 General area cladogram for areas A–D derived from primary BPA of species 1–4, 8–11, 15–18 and 22–25, and their phylogenetic relationships. Bold-faced numbers with asterisks represent phylogenetic relationships incongruent with the general area cladogram.

Figure 15 General area cladogram for areas A–D derived from secondary BPA of species 1–4, 8–11, 15–18, and 22–25, and their phylogenetic relationships, based on assuming that areas B, C and D (BCD at base of area cladogram), areas B and C (BC at base of area cladogram), areas B and D (BD at base of area cladogram), areas C and D (CD at base of area cladogram), area B (B at base of area cladogram), C (C at base of area cladogram), or D (D at base of area cladogram) have reticulate histories. The single area cladogram results from combining duplicated areas that appear as sister-areas or in an unresolved polytomy. Note that this area cladogram is the same as generated by primary BPA (Fig. 14).

Figure 16 General area cladogram for areas A–D derived from secondary BPA of species 1–4, 8–11, 15–18 and 22–25, and their phylogenetic relationships, in which it is assumed that areas A, B, C and D have a reticulate history (denoted by ABCD at the base of area cladogram).

Figure 17 General area cladogram for areas A–D derived from secondary BPA of species 1–4, 8–11, 15–18 and 22–25, and their phylogenetic relationships, based on assuming that areas A, C and D (ACD at base of area cladogram) have reticulate histories. The single area cladogram results from combining duplicated areas that appear as sister-areas or in an unresolved polytomy.

Figure 18 General area cladogram for areas A–D derived from secondary BPA of species 1–4, 8–11, 15–18 and 22–25, and their phylogenetic relationships, based on assuming that areas A, B and C (ABC at base of area cladogram), A, B, and D (ABD at base of area cladogram), A and B (AB at base of area cladogram), A and C (AC at base of area cladogram), A and D (AD at base of area cladogram), or A (A at base of area cladogram) have reticulate histories. The single area cladogram results from combining duplicated areas that appear as sister-areas or in an unresolved polytomy.

Table 11 Matrix listing the geographical distribution of species 1–6, 8–11, and 22–25 among areas A–D, with area A treated as two separate areas, along with the binary codes representing the phylogenetic relationships among species 1–6, 8–11, and 22–25

Areas     Taxa            Binary code
A         1, 8, 15        100000110000011000001???????
B         2, 9, 16, 22    0100011010001101000111000001
C         3, 10, 17, 23   0010111001011100101110100011
D         4, 11, 18, 24   0001111000111100011110010111
A1        25              ?????????????????????0001111

? = taxa missing from an area.
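The bookkeeping for the four-clade example (15 candidate duplication sets, four resulting cladograms, one preferred solution) can be sketched as follows. This is a hedged illustration, not an implementation of secondary BPA: the mapping from duplication sets to Figs 15–18 and the duplication counts are copied from the text rather than computed, and the names `duplication_sets`, `OUTCOMES` and `figure_for` are hypothetical.

```python
from itertools import combinations

AREAS = "ABCD"


def duplication_sets(areas):
    """All non-empty subsets of the areas, largest first (ABCD, ABC, ..., D)."""
    return [
        "".join(combo)
        for size in range(len(areas), 0, -1)
        for combo in combinations(areas, size)
    ]


candidates = duplication_sets(AREAS)

# Outcomes transcribed from the text, NOT derived by parsimony analysis here.
# 'duplications' is the number of area duplications left after duplicates
# joined to the same node have been recombined.
OUTCOMES = {
    "Fig. 15": {"sets": {"BCD", "BC", "BD", "CD", "B", "C", "D"},
                "duplications": 0, "homoplasy": True},
    "Fig. 16": {"sets": {"ABCD"}, "duplications": 4, "homoplasy": False},
    "Fig. 17": {"sets": {"ACD", "ABD", "AD"}, "duplications": 2, "homoplasy": False},
    "Fig. 18": {"sets": {"ABC", "AB", "AC", "A"}, "duplications": 1, "homoplasy": False},
}


def figure_for(candidate):
    """Look up which reported cladogram a duplication set collapses to."""
    for figure, outcome in OUTCOMES.items():
        if candidate in outcome["sets"]:
            return figure
    raise KeyError(candidate)


# Reject cladograms that retain homoplasy, then minimize area duplications.
homoplasy_free = {f: o for f, o in OUTCOMES.items() if not o["homoplasy"]}
preferred = min(homoplasy_free, key=lambda f: homoplasy_free[f]["duplications"])

print(len(candidates), "candidate duplication sets")
print("preferred:", preferred)
```

The selection rule in the last step mirrors the argument of the text: discard any result that still contains unexplained homoplasy, then keep the result with the fewest area duplications.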

The taxon duplication convention may seem arbitrary, but it is not. Rather, it is an extension of the principle of logical parsimony. Our null hypothesis is that all areas have a single history with respect to the species that inhabit them. When the data do not support that null hypothesis, we duplicate areas to indicate the exact departures from (falsifications of) the null hypothesis, but we do not multiply areas beyond necessity. Figure 18 is thus the preferred area cladogram for this example (Table 11 is the secondary BPA matrix).

'WIDESPREAD TAXA' I: SINGLE CLADES WITH WIDESPREAD SPECIES

Figure 19 depicts a phylogenetic tree for hypothetical species 1–4. Table 12 lists the data matrix for the areas and the species (plus the codes for their phylogenetic relationships) that inhabit them. Phylogenetic analysis of this data matrix produces the area cladogram depicted in Fig. 20. In this instance, the area cladogram has a consistency index of 100%; nevertheless, something is amiss, because interpreting the absence of species 1 in area D as a reversal, or extinction event, in that area requires placing species 1 in a position ancestral to species 2, 3 and 4. This conflicts with the phylogenetic tree for the clade, which places species 1 as the sister-clade to the remaining species (Fig. 19). Hence, while the most parsimonious interpretation based simply on the occurrence of species in particular areas supports the postulate of extinction in area D, phylogenetic evidence rules against it. This is the essence of what Wiley (1986, 1988a, b) called Assumption 0, in contrast to Assumptions 1 and 2 of Nelson & Platnick (1981) and subsequent a priori approaches (see van Veller & Brooks in press): it is not permissible to change postulated phylogenetic relationships simply because they falsify an a priori expectation of simple vicariance.

The taxon duplication convention permits us to avoid postulating unsupported phylogenetic relationships within the context of the historical biogeographical analysis. Figure 21 depicts the area cladogram for areas A–D, with numbers superimposed indicating the species in clade 1–4 that occur in each area. Table 13 shows areas B and C duplicated, once for each species occurring in the area. The new area cladogram (Fig. 22) shows areas A, B2 and C2 as sister areas, supporting an explanation that species 1 dispersed, post-speciation, into areas B and C.

Figure 19 Cladogram for areas A–D, with distributions of species 1–4 among those areas superimposed.

Table 12 Matrix listing the geographical distribution of species 1–4 among areas A–D, along with the binary codes representing the phylogenetic relationships among species 1–4

Areas     Taxa      Binary code
A         1         1000001
B         1, 2      1100011
C         1, 3      1010111
D         4         0001111

Figure 20 Phylogenetic tree for species 1–4 with internal branches numbered for matrix representation.

Figure 21 Area cladogram for areas A–D produced by primary BPA based on phylogenetic relationships of species 1–4. Note that species 1 appears as the ancestor of species 6 rather than as its sister species, as depicted in Fig. 20.

Table 13 Matrix listing the geographical distribution of species 1–4 among areas A–D, with areas B and C each treated as two separate areas, along with the binary codes representing the phylogenetic relationships among species 1–4

Areas     Taxa      Binary code
A         1         1000001
B1        2         0100011
B2        3         1000001
C1        4         0010111
C2        5         1000001
D         6         0001111
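The step from the primary matrix (Table 12) to the secondary matrix (Table 13) can be illustrated with a short Python sketch. It rests on assumptions: the tree (1,(2,(3,4)5)6)7 is inferred from the binary codes in Table 12, the function names are hypothetical, and the order of species within each area is chosen only so that the numbered copies (B1, B2, C1, C2) carry the same codes as the rows of Table 13.

```python
# Sketch: inclusive ORing for primary BPA and area duplication for
# secondary BPA. Tree inferred from the codes in Table 12 (an assumption).
PARENT = {3: 5, 4: 5, 5: 6, 2: 6, 6: 7, 1: 7}
N_COLUMNS = 7

# Species per area; the order within an area fixes the numbering of copies.
OCCURRENCES = {"A": [1], "B": [2, 1], "C": [3, 1], "D": [4]}


def species_code(species):
    """Additive binary code: the species plus all of its ancestors."""
    present = set()
    node = species
    while node is not None:
        present.add(node)
        node = PARENT.get(node)
    return "".join("1" if c in present else "0" for c in range(1, N_COLUMNS + 1))


def inclusive_or(codes):
    """Combine codes column by column: '1' if any code has '1' there."""
    return "".join("1" if "1" in column else "0" for column in zip(*codes))


def primary_matrix(occurrences):
    """One row per area, ORing the codes of every species found there."""
    return {a: inclusive_or([species_code(s) for s in spp])
            for a, spp in occurrences.items()}


def duplicate_areas(occurrences, areas_to_split):
    """Give each species in a split area its own numbered copy of the area."""
    split = {}
    for area, spp in occurrences.items():
        if area in areas_to_split and len(spp) > 1:
            for i, s in enumerate(spp, start=1):
                split[f"{area}{i}"] = [s]
        else:
            split[area] = list(spp)
    return split


primary = primary_matrix(OCCURRENCES)
secondary = primary_matrix(duplicate_areas(OCCURRENCES, {"B", "C"}))
print(primary)
print(secondary)
```

After duplication every row holds a single species, so the ORing step no longer merges the lineage of species 1 with those of species 2 and 3, which is what removes the misleading placement of species 1 as an apparent ancestor.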

Table 14 Matrix listing the geographical distribution of species 1–4, 8–11 and 15–18 among areas A–D, along with the binary codes representing the phylogenetic relationships among species 1–4, 8–11 and 15–18

Areas     Taxa              Binary code
A         1, 8, 11, 15      100000110011111000001
B         3, 9, 10, 17      001011101101110010111
C         2, 9, 16          010001101010010100011
D         1, 4, 11, 17, 18  100111100011110011111

Figure 22 Area cladogram for areas A–D produced by secondary BPA based on phylogenetic relationships of species 1–4. Note that duplicating areas B and C (i.e. assuming that species 1 dispersed, post-speciation, into areas B and C) produces an accurate depiction of the phylogenetic relationships for species 1–4 as depicted in Fig. 20.

'WIDESPREAD TAXA' II: MULTIPLE CLADES WITH WIDESPREAD SPECIES

Although many clades can be affected in a similar manner by the same large-scale vicariance events, each species within each clade is an independent evolutionary system (Kluge's Auxiliary Principle). It is, therefore, not unexpected that BPA of multiple clades will be faced with a mosaic of widespread species. Figure 23 comprises three clades of four species each occurring over four different areas. At least one species in each clade occurs in more than one area (Table 14 is the primary BPA matrix). A single area cladogram results from primary BPA of this matrix (Fig. 24), but the widespread taxa produce substantial homoplasy. Performing the same iterative process as outlined above for determining the minimum number of area duplications needed to account for the homoplasy produces the area cladogram shown in Fig. 25, in which areas A and B are duplicated once and area D is duplicated twice (Table 15 is the secondary BPA matrix). We explain these distributions by invoking four cases of post-speciation dispersal, by species 1 from area A to area D, species 9 from area C to area B, species 11 from area D to area A, and species 17 from area B to area D.
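Before any duplication is attempted, the widespread species in a distribution table can be flagged mechanically. The sketch below does so for the taxa listed in Table 14; the function name `widespread_species` is hypothetical, and the final count of extra area copies is only a tally for this example, not a substitute for the iterative search described above.

```python
from collections import defaultdict

# Taxa per area, transcribed from Table 14.
OCCURRENCES = {
    "A": [1, 8, 11, 15],
    "B": [3, 9, 10, 17],
    "C": [2, 9, 16],
    "D": [1, 4, 11, 17, 18],
}


def widespread_species(occurrences):
    """Map each species found in more than one area to the areas it occupies."""
    areas_of = defaultdict(list)
    for area, species_list in occurrences.items():
        for species in species_list:
            areas_of[species].append(area)
    return {s: areas for s, areas in sorted(areas_of.items()) if len(areas) > 1}


widespread = widespread_species(OCCURRENCES)

# In this example each widespread species occupies exactly one extra area,
# so the tally equals the number of extra area copies reported in the text.
extra_copies = sum(len(areas) - 1 for areas in widespread.values())

for species, areas in widespread.items():
    print(f"species {species}: areas {', '.join(areas)}")
print("extra area copies in this example:", extra_copies)
```

The four species flagged here (1, 9, 11 and 17) are the same four for which the text invokes post-speciation dispersal; deciding the direction of each dispersal still requires the area cladogram.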

Figure 24 General area cladogram for areas A–D produced by primary BPA based on phylogenetic relationships of species 1–4, 8–11 and 15–18. Bold-faced numbers with asterisks represent phylogenetic relationships incongruent with the general area cladogram.

PUTTING IT ALL TOGETHER

Each exception to simple vicariance corresponds to one of the four exemplars above. Arbitrary amounts of historical complexity can be analysed and explained most parsimoniously using primary and secondary BPA in the manner described herein. Consider five clades comprising missing, widespread, and redundant species (Fig. 26; Table 16 is the primary BPA matrix). Primary BPA produces a single area cladogram (Fig. 27) but, as in the previous case, there is considerable homoplasy. Secondary BPA resolves that ambiguity by duplicating area A once and areas B and D three times (Fig. 28; Table 17 is the secondary BPA matrix). The biogeographic history of these five clades includes one

Figure 23 Phylogenetic trees for species 1–4, 8–11 and 15–18 with internal branches numbered for matrix representation and geographical distributions among areas A–D superimposed.

Figure 25 Area cladogram for areas A–D produced by secondary BPA based on phylogenetic relationships of species 1–4, 8–11 and 15–18. Note that representing areas A and B twice and area D three times produces an accurate depiction of the phylogenetic relationships for species 1–4, 8–11 and 15–18 as depicted in Fig. 23.

Table 15 Matrix listing the geographical distribution of species 1–4, 8–11 and 15–18 among areas A–D, with areas A and B each treated as two separate areas and area D treated as three separate areas, along with the binary codes representing the phylogenetic relationships among species 1–4, 8–11 and 15–18

Area      Taxa        Binary code
A1        1, 8, 15    100000110000011000001
A2        11          ???????0001111???????
B1        3, 10, 17   001011100101110010111
B2        9           ???????0100011???????
C         2, 9, 16    010001101000110100011
D1        4, 11, 18   000111100011110001111
D2        1           1000001??????????????
D3        17          000000000000000010111

? = taxa missing from an area.

Figure 26 Phylogenetic trees for species 1–4, 8–11, 15–18, 22–26 and 31–33 with internal branches numbered for matrix representation and geographical distributions among areas A–D superimposed.

extinction (no sister species of species 33 in area D), one instance of speciation by dispersal (species 23 in area B), and four cases of post-speciation dispersal (species 1 in area D, species 9 in area B, species 11 in area A and species 17 in area D).

SUMMARY

Primary BPA finds the most parsimonious general area cladogram or cladograms possible given any set of phylogenetic relationships and geographical distributions for three or more clades (see Van Veller & Brooks in press). Primary BPA also determines if there is any falsification of the null hypothesis of simple vicariance, in the form of homoplasy that cannot be explained as secondary extinction. Secondary BPA integrates the incongruent elements with the general

Table 16 Matrix listing the geographical distribution of species 1–4, 8–11, 15–18 and 22–25 among areas A–D, along with the binary codes representing the phylogenetic relationships among species 1–4, 8–11, 15–18 and 22–25

Area  Taxa                   Binary code
A     1, 8, 11, 15, 22, 29   100000110011111000001100000110001
B     3, 9, 10, 17, 23, 30   001011101101110010111010001101011
C     2, 9, 16, 24, 31       010001101000110100011001011100111
D     1, 4, 11, 17, 18, 25   1001111000111100111110001111?????

? = taxa missing from an area.

Table 17 Matrix listing the geographical distribution of species 1–4, 8–11, 15–18 and 22–25 among areas A–D, with area A treated as two separate areas and areas B and D each treated as three separate areas, along with the binary codes representing the phylogenetic relationships among species 1–4, 8–11, 15–18 and 22–25

Area  Taxa               Binary code
A1    1, 8, 15, 22, 29   100000110000011000001100000110001
A2    11                 ???????0001111???????????????????
B1    3, 10, 17, 31      00101110010111001011???????01011
B2    9                  ???????0100011???????????????????
B3    23                 ?????????????????????0100011?????
C     2, 9, 16, 24, 30   010001101000110100011001011101011
D1    4, 11, 18, 25      0001111000111100011110001111?????
D2    1                  1000001??????????????????????????
D3    17                 ??????????????0010111????????????

? = taxa missing from an area.
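Rows for duplicated areas that hold a single species, such as A2, B3 and D3 in Table 17, can be assembled by concatenating one block per clade and filling the blocks of clades with no member in that area with "?". The sketch below assumes block widths of 7, 7, 7, 7 and 5 columns, inferred from the 33-column codes in the table; the names `CLADE_WIDTHS` and `duplicated_area_row` are hypothetical.

```python
# Sketch only: block widths are inferred from the 33-column codes in Table 17,
# and all names are hypothetical.

# Number of matrix columns contributed by each of the five clades.
CLADE_WIDTHS = [7, 7, 7, 7, 5]


def duplicated_area_row(codes_by_clade, widths=CLADE_WIDTHS):
    """Build one matrix row; clades absent from the area are coded '?'."""
    parts = []
    for index, width in enumerate(widths):
        code = codes_by_clade.get(index)
        if code is None:
            # No member of this clade occurs in the area: missing data.
            parts.append("?" * width)
        elif len(code) != width:
            raise ValueError(f"clade {index} needs {width} columns, got {len(code)}")
        else:
            parts.append(code)
    return "".join(parts)


# Clade indices are zero-based: 0 = species 1-4, 1 = species 8-11, and so on.
row_a2 = duplicated_area_row({1: "0001111"})  # species 11 only
row_b3 = duplicated_area_row({3: "0100011"})  # species 23 only
row_d3 = duplicated_area_row({2: "0010111"})  # species 17 only
```

Keeping the absent clades as "?" rather than "0" means the duplicated area is placed on the area cladogram by the one clade it contains, without the other clades contributing evidence against that placement.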

Figure 27 General area cladogram for areas A–D produced by primary BPA based on phylogenetic relationships of species 1–4, 8–11, 15–18, 22–26 and 31–33. Bold-faced numbers with asterisks represent phylogenetic relationships incongruent with the general area cladogram.

pattern by choosing the result that postulates the fewest area duplications, each of which depicts a falsification of the null hypothesis of simple vicariance. Secondary BPA thus also conforms to the dictum of selecting the working hypothesis that best fits all the data (the Principle of Logical Parsimony: Wiley, 1975, 1981; Eldredge & Cracraft, 1980; Brooks & McLennan, 1991; Wiley et al., 1991; Kluge, 1997, 1998a). The result is simultaneously a hypothesis of the history of the areas inhabited and of the speciation events for all members of all clades being analysed.

This does not mean that secondary BPA will always yield a single most parsimonious answer. Consider a slight modification of the previous example, in which species 25 in area B (Fig. 26) is not known (or does not exist). Primary BPA gives the same result as Fig. 27, but secondary BPA

Figure 28 General area cladogram for areas A–D produced by secondary BPA based on phylogenetic relationships of species 1–4, 8–11, 15–18, 22–26 and 31–33. Note that representing area A twice and areas B and D three times produces an accurate depiction of the phylogenetic relationships for species 1–4, 8–11, 15–18, 22–26 and 31–33 as depicted in Fig. 26.


produces two equally parsimonious area cladograms, one in which area B is duplicated, suggesting that species 23 is the result of speciation by dispersal (as shown in Fig. 28), and the other in which area C is duplicated, suggesting that species 24 is the result of speciation by dispersal.

ACKNOWLEDGMENTS

This work was supported by operating grants from the Natural Sciences and Engineering Research Council (NSERC) of Canada to DRB and DAM, by the Life Sciences Foundation (SLW), which is subsidized by the Netherlands Organization for Scientific Research (NWO), grant 80533.193, and by the Leids Universiteits Fonds to MvV. Many thanks to Marc Green, Eric Hoberg, Peter Hovenkamp, Greg Klassen, Diedel Kornet, Bruce Lieberman, Rick Mayden, Bob Murphy, Fredrik Ronquist, Hubert Turner, Ed Wiley, Rick Winterbottom, and Rino Zandee for thought-provoking discussions on conceptual and methodological issues.

REFERENCES

Brooks, D.R. (1981) Hennig's parasitological method: a proposed solution. Systematic Zoology, 30, 229–249.
Brooks, D.R. (1985) Historical ecology: a new approach to studying the evolution of ecological associations. Annals of the Missouri Botanical Garden, 72, 660–680.
Brooks, D.R. (1990) Parsimony analysis in historical biogeography and coevolution: methodological and theoretical update. Systematic Zoology, 39, 14–30.
Brooks, D.R. & McLennan, D.A. (1991) Phylogeny, Ecology and Behavior. A Research Program in Comparative Biology. University of Chicago Press, Chicago.
Brooks, D.R. & McLennan, D.A. (1999) Species: Turning a conundrum into a research program. Journal of Nematology, 31, 117–133.
Cracraft, J. (1983a) Species concepts and speciation analysis. Current Ornithology, Vol 1 (ed. F. Johnston), pp. 159–187. Plenum Press, New York.
Cracraft, J. (1987) Species concepts and the ontology of evolution. Biology and Philosophy, 2, 329–346.
Cracraft, J. (1989a) Speciation and its ontology: the empirical consequences of alternative species concepts for understanding patterns and processes of speciation. Speciation and its Consequences (eds D. Otte and J.A. Endler), pp. 28–59. Sinauer Assoc, Sunderland, MA.
Cracraft, J. (1989b) Species as entities in biological theory. What the Philosophy of Biology Is – Essays for David Hull (ed. M. Ruse), pp. 33–54. Kluwer Acad. Publishers, New York.
Cressey, R.F., Collette, B. & Russo, J. (1983) Copepods and scombrid fishes: a study in host-parasite relationships. Fisheries Bulletin, 81, 227–265.
Donoghue, M.J. (1985) A critique of the Biological Species Concept, and recommendation for a phylogenetic alternative. Bryologist, 88, 172–181.
Eldredge, N. & Cracraft, J. (1980) Phylogenetic Patterns and the Evolutionary Process. Columbia University Press, New York.

Enghoff, H. (1996) Widespread taxa, sympatry, dispersal, and an algorithm for resolved area cladograms. Cladistics, 12, 349–364.
Frost, D.R. & Kluge, A.G. (1994) A consideration of epistemology in systematic biology, with special reference to species. Cladistics, 10, 259–293.
Gaffney, E. (1979) An introduction to the logic of phylogeny reconstruction. Phylogenetic Analysis and Paleontology (eds J. Cracraft and N. Eldredge), pp. 79–111. Columbia University Press, New York.
Ghiselin, M.T. (1974) A radical solution to the species problem. Systematic Zoology, 23, 536–544.
Hennig, W. (1966) Phylogenetic systematics. University of Illinois Press, Urbana.
Hovenkamp, P. (1997) Vicariance events, not areas, should be used in biogeographic analysis. Cladistics, 13, 67–79.
Hull, D.L. (1976) Are species really individuals? Systematic Zoology, 25, 174–191.
Hull, D.L. (1978) A matter of individuality. Philosophy of Science, 45, 335–360.
Hull, D.L. (1980) Individuality and selection. Annual Review of Ecology and Systematics, 11, 311–332.
Kluge, A.G. (1988) Parsimony in vicariance biogeography: a quantitative method and a Greater Antillean example. Systematic Zoology, 37, 315–328.
Kluge, A.G. (1989b) A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Systematic Zoology, 38, 7–26.
Kluge, A.G. (1997) Testability and the refutation and corroboration of cladistic hypotheses. Cladistics, 13, 81–96.
Kluge, A.G. (1998a) Total evidence or taxonomic congruence: cladistics or consensus classification. Cladistics, 14, 151–158.
Kluge, A.G. (1998b) Sophisticated falsification and research cycles: consequences for differential character weighting in phylogenetic systematics. Zoologica Scripta, 26, 349–360.
Löther, R. (1990) Species and Monophyletic Taxa as Individual Substantial Systems. The Plant Diversity of Malesia (ed. P. Baas et al.), pp. 371–378. Kluwer Academic Publishers, The Hague.
Mayden, R.L. (1997) A hierarchy of species concepts: the denouement in the saga of the species problem. Species: the Units of Biodiversity (eds M.F. Claridge, H.A. Dawah and M.R. Wilson), pp. 381–424. Chapman & Hall, London.
Mayden, R.L. (1999) Consilience and a hierarchy of species concepts: advances toward closure on the species puzzle. Journal of Nematology, 31, 95–116.
Mayden, R.L. & Wood, R.M. (1995) Systematics, species concepts, and the evolutionarily significant unit in biodiversity and conservation biology. American Fisheries Society Symposium, 17, 58–113.
McKitrick, M.C. & Zink, R.M. (1988) Species concepts in ornithology. Condor, 90, 1–14.
Mishler, B.D. & Donoghue, M.J. (1982) Species concepts: a case for pluralism. Systematic Zoology, 31, 491–503.
Morrone, J.J. & Carpenter, J.M. (1994) In search of a method for cladistic biogeography: an empirical comparison of component analysis, Brooks Parsimony analysis, and three-area statements. Cladistics, 10, 99–153.
Morrone, J.J. & Crisci, J.V. (1995) Historical Biogeography: introduction to methods. Annual Review of Ecology and Systematics, 26, 373–401.


Nelson, G. & Ladiges, P.Y. (1991a) Standard assumptions for biogeographic analysis. Australian Systematic Botany, 4, 41–58.
Nelson, G. & Ladiges, P.Y. (1991b) Three-area statements: standard assumptions for biogeographic analysis. Systematic Zoology, 40, 470–485.
Nelson, G. & Ladiges, P.Y. (1996) Paralogy in cladistic biogeography and analysis of paralogy-free subtrees. American Museum Novitates, May 1996 (3167), 1–58.
Nelson, G. & Platnick, N.I. (1981) Systematics and Biogeography. Cladistics and Vicariance. Columbia University Press, New York.
O'Grady, R.T. & Deets, G.B. (1987) Coding multistate characters, with special reference to the use of parasites as characters of their hosts. Systematic Zoology, 28, 1–21.
O'Hara, R.J. (1994) Evolutionary history and the species problem. American Zoologist, 34, 12–22.
Page, R.D.M. (1990) Component 1.5. Program and user's manual. University of Auckland, Auckland.
Page, R.D.M. & Charleston, M.A. (1998) Trees within trees: phylogeny and historical associations. Trends in Ecology and Evolution, 13, 356–359.
Platnick, N.I. & Gaffney, E.S. (1977) Systematics: a Popperian perspective. (reviews of 'The Logic of Scientific Discovery' and 'Conjectures and Refutations' by Karl R. Popper). Systematic Zoology, 26, 360–365.
Platnick, N.I. & Gaffney, E.S. (1978a) Evolutionary Biology: a Popperian perspective. Systematic Zoology, 27, 137–141.
Platnick, N.I. & Gaffney, E.S. (1978b) Systematics and the Popperian paradigm. Systematic Zoology, 27, 381–388.
Popper, K.R. (1960) The Poverty of Historicism. Routledge and Kegan Paul, London.
Popper, K.R. (1968a) The Logic of Scientific Discovery. Harper & Row, New York.
Popper, K.R. (1968b) Conjectures and Refutations. Harper & Row, New York.
Popper, K.R. (1972) Objective Knowledge. An Evolutionary Approach. Clarendon Press, Oxford.
Popper, K.R. (1976) Unended Quest. An Intellectual Autobiography. Open Court Publishers Co., La Salle, IL.
Popper, K.R. (1992) Realism and the Aim of Science. Routledge, London.
Van Veller, M.G.P., Zandee, M. & Kornet, D.J. (1999) Two requirements for obtaining valid common patterns under different assumptions in vicariance biogeography. Cladistics, 15, 393–406.
Van Veller, M.G.P. & Brooks, D.R. (in press) When simplicity is not parsimonious: a priori and a posteriori approaches in historical biogeography. Journal of Biogeography.
Wiley, E.O. (1975) Karl R. Popper, systematics, and classification: a reply to Walter Bock and other evolutionary taxonomists. Systematic Zoology, 24, 233–242.
Wiley, E.O. (1978) The evolutionary species concept revisited. Systematic Zoology, 27, 17–26.
Wiley, E.O. (1980a) The metaphysics of individuality and its consequences for systematic biology. Brain and Behavior Sciences, 4, 302–303.

Wiley, E.O. (1980b) Is the evolutionary species concept fiction? A consideration of classes, individuals, and historical entities. Systematic Zoology, 29, 76–80.
Wiley, E.O. (1981) Phylogenetics: the Theory and Practice of Phylogenetic Systematics. Wiley-Intersci., New York.
Wiley, E.O. (1986) Methods in vicariance biogeography. Systematics and Evolution: a Matter of Diversity (ed. P. Hovenkamp), pp. 283–306. University of Utrecht Press, Utrecht.
Wiley, E.O. (1988a) Vicariance biogeography. Annual Review of Ecology and Systematics, 19, 513–542.
Wiley, E.O. (1988b) Parsimony analysis and vicariance biogeography. Systematic Zoology, 37, 271–290.
Wiley, E.O. (1989) Kinds, Individuals and Theories. What the Philosophy of Biology Is – Essays for David Hull (ed. M. Ruse), pp. 289–300. Kluwer Academic Publishers, Dordrecht.
Wiley, E.O. & Mayden, R.L. (1985) Species and speciation in phylogenetic systematics, with examples from the North American fish fauna. Annals of the Missouri Botanical Garden, 71, 596–635.
Wiley, E.O., Siegel-Causey, D., Brooks, D.R. & Funk, V.A. (1991) The Compleat Cladist: A Primer of Phylogenetic Procedures. 1st edn. University of Kansas Mus. Nat Hist Press, Lawrence, Kansas.
Wiley, E.O., Siegel-Causey, D., Brooks, D.R. & Funk, V.A. (in press) The Compleat Cladist: A Primer of Phylogenetic Procedures. 2nd edn. University of Kansas Mus. Nat Hist Press, Lawrence, Kansas.

BIOSKETCHES

Daniel Brooks is Professor of Zoology in the University of Toronto. For almost 30 years, he has studied the interplay between phylogeny and biogeography in the evolution of parasite–host communities, primarily in Latin America.

Marco van Veller received his PhD from Leiden University in 2000, working on the theoretical foundations of methods of historical biogeographic analysis. He has recently received a permanent appointment with the Netherlands National Institute of Statistics.

Deborah McLennan is Associate Professor of Zoology in the University of Toronto. Her primary research focus is experimental studies using phylogenetic information to help disentangle origins and maintenance in the evolution of animal communication systems, particularly courtship behaviour in fishes. She is co-author, with Daniel Brooks, of Phylogeny, Ecology and Behaviour (University of Chicago Press, 1991) and The Nature of Diversity: An Evolutionary Voyage of Discovery (University of Chicago Press, in press).
