Intron loss mediated structural dynamics and functional differentiation ...

Genes & Genomics (2010) 32: 570-577 DOI 10.1007/s13258-010-0076-8

RESEARCH ARTICLE

Intron loss mediated structural dynamics and functional differentiation of the polygalacturonase gene family in land plants

Kyong-Cheul Park · Soon-Jae Kwon · Nam-Soo Kim

Received: 23 June 2010 / Accepted: 24 August 2010 / Published online: 31 December 2010 © The Genetics Society of Korea and Springer 2010

Abstract The polygalacturonase (PG) gene family is one of the largest gene families in plants. PGs are involved in various steps of plant development. The evolutionary processes accounting for the functional divergence and the specialized functions of PGs in land plants are unclear. Complete sets of PG genes were retrieved from the genome web sites of model organisms among algae and land plants. The number of PG genes expanded in a lineage-specific manner along with the biological complexity of the organism. Differentiation of PGs followed the phylogenetic hierarchy, with rhamno-PGs present from algae to plants, endo- and exo-PGs in land plants, and exo-PGs in flowering plants. Gene structure analysis revealed that land plant PG genes resulted from differential intron gain and loss, with the latter predominating. Differential intron losses partitioned the PGs into separate clades that are expressed differentially during plant development. Intron position and phase were not conserved between the PGs of algae and land plants but were conserved among the PG genes of land plants from moss to vascular plants, indicating that the current introns in the PGs of land plants appeared after the split between unicellular algae and multicellular land plants. The results demonstrate that the functional divergence and differentiation of PGs in land plants are attributable to intron losses.

Keywords Polygalacturonase; Gene family; Intron loss/gain; Land plants

K. C. Park · N. S. Kim
Department of Molecular Biosciences, Institute of Biosciences and Biotechnology, Kangwon National University, Chuncheon 200-701, Korea
e-mail: [email protected]

S. J. Kwon
US Department of Agriculture-Agricultural Research Service, Western Regional Plant Introduction Station, 59 Johnson Hall, Washington State University, Pullman WA 99164, USA

Introduction

Gene families arise from a common ancestral gene by gene duplication. Purifying selection maintains the functional redundancy of the duplicates, while the accumulation of evolutionarily neutral or loss-of-function mutations erodes the functional redundancy of paralogs (Prince and Pickett, 2002). Alternatively, the duplicated genes can retain the original function through the duplication-degeneration-complementation process (Force et al., 1999). The duplicated genes are scattered throughout the genome via local, regional or global duplication (Prince and Pickett, 2002). If the duplication predated the divergence of phylogenetic lineages, expansion of the number of members from an ancient paralog is followed by divergence of gene structure within the gene family (Boudet et al., 2001).

Gene interruption by introns is a characteristic feature of eukaryotic genes. Introns are subject to relatively little selective pressure, resulting in rapid changes in their size and sequence. Nevertheless, intron and exon structures are often highly conserved, with correspondence of intron position and phase being noted in orthologs (Lecharny et al., 2003; Rogozin et al., 2003). Genomics studies have demonstrated biological functions of introns, such as maximizing the coding capacity by alternative splicing (Ast, 2004), translation control by nonsense-mediated decay (NMD) (Jaillon et al., 2008), and creation of genetic diversity by recombination between intron sequences (Roy and Gilbert, 2006). The origin of introns has not been clearly defined and has been intensely debated between the 'introns-early' (Blake, 1978; Doolittle, 1978; Gilbert, 1987; Gilbert, 1978) and 'introns-late' (Cavalier-Smith, 1991; Sharp, 1991) views. Since neither theory can satisfactorily explain the diversity of modern genes, which carry both ancient and new introns, a new concept merging the two into a synthetic theory of intron evolution was proposed (de Souza, 2003).
Since intron loss or gain is a rare event, comparison of gene structures among gene families has been used to classify paralogs into subfamilies (Park et al., 2008). Divergence of paralogous gene pairs may have been prompted by retention of different cassettes of exons via intron loss/gain under selective pressure (Babenko et al., 2004; Jeffares et al., 2006; Lecharny et al., 2003).

Polygalacturonase (PG; EC 3.2.1.15) is a pectin-digesting enzyme. Pectins, a class of complex polysaccharides, confer rigidity on plant cells by cementing the cellulosic network. Since the rigid cell wall must be softened before cell expansion and division can occur, PG is involved in various plant developmental steps such as fruit ripening, organ abscission, and pollen grain maturation (Hadfield and Bennett, 1998; Hadfield et al., 1998; Kalaitzis et al., 1997). The PG genes are a typical gene family in plants (Markovic and Janecek, 2001). Gene structure analysis revealed that PG gene structures within clades of the phylogenetic tree are highly preserved between monocotyledonous and dicotyledonous plants (Park et al., 2008).

The availability of whole genome sequences in model species has allowed retrieval of the nucleotide sequences of the complete sets of copies in gene families (Kong et al., 2007; Nicole et al., 2006; Park et al., 2008). Here, we investigated the phylogenetic relationships, tandem and segmental duplications, expression, and gene structure dynamics in the complete sets of PGs in Oryza sativa (rice), Arabidopsis thaliana (thale cress), Populus trichocarpa (poplar), Selaginella moellendorffii (spikemoss, lycophyte), Physcomitrella patens (moss, bryophyte), Chlamydomonas reinhardtii (green alga), Phaeodactylum tricornutum (diatom), and Aureococcus anophagefferens (brown alga).


Materials and Methods

Isolation of polygalacturonase sequences. PG sequences were retrieved from species-specific genome databases using either polygalacturonase or glycosyl hydrolase family 28 as a query. Databases were accessed at http://www.tigr.org/tdb/e2k1/osa1/GeneNameSearch.shtml for rice, http://genome.jgi-psf.org/Poptr1_1/Poptr1_1.home.html for poplar, http://genome.jgi-psf.org/Phypa1_1/Phypa1_1.home.html for moss, http://genome.jgi-psf.org/selmol/selmol.home.html for spikemoss, http://genome.jgi-psf.org/Auran1/Auran1.home.html for A. anophagefferens, http://genome.jgi-psf.org/Phatr2/Phatr2.home.html for P. tricornutum, and http://genome.jgi-psf.org/Chlre3/Chlre3.home.html for C. reinhardtii. The sequences of PG genes from Arabidopsis were from Kim et al. (2006). PG gene sequences from Cryptomeria japonica (cedar) and Erwinia carotovora were identified from the NCBI (http://www.ncbi.nlm.nih.gov) using the BLASTP program with the sequence of the glycosyl hydrolase family 28 domain as the query sequence.

Phylogenetic analysis. Multiple alignment of the glycoside hydrolase family 28 domain sequences of the 225 PGs was performed using the MAFFT program (http://align.bmr.kyushu-u.ac.jp/mafft/online/server/), and gaps in the aligned sequences were edited using MEGA4 software (http://megasoftware.net). The phylogenetic tree was constructed with MAFFT using the neighbor-joining method with the JTT model and 100 bootstrap repetitions. The tree was displayed using TreeView Version 1.6.6 (http://taxonomy.zoology.gla.ac.uk/rod/treeview.html) from the MAFFT tree raw data.

Duplication and collinear gene block analyses. Genomic locations of the PG genes were obtained from the websites of the individual species. For segmental duplication analysis, ten genes in the vicinity of each PG gene on the chromosome or scaffold were compared pairwise with the genes in the vicinity of other PG genes in the same phylogenetic clade. The information in the duplication database (http://www.plantgenome.uga.edu) was used to assess segmental duplication and gene order collinearity among the genomes of rice, poplar, and Arabidopsis.

Gene structure analysis. GeneView (http://www.gramene.org/, http://genome.jgi-psf.org) was used to retrieve PG nucleotide sequences and to identify introns and exons. Gene structures of PGs in the same clade were phylogenetically analyzed. Consensus functional domain sequences in the PGs of each clade were identified using CLUSTAL W (http://www.ebi.ac.uk/clustalw/), and the results were viewed and edited using T-View (http://www.ebi.ac.uk/t-coffee/help.html). For intron position analysis, the protein sequences of each clade were aligned using CLUSTAL W (http://www.ebi.ac.uk/clustalw/), and the intron locations were identified manually. Intron phases were determined manually from the exon information: phase 0 designated introns located between codons, phase 1 designated introns between the first and second nucleotides of a codon, and phase 2 designated introns between the second and third nucleotides of a codon.
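The phase and position rules above were applied manually in this study. As an illustration only, the same bookkeeping can be sketched in a few lines of Python, assuming that the coding length of every exon is known (UTRs excluded) and that one row of the protein alignment is available for each gene. The function names, the toy sequences, and the exon lengths below are hypothetical and do not come from the PG data set.

```python
from itertools import accumulate


def intron_phases(cds_exon_lengths):
    """Phase (0, 1 or 2) of each intron, given coding exon lengths 5' to 3'.

    The phase is the number of coding nucleotides upstream of the intron,
    modulo 3. Lengths must count coding bases only (no UTR).
    """
    upstream = list(accumulate(cds_exon_lengths))[:-1]
    return [n % 3 for n in upstream]


def residues_upstream(cds_exon_lengths):
    """Number of complete codons (residues) upstream of each intron."""
    upstream = list(accumulate(cds_exon_lengths))[:-1]
    return [n // 3 for n in upstream]


def alignment_column(aligned_protein, n_residues, gap="-"):
    """0-based alignment column of the residue that follows n_residues residues.

    For a phase 0 intron this is the first residue downstream of the intron;
    for phase 1 or 2 it is the residue whose codon the intron interrupts.
    """
    seen = 0
    for column, symbol in enumerate(aligned_protein):
        if symbol == gap:
            continue
        if seen == n_residues:
            return column
        seen += 1
    return len(aligned_protein)


def intron_sites(gene):
    """(alignment column, phase) for every intron of a gene."""
    phases = intron_phases(gene["exons"])
    counts = residues_upstream(gene["exons"])
    return [(alignment_column(gene["aligned"], c), p) for c, p in zip(counts, phases)]


# Toy data (assumed, not from the paper): two aligned proteins and the
# coding lengths of their exons.
gene_a = {"aligned": "MK-TLV", "exons": [6, 9]}
gene_b = {"aligned": "MKATLV", "exons": [9, 9]}

# Introns count as corresponding when they share both column and phase.
shared = set(intron_sites(gene_a)) & set(intron_sites(gene_b))
print(shared)
```

Working in alignment columns rather than raw residue numbers is what lets an intron in a gene with an insertion (gene_b here) be matched to its counterpart in a gene without one.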


Results and Discussion

Phylogenetic analysis of the PG genes.

The number of PG genes in each species increased with the biological complexity of the organism, from one to four copies in single-celled algae to 44 to 68 copies in flowering plants. The PGs were grouped into six clades (clades A-F) based on their amino acid sequence similarity (Fig. 1). However, a sub-clade consisting of four spikemoss PGs and one moss PG was out-grouped from clades A and B. While clade E contained PGs from algae to flowering plants, clades A and B contained PGs of all land plants. Clades C, D, and F consisted only of PGs from flowering plants (Fig. 2). PGs have been classified as endo-PGs, exo-PGs, and rhamno-PGs, as previously described (Hadfield and Bennett, 1998; Markovic and Janecek, 2001). In our analysis, the PGs were grouped as endo-PGs in clades A and B, exo-PGs in clades C and D, and rhamno-PGs in clade E. All algal PGs and 8 of the 11 moss PGs were assigned to clade E. Of the remaining moss PGs, one was assigned to clade A and another to clade B. One moss PG (PPP116593) was out-grouped from clades A-D and F, but not from clade E, indicating that endo- and exo-PGs appeared after multicellular land plants split from unicellular algae about 400 million years ago (Mya) (Rensing et al., 2008; Willis and McElwain, 2002). Clades C, D, and F did not contain moss PGs, which implies that the PGs (exo-PGs) in these clades expanded after the divergence of vascular plants from non-vascular bryophytes (e.g., moss), which was dated to about 200 Mya (Willis and McElwain, 2002). Clade F members could not be clearly defined as either endo- or exo-PGs. The six clades were further classified into 20 sub-clades (Fig. 1). Six of these sub-clades consisted solely of PGs from either eudicots (C-I, C-II, D-II, D-III) or monocots (C-III, D-I). Thus, these PGs might have resulted from lineage-specific expansion after the split of monocotyledonous plants from their dicotyledonous counterparts about 140-150 Mya (Chaw et al., 2004; Lespinet et al., 2002).

Expression of the PGs was surveyed by analyzing expressed sequence tags (ESTs) in NCBI, TIGR, or species-specific genome databases (Table S1). The PGs exhibited highly redundant expression in the same tissues, as shown in previous studies (Kim et al., 2006; Torki et al., 2000), and ESTs from different tissues showed sequence identity in multiple cases, in agreement with RT-PCR analysis in A. thaliana (Kim et al., 2006). Among plant PGs, the rhamno-PGs in clade E were the most widely expressed, matching ESTs from multiple tissues. In accord with previous findings (Hadfield and Bennett, 1998; Kim et al., 2006), PGs in clade C were prominently expressed in floral organs, suggesting that they had undergone functional divergence.

Tandem and segmental duplications of the PG genes.

No tandemly duplicated PG genes were present in algae and moss, but nine of the moss PG genes were segmentally duplicated (Fig. S1). In flowering plants, the percentage of PG genes involved in tandem or segmental duplication ranged from 45.5% in rice and 46.2% in poplar to 57% in A. thaliana. This rate of segmental and tandem duplication suggests that gene family expansion occurred via whole-genome duplication (WGD) (Horan et al., 2005; Kim et al., 2006; Nicole et al., 2006). WGD and subsequent tandem, local, and regional duplications can distribute gene family members throughout the genome (Prince and Pickett, 2002; Soltis, 2005). WGD has occurred at least once in rice (Paterson et al., 2004) and three times in Arabidopsis and poplar (Bowers et al., 2003; Tuskan et al., 2006). In Arabidopsis, many PG genes were derived from large-scale duplication, and large numbers of these genes appear to have been lost through gene death (Kim et al., 2006; Kong et al., 2007). Homologous segmental duplication blocks containing PGs were identified using the plant genome duplication database (http://chibba.agtec.uga.edu/duplication/) for rice, A. thaliana, and poplar (Fig. S2). For moss, the pairwise homology of the PG-flanking genes with those in vascular plants was determined using the CLUSTAL W program. The number of intergenomic segmentally duplicated PG genes shared between the two dicot plants (i.e., A. thaliana vs. poplar) was almost 3-fold greater than the number shared between dicots and monocots (i.e., poplar vs. rice or A. thaliana vs. rice), suggesting that multiple WGDs and subsequent gene loss occurred after the split between A. thaliana and poplar. Five homologous collinear gene blocks were found among A. thaliana, poplar, and rice.
While most of the intergenomic segmentally duplicated PG genes were in the same phylogenetic clade, 11 sets of them were differentiated into separate clades, implying that structural and sequence divergence of the PGs occurred after speciation. The genes in sub-clade E-I are notable for having almost identical gene structures and for sharing the order of the genes surrounding the PGs among moss, Arabidopsis, poplar, and rice (Fig. 3). However, gene structure conservation and gene order collinearity did not extend to the C. reinhardtii PG, Chlr73470. One PG from Aspergillus oryzae was previously grouped with plant PGs in this sub-clade with a high bootstrap value, but showed a different gene structure (Park et al., 2008). Therefore, the gene order collinearity and PG gene structures in sub-clade E-I of land plants must have been conserved since their first appearance about 400 Mya (Willis and McElwain, 2002).
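The comparison of genes flanking two PG loci can be illustrated with a small sketch. It assumes that each flanking gene has already been assigned a homology label (in this study, homology was assessed with the duplication database or CLUSTAL W); the labels, window contents, and function names below are hypothetical, and the windows are shorter than the ten genes examined per locus. Conserved gene order is approximated here by the longest common subsequence of the two windows, checked in both orientations, which is one possible measure and not necessarily the criterion used by the authors.

```python
def shared_families(flank_a, flank_b):
    """Homology labels found in both windows, in the order seen in flank_a."""
    in_b = set(flank_b)
    return [family for family in flank_a if family in in_b]


def lcs_length(seq_a, seq_b):
    """Length of the longest common subsequence (dynamic programming)."""
    rows, cols = len(seq_a), len(seq_b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if seq_a[i] == seq_b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    return table[rows][cols]


def conserved_order(flank_a, flank_b):
    """Largest number of genes kept in the same relative order.

    Both orientations of flank_b are tried, since a duplicated segment
    may be inverted relative to its partner.
    """
    return max(lcs_length(flank_a, flank_b), lcs_length(flank_a, flank_b[::-1]))


# Hypothetical windows around two PG genes (labels are placeholders).
rice_window = ["kinase", "F-box", "PG", "transporter", "MYB", "unknown1"]
moss_window = ["F-box", "PG", "unknown2", "transporter", "MYB"]

print(shared_families(rice_window, moss_window))
print(conserved_order(rice_window, moss_window))
```

A pair of loci that shares several flanking families in the same order is a candidate segmental duplicate or collinear block; the threshold for calling one is a judgment left open in this sketch.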


Figure 1. A phylogenetic tree of the PGs of algae and plants. The PGs in each species are designated as AA for A. anophagefferens (brown alga, brown), PT for P. tricornutum (diatom, brown), Chrl for C. reinhardtii (green alga, black), PPP for P. patens (moss, bryophyte, red), SMO for S. moellendorffii (spikemoss, lycophyte, red), POP for P. trichocarpa (poplar, blue), AT for A. thaliana (blue), and OS for O. sativa (rice, green). Ecpeh1 and BAA06172 are the PGs from E. carotovora (bacterium) and C. japonica (cedar), respectively. The numbers on the nodes are bootstrap values; values lower than 50 are not shown.
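For readers unfamiliar with the node labels, a bootstrap value is the percentage of replicate trees in which a given grouping is recovered. The sketch below shows only that counting step and the cutoff of 50 used for display; it assumes each tree has already been reduced to a set of clades, with every clade written as a frozenset of leaf names. The leaf names and replicate trees are invented for illustration, and the tree building itself (performed with MAFFT in this study) is not reproduced.

```python
def bootstrap_support(reference_clades, replicate_trees):
    """Percentage of replicate trees that contain each reference clade."""
    total = len(replicate_trees)
    support = {}
    for clade in reference_clades:
        hits = sum(1 for clades in replicate_trees if clade in clades)
        support[clade] = 100.0 * hits / total
    return support


def displayed(support, cutoff=50.0):
    """Keep only the values that would be printed on the tree."""
    return {clade: pct for clade, pct in support.items() if pct >= cutoff}


# Hypothetical clades of a reference tree with leaves a-d.
ab = frozenset({"a", "b"})
cd = frozenset({"c", "d"})
reference = [ab, cd]

# Four invented bootstrap replicates, each given as its set of clades.
replicates = [
    {ab, cd},
    {ab, frozenset({"b", "c"})},
    {ab, frozenset({"a", "c"})},
    {frozenset({"a", "d"}), frozenset({"b", "c"})},
]

support = bootstrap_support(reference, replicates)
print(displayed(support))
```

With 100 replicates, as used for Figure 1, the count of replicates containing a clade is already its percentage.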


Figure 2. A phylogenetic tree of the PGs and the presence of PG genes in algae and major groups of land plants. The + and - designate presence or absence of the PG genes, respectively. It was not determined whether the PGs in clade F are endo- or exo-PGs.

Figure 3. PG gene structure and collinearity in selected organisms. (A) Structures of PG genes in sub-clade E-I and (B) gene order collinearity in rice, poplar, Arabidopsis, and moss. Orthologous genes are shown in the same color, and levels of sequence identity between genes in the collinear block are indicated with colored lines. The genomic positions of the first and last nucleotides of the genes are shown in (B).

Intron position correspondence of the PG genes.

Intron position and phase correspondence were analyzed among the PG genes within and between clades. Seventeen homologous intron blocks were found among the PG genes of land plants (Table 1 and Table S2). The PG genes of the algal species did not show intron position correspondence with the PG genes of land plants. Among the 17 homologous intron blocks, 15 were present in all land plants, with differential intron losses resulting in the current PG genes. One intron block gain (homologous intron block 1) occurred after divergence of the flowering plants from the non-flowering plants. One intron block loss (homologous intron block 9) occurred after divergence of vascular plants from nonvascular plants. Of the 17 homologous intron blocks, 15 were located within the glycosyl hydrolase 28 domain. In land plants, PG genes contained 13 novel introns, likely from intron gain. No intron phase preference was found among these novel introns (Table S3). The novel introns were more frequently located outside of the domain motifs. BLAST analysis revealed that 8 of the 13 novel introns contained MITEs (miniature inverted-repeat transposable elements). However, MITE-driven intron insertion seemed improbable since MITEs are short and do not have a protein-coding function (Roy, 2004; Wessler et al., 1995).

Table 1. Overview of the gene structures of the land plant PGs

No. of genes                       226
No. of introns                     952
Average no. of introns per gene    4.21
Intron loss                        64~65
Intron gain                        11
Exon loss                          9.5
Exon gain                          4
No. of homologous gene blocks      17
No. of novel introns               13

Remarkably high intron position conservation has been observed among plant, animal, and fungal species over 1.5 billion years (Rogozin et al., 2003). However, our analysis did not reveal any conservation of intron position or phase among algal PG genes or between algal and land plant PG genes. Intron positions were not conserved among the three PG genes found in P. tricornutum in our study, in contrast to the intron position conservation seen among the enslaved nucleomorph-containing alga Bigelowiella natans, C. reinhardtii, and A. thaliana (Gilson et al., 2006). Algae and bryophytes appeared approximately 1000 and 400 Mya, respectively (Merchant et al., 2007; Rensing et al., 2008). The lack of correspondence in intron position among algae as well as between algae and land plants, taken together with the high conservation of intron position among land plants, implies that the introns in current PG genes of land plants appeared after multicellular plants split from unicellular algae approximately 400 Mya.

Gene structure dynamics and evolution of the PG genes.

During evolution, gain and loss of introns from three primary gene structures likely generated the current set of land plant PG genes (Fig. 4). Sequential and differential intron losses from a PG gene with a structure similar to PPP133517 produced all exo- and endo-PG genes (Fig. 4-A). In vascular plants, rhamno-PG genes are derived from two basic gene structures, PPP43415 or PPP163808 (Fig. 4-B). In contrast, rhamno-PG genes in sub-clade E-I did not undergo changes in gene structure during the evolution of land plants (Fig. 4-C). Based on the present PG gene structures (Fig. S3), scenarios for intron gain/loss were deduced (Fig. S4). While intron losses occurred 64-65 times, intron gains occurred only 11 times in the PG genes during land plant evolution. Exon gains and losses occurred 4 and 9.5 times, respectively (Fig. S4). A summary of PG gene structures in land plants (i.e., moss, A. thaliana, poplar, and rice) is provided in Table 1.

Figure 4. Gene structure dynamics of PG genes in land plants. Endo- and exo-PGs are shown in (A), while rhamno-PGs are shown in (B, C). Clade designations are indicated by the color of the gene name, in accordance with the color classification in Fig. 1. The number of nucleotides is shown on the exons (thick bars), and intron phases are shown on the introns (thin lines).

Since intron density among different eukaryotic taxa varies by more than three orders of magnitude (Jeffares et al., 2006; Roy and Gilbert, 2006), the frequency of intron loss and gain during evolution is a subject of debate (Babenko et al., 2004; Knowles and McLysaght, 2006; Lin et al., 2006; Roy and Gilbert, 2006). An extensive study of more than 8,000 orthologs in A. thaliana and O. sativa revealed that intron losses were 12.6 and 9.8 times greater than intron gains in A. thaliana and O. sativa, respectively (Roy and Penny, 2007). Our analysis revealed that intron and exon structures in the PG genes are highly conserved among land plants, but differ from those of algae.

Concluding remarks

PGs are involved in plant development through cell wall modification. PG gene expansion and differentiation evidently followed the pattern of species complexity, with rhamno-PGs present from algae to plants, endo- and exo-PGs in land plants, and exo-PGs in flowering plants. The intron and exon structures of the PG genes are highly conserved among land plants, but differ from those of algae. Differential intron losses divided the PG genes into separate clades that are expressed differentially during plant development, congruent with the phylogenetic classification based on amino acid sequence similarity. Intron losses dominated intron gains in shaping the current PG genes of land plants. Therefore, intron gain and loss may be an important genomic adaptation, resulting in genome-specific intron structures during evolution.

Acknowledgements

This study was funded by a fellowship grant to K.C.P. from the second stage of the BK21 program of the Ministry of Education of Korea.

References

Ast G (2004) How did alternative splicing evolve? Nat. Rev. Genet. 5(10): 773-782.
Babenko VN, Rogozin IB, Mekhedov SL and Koonin EV (2004) Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic Acids Res. 32(12): 3724-3733.
Blake CCF (1978) Do genes-in-pieces imply proteins-in-pieces? Nature 273: 267.
Boudet N, Aubourg S, Toffano-Nioche C, Kreis M and Lecharny A (2001) Evolution of intron/exon structure of DEAD helicase family genes in Arabidopsis, Caenorhabditis, and Drosophila. Genome Res. 11(12): 2101-2114.
Bowers JE, Chapman BA, Rong J and Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422(6930): 433-438.
Cavalier-Smith T (1991) Intron phylogeny: a new hypothesis. Trends Genet. 7(5): 145-148.
Chaw SM, Chang CC, Chen HL and Li WH (2004) Dating the monocot-dicot divergence and the origin of core eudicots using whole chloroplast genomes. J. Mol. Evol. 58(4): 424-441.
de Souza SJ (2003) The emergence of a synthetic theory of intron evolution. Genetica 118(2-3): 117-121.
Doolittle WF (1978) Genes in pieces: were they ever together? Nature 272: 581-582.
Force A, Lynch M, Pickett FB, Amores A, Yan YL and Postlethwait J (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151(4): 1531-1545.
Gilbert W (1978) Why genes in pieces? Nature 271(5645): 501.
Gilbert W (1987) The exon theory of genes. Cold Spring Harb. Symp. Quant. Biol. 52: 901-905.
Gilson PR, Su V, Slamovits CH, Reith ME, Keeling PJ and McFadden GI (2006) Complete nucleotide sequence of the chlorarachniophyte nucleomorph: nature's smallest nucleus. Proc. Natl. Acad. Sci. USA 103(25): 9566-9571.
Hadfield KA and Bennett AB (1998) Polygalacturonases: many genes in search of a function. Plant Physiol. 117(2): 337-343.
Hadfield KA, Rose JK, Yaver DS, Berka RM and Bennett AB (1998) Polygalacturonase gene expression in ripe melon fruit supports a role for polygalacturonase in ripening-associated pectin disassembly. Plant Physiol. 117(2): 363-373.
Horan K, Lauricha J, Bailey-Serres J, Raikhel N and Girke T (2005) Genome cluster database. A sequence family analysis platform for Arabidopsis and rice. Plant Physiol. 138(1): 47-54.
Jaillon O, Bouhouche K, Gout JF, Aury JM, Noel B, Saudemont B, Nowacki M, Serrano V, Porcel BM, Segurens B et al. (2008) Translational control of intron splicing in eukaryotes. Nature 451(7176): 359-362.
Jeffares DC, Mourier T and Penny D (2006) The biology of intron gain and loss. Trends Genet. 22(1): 16-22.
Kalaitzis P, Solomos T and Tucker ML (1997) Three different polygalacturonases are expressed in tomato leaf and flower abscission, each with a different temporal expression pattern. Plant Physiol. 113(4): 1303-1308.
Kim J, Shiu SH, Thoma S, Li WH and Patterson SE (2006) Patterns of expansion and expression divergence in the plant polygalacturonase gene family. Genome Biol. 7(9): R87.
Knowles DG and McLysaght A (2006) High rate of recent intron gain and loss in simultaneously duplicated Arabidopsis genes. Mol. Biol. Evol. 23(8): 1548-1557.
Kong H, Landherr LL, Frohlich MW, Leebens-Mack J, Ma H and dePamphilis CW (2007) Patterns of gene duplication in the plant SKP1 gene family in angiosperms: evidence for multiple mechanisms of rapid gene birth. Plant J. 50(5): 873-885.
Lecharny A, Boudet N, Gy I, Aubourg S and Kreis M (2003) Introns in, introns out in plant gene families: a genomic approach of the dynamics of gene structure. J. Struct. Funct. Genomics 3(1-4): 111-116.
Lespinet O, Wolf YI, Koonin EV and Aravind L (2002) The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res. 12(7): 1048-1059.
Lin H, Zhu W, Silva JC, Gu X and Buell CR (2006) Intron gain and loss in segmentally duplicated genes in rice. Genome Biol. 7(5): R41.
Markovic O and Janecek S (2001) Pectin degrading glycoside hydrolases of family 28: sequence-structural features, specificities and evolution. Protein Eng. 14(9): 615-631.
Merchant SS, Prochnik SE, Vallon O, Harris EH, Karpowicz SJ, Witman GB, Terry A, Salamov A, Fritz-Laylin LK, Marechal-Drouard L et al. (2007) The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318(5848): 245-250.
Nicole MC, Hamel LP, Morency MJ, Beaudoin N, Ellis BE and Seguin A (2006) MAP-ping genomic organization and organ-specific expression profiles of poplar MAP kinases and MAP kinase kinases. BMC Genomics 7: 223.
Park KC, Kwon SJ, Kim PH, Bureau T and Kim NS (2008) Gene structure dynamics and divergence of the polygalacturonase gene family of plants and fungus. Genome 51(1): 30-40.
Paterson AH, Bowers JE and Chapman BA (2004) Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc. Natl. Acad. Sci. USA 101(26): 9903-9908.
Prince VE and Pickett FB (2002) Splitting pairs: the diverging fates of duplicated genes. Nat. Rev. Genet. 3(11): 827-837.
Rensing SA, Lang D, Zimmer AD, Terry A, Salamov A, Shapiro H, Nishiyama T, Perroud PF, Lindquist EA, Kamisugi Y et al. (2008) The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science 319(5859): 64-69.
Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG and Koonin EV (2003) Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr. Biol. 13(17): 1512-1517.
Roy SW (2004) The origin of recent introns: transposons? Genome Biol. 5(12): 251.
Roy SW and Gilbert W (2006) The evolution of spliceosomal introns: patterns, puzzles and progress. Nat. Rev. Genet. 7(3): 211-221.
Roy SW and Penny D (2007) Patterns of intron loss and gain in plants: intron loss-dominated evolution and genome-wide comparison of O. sativa and A. thaliana. Mol. Biol. Evol. 24(1): 171-181.
Sharp PA (1991) "Five easy pieces". Science 254(5032): 663.
Soltis PS (2005) Ancient and recent polyploidy in angiosperms. New Phytol. 166(1): 5-8.
Torki M, Mandaron P, Mache R and Falconet D (2000) Characterization of a ubiquitous expressed gene family encoding polygalacturonase in Arabidopsis thaliana. Gene 242(1-2): 427-436.
Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A et al. (2006) The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313(5793): 1596-1604.
Wessler SR, Bureau TE and White SE (1995) LTR-retrotransposons and MITEs: important players in the evolution of plant genomes. Curr. Opin. Genet. Dev. 5(6): 814-821.
Willis KJ and McElwain JC (2002) The Evolution of Plants. Oxford University Press.