
Epistatic Interactions between Opaque2 Transcriptional Activator and Its Target Gene CyPPDK1 Control Kernel Trait Variation in Maize [1][C][W][OA]

Domenica Manicacci*, Letizia Camus-Kulandaivelu, Marie Fourmann, Chantal Arar, Stéphanie Barrault, Agnès Rousselet, Noël Feminias, Luciano Consoli, Lisa Francès, Valérie Méchin, Alain Murigneux, Jean-Louis Prioul, Alain Charcosset, and Catherine Damerval

University Paris-Sud, UMR 0320/UMR 8120 Génétique Végétale, F–91190 Gif sur Yvette, France (D.M.); INRA, UMR 0320/UMR 8120 Génétique Végétale, F–91190 Gif sur Yvette, France (L.C.-K., S.B., A.R., A.C.); Groupe Biogemma, ZI du Brézet, F–63028 Clermont-Ferrand cedex 2, France (M.F., C.A., L.C., L.F., A.M.); Groupe Limagrain, Domaine de Mons, F–63200 Aubiat, France (N.F.); INRA, UMR 0206 Chimie Biologique, F–78850 Thiverval-Grignon, France (V.M.); University Paris-Sud, UMR 8618 Institut de Biotechnologie des Plantes, F–91405 Orsay, France (J.-L.P.); and CNRS, UMR 0320/UMR 8120 Génétique Végétale, F–91190 Gif sur Yvette, France (C.D.)

Association genetics is a powerful method to track gene polymorphisms responsible for phenotypic variation, since it takes advantage of existing collections and historical recombination to study the correlation between large genetic diversity and phenotypic variation. We used a collection of 375 maize (Zea mays ssp. mays) inbred lines representative of tropical, American, and European diversity, previously characterized for genome-wide neutral markers and population structure, to investigate the roles of two functionally related candidate genes, Opaque2 and CyPPDK1, on kernel quality traits. Opaque2 encodes a basic leucine zipper transcriptional activator specifically expressed during endosperm development that controls the transcription of many target genes, including CyPPDK1, which encodes a cytosolic pyruvate orthophosphate dikinase. Using statistical models that correct for population structure and individual kinship, Opaque2 polymorphism was found to be strongly associated with variation of the essential amino acid lysine. This effect could be due to the direct role of Opaque2 on either zein transcription, zeins being major storage proteins devoid of lysine, or lysine degradation through the activation of lysine ketoglutarate reductase. Moreover, we found that a polymorphism in the Opaque2 coding sequence and several polymorphisms in the CyPPDK1 promoter nonadditively interact to modify both lysine content and the protein-versus-starch balance, thus revealing the role in quantitative variation in plants of epistatic interactions between a transcriptional activator and one of its target genes.

[1] This work was supported by Genoplante programs.
* Corresponding author; e-mail [email protected].
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Domenica Manicacci ([email protected]).
[C] Some figures in this article are displayed in color online but in black and white in the print edition.
[W] The online version of this article contains Web-only data.
[OA] Open Access articles can be viewed online without a subscription.
www.plantphysiol.org/cgi/doi/10.1104/pp.108.131888

Plant Physiology, May 2009, Vol. 150, pp. 506–520, www.plantphysiol.org © 2009 American Society of Plant Biologists

Epistatic Control of Maize Kernel Quality

A major concern in molecular population and evolutionary genetics is the dissection of the genetic basis of natural variation of complex traits involved in fitness, adaptation to local environments, and evolvability. Similarly, plant breeders are interested in screening large collections of genetic resources in order to identify haplotypes of interest involved in agronomic trait variation (i.e. chromosome regions, genes, or even causative polymorphisms that could be used in crop genetic improvement programs). During the last decade, a great amount of effort has been devoted to improving molecular marker genotyping and statistical analyses in order to provide both fundamental and applied researchers with efficient tools to address their common interest in genes and networks underlying complex traits.

Since the origin of agriculture, human population growth has generated an increasing demand on plant production. Cereals represent a major part of human and cattle diets in terms of starch and proteins, although they provide unbalanced protein intake, since their most abundant proteins are generally poor in essential amino acids (Young et al., 1998). Maize (Zea mays), one of the main crops in the world, accumulates a large amount of starch (75%–80% of endosperm dry matter at maturity) and storage proteins (12%–15%, mainly zeins) in its kernel endosperm. Such resources are accumulated to support seed germination and initial seedling growth, which are major traits in both agronomical and natural conditions. Many kernel mutants are available in maize, indicating that endosperm development is driven by the coordinated expression of several hundred genes (Neuffer and Sheridan, 1980; Scanlon et al., 1994; Verza et al., 2005; Liu et al., 2008). Among nonmutant maize, phenotypic variation has largely been observed for traits such as yield, endosperm content in lipids, proteins, and starch, and protein quality. So far, few genes and alleles that control such natural variation among cultivated maize have been identified.

Physiological characterization of maize kernel development showed that starch accumulates from 12 to 35 d after pollination, while accumulation of storage proteins, mainly zeins, begins at 10 to 15 d after pollination and continues until maturity. Recently, the first proteomic analysis of maize kernel development (Méchin et al., 2007) led to the hypothesis that pyruvate orthophosphate dikinase (PPDK) plays a crucial role in protein-versus-starch balance in the kernel. In C4 plants such as maize, PPDK is abundant in leaves and has been shown to play a major role in photosynthesis (Burnell and Hatch, 1986). High PPDK activity has also been detected in maize endosperm as well as in non-C4 cereal seeds, where its function has not yet been clearly established. In nonphotosynthetic organs, PPDK catalyzes the reversible conversion of pyruvate, inorganic phosphate, and ATP into phosphoenolpyruvate (PEP), inorganic pyrophosphate (PPi), and AMP. Méchin et al. (2007) observed that this protein drastically increases in quantity during grain development from 21 d after pollination onward (i.e. at the time when starch accumulation is slowing but protein accumulation is still ongoing). In the endosperm cytosol, PPDK may be involved both in amino acid synthesis through PEP, a precursor of aromatic amino acids, and in starch metabolism through PPi, which inhibits ADP-Glc synthesis, thus limiting the first step of starch synthesis. An increase in PPDK activity in the endosperm cytosol could thus induce a reduction in starch accumulation as well as an increase in protein content.
In the maize kernel, the great majority of PPDK enzyme is cytosolic and encoded by a specific gene (hereafter designated CyPPDK1; Sheen, 1991) whose transcription has been shown to be up-regulated by Opaque2 (Gallusci et al., 1996) through binding to specific sequences in the CyPPDK1 promoter (Maddaloni et al., 1996). Opaque2, which encodes a basic Leu zipper transcriptional activator (Hartings et al., 1989; Schmidt et al., 1990), is specifically expressed during endosperm development (Gallusci et al., 1994) and controls the transcription of many target genes, including the 14- and 22-kD zein genes, zeins constituting the most abundant storage proteins in the kernel (Schmidt et al., 1992; Cord Neto et al., 1995), the Lys-ketoglutarate reductase/saccharopine dehydrogenase that catalyzes Lys catabolism (Kemper et al., 1999), and CyPPDK1. The opaque2 mutant possesses a soft, starchy endosperm and a significant increase in Lys and Trp content, these essential amino acids being absent from zeins (Landry et al., 2002). As a result, Opaque2 may play a complex role in the accumulation of storage compounds during kernel development through direct (zein synthesis activation, Lys catabolism) and indirect (CyPPDK1 activation) effects.

While physiological studies are aimed at understanding the role of Opaque2 and CyPPDK1 maize mutations, nothing is known about the role of nonmutant allele diversity in maize phenotypic variation. To date, molecular approaches in genetically dissecting kernel quality traits in maize have focused on quantitative trait locus (QTL) mapping. Some of these studies highlighted the chromosome bin 6.05, where CyPPDK1 is located (Veldboom and Lee, 1994; Austin and Lee, 1998; Melchinger et al., 1998; Hirel et al., 2001; Ho et al., 2002; Blanc et al., 2006). By contrast, we found no report of QTLs for yield, kernel number, or weight on chromosome bin 7.01, which encompasses the Opaque2 gene. Very few QTL studies addressed phenotypic traits less complex than yield, such as kernel quality or composition, likely because these traits are more difficult to characterize in large populations. QTLs for total protein content have been reported on chromosome bins 6.05 (Melchinger et al., 1998) and 7.02 (C. Damerval, personal communication), and QTLs for carbohydrate content have been found by Thevenot et al. (2005) on bins 6.05 (Suc content) and 7.01 (Fru and Glc contents). In a partial diallel design, Lou et al. (2005) found a strong effect of Opaque2 allelic variation on Lys and protein contents among F1 hybrids derived from 15 parents. In the closely related species Sorghum bicolor, Rami et al. (1998) reported QTLs for amylose content, a component of starch, and kernel hardness in a region syntenic to the Opaque2 region, but no QTL was found close to CyPPDK1. Overall, these physiological and genetic studies make both CyPPDK1 and Opaque2 interesting candidate genes for kernel phenotypic variation. Yet whether their natural sequence variation among diverse maize lines affects kernel quality traits remains an open question.

The limits of QTL mapping approaches in identifying genes involved in natural phenotypic variation are 2-fold. First, confidence intervals of QTL location are very large, usually encompassing thousands of genes, because of the accumulation of few recombination events during the production of QTL mapping populations. Second, genetic diversity of these populations is low, since they derive from a limited number of parents. Recent decreases in sequencing and genotyping costs, as well as theoretical and statistical developments, led to the emergence of association genetics as an alternative method to identify gene polymorphisms responsible for phenotypic variation. Association genetics takes advantage of existing collections and historical recombination to study the correlation between large genetic diversity and phenotypic variation. Since the pioneering work of Thornsberry et al. (2001), this method has received increasing interest from plant geneticists (Flint-Garcia et al., 2003, 2005; Gupta et al., 2005; Yu and Buckler, 2006) and has proved powerful in associating allelic variation of candidate genes to flowering time in various species such as Arabidopsis (Arabidopsis thaliana; Hagenblad et al., 2004; Olsen et al., 2004; Aranzana et al., 2005), Brassica nigra (Osterberg et al., 2002), pine (Pinus taeda; Gonzalez-Martinez et al., 2007), wheat (Triticum aestivum; Crossa et al., 2007), and maize (Thornsberry et al., 2001; Andersen et al., 2005; Camus-Kulandaivelu et al., 2006; Salvi et al., 2007; Ducrocq et al., 2008). In addition, in maize, polymorphisms from genes involved in biosynthetic pathways were found to be associated with key agronomic traits such as digestibility (Guillet-Claude et al., 2004a, 2004b), kernel composition (Wilson et al., 2004; Harjes et al., 2008), and phenylpropanoid and flavonoid contents (Szalma et al., 2005).

A major concern in association studies is that genetic structure within the sample can generate linkage disequilibrium (LD) between genetically unlinked loci, leading to false-positive associations. Such false-positive associations can be limited by considering both population structure and relatedness among individuals in a mixed-model analysis (Yu et al., 2006). Maize is a particularly suitable species for association genetic studies, since it contains a substantial amount of genetic diversity, local LD rapidly decreases with physical distance, remaining low at about 2 kb on average for a diverse population (Remington et al., 2001), and population structure may easily be characterized through neutral markers (Thornsberry et al., 2001; Liu et al., 2003).

The aim of this study was to test for associations between CyPPDK1 and Opaque2 gene polymorphisms and kernel quality traits. We first characterized the molecular diversity of both candidate genes among a large collection representative of cultivated maize. Second, we characterized this plant collection for kernel traits such as kernel, endosperm, and embryo weights, vitreousness, and kernel composition in starch, lipids, total proteins, individual amino acids, and soluble carbohydrates (Table I), taking advantage of the high-throughput method of near infrared reflectance spectroscopy (NIRS). The maize collection we used was previously genotyped for neutral simple sequence repeat (SSR) markers in order to evaluate population structure (Camus-Kulandaivelu et al., 2006). We additionally estimated pairwise kinship coefficients between inbred lines from the same set of markers. We finally tested for associations between these phenotypic traits and either individual gene polymorphisms or combined polymorphisms from CyPPDK1 and Opaque2 in order to assess possible epistatic interaction between the transcriptional activator and its target gene.

RESULTS

Variation in Kernel Traits

Table I. Phenotypic traits measured in the collection of 375 maize inbred lines

Abbreviation | Method | Definition of the Trait
FT | Direct | Flowering time for male inflorescence, in degree-days
TKW | Direct | Kernel weight (based on 1,000 grains)
KDM | NIRS | Percentage of kernel dry matter
ASH | NIRS | Ashes, in percentage of dry matter
STAR | NIRS | Starch weight, in percentage of dry matter
PROT | NIRS | Total protein weight, in percentage of dry matter
LIP | NIRS | Total lipid weight, in percentage of dry matter
SGL | NIRS | Soluble carbohydrate, in percentage of dry matter
WALL | NIRS | Wall, in percentage of dry matter
AMAMI | NIRS | Amylose-to-amylopectin weight ratio
SATUR | NIRS | Saturated fatty acid, in percentage of dry matter
SACOSE | NIRS | Suc to total sugars, in percentage of dry matter
GLCFRU | NIRS | Glc + Fru to kernel, in percentage of dry matter
THREO | NIRS | Thr to kernel, in percentage of dry matter
METHI | NIRS | Met to kernel, in percentage of dry matter
ISOLEU | NIRS | Ile to kernel, in percentage of dry matter
LEU | NIRS | Leu to kernel, in percentage of dry matter
PHENYL | NIRS | Phe to kernel, in percentage of dry matter
LYSIN | NIRS | Lys to kernel, in percentage of dry matter
SAA | NIRS | Total soluble amino acids to kernel, in percentage of dry matter
KW | NIRS | Kernel weight
EW | NIRS | Endosperm weight
EMB | NIRS | Embryo to kernel, in percentage of dry matter
END | NIRS | Endosperm to kernel, in percentage of dry matter
AVITRO | NIRS | Vitreous endosperm to kernel, in percentage of dry matter
VITRO | NIRS | Vitreous endosperm to total endosperm, in percentage of dry matter
ASP | Calculated | Asp-derived amino acids (THREO + METHI + ISOLEU + LYSIN)
P/S | Calculated | Protein to starch (PROT/STAR)
P/L | Calculated | Protein to lipids (PROT/LIP)
L/S | Calculated | Lipids to starch (LIP/STAR)
PCA1 to PCA8 | PCA | Coordinates on the eight first PCA axes from NIRS traits

Kernel quality traits (Table I) were highly variable among inbred lines and showed very high heritability (Table II). Several of the kernel quality traits we phenotyped were strongly correlated. For instance, we found positive correlation among traits linked to protein content (r > 0.91 among PROT, THREO, METHI, ISOLEU, LEU, PHENYL, SAA, ASP; Table I) except LYSIN, which correlates positively but to a lower extent than protein traits (0.43 < r < 0.66), a negative correlation between starch and protein content (r = -0.74 between STAR and PROT), a positive correlation between embryo size and saturated fatty acid content (r = 0.82 between EMB and SATUR), and a positive correlation between kernel and endosperm weights (r = 0.95 between KW and EW). These traits were synthesized into eight principal component analysis (PCA) axes accounting for more than 95% of the global phenotypic variance (Table II). The first PCA axis explains as much as 39.0% of the total variation and correlates positively to all traits related to protein content (PROT, SAA, THREO, METHI, ISOLEU, LEU, PHENYL; Table I) and negatively to starch content (STAR). The second PCA axis (r² = 16.3%) is correlated to endosperm vitreousness (VITRO, AVITRO), which is known to affect kernel maturation, as indicated by the correlation to soluble sugar content (SGL). The third PCA axis (r² = 14.1%) is positively correlated to embryo weight (EMB) and lipid content (LIP, SATUR) and negatively correlated to endosperm weight (EW).

Most phenotypic traits show a significant variation among groups defined by Structure software (see r² group in Table II). Many kernel traits (i.e. TKW, ASH, SGL, AMAMI, SATUR, SACOSE, GLCFRU, KW, EW, EMB, END, AVITRO, VITRO, PCA1, PCA3, PCA5, PCA6, and L/S) show a strong variation among groups (P < 0.0001), with population structure accounting for 5.51% (L/S) to more than 20% (SATUR, KW, VITRO, and PCA3) of their variance. Other phenotypic traits are either less variable among groups (such as PROT, LIP, STAR, THREO, METHI, LYSIN, ASP, PCA2, PCA4, PCA8, P/S; r² group varying from 1.66% to 4.98%) or show no significant structure (KDM, WALL, ISOLEU, LEU, PHENYL, SAA, PCA7, P/L). For most structured traits, the intragroup average phenotypic values and SD are given in Table III.
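The principal component summary described above can be illustrated with a minimal sketch. It assumes a lines-by-traits matrix and uses simulated values, not the data of this study; the function names `pca_summary` and `trait_axis_correlations` are hypothetical, and the authors' actual software and settings are not specified here.

```python
import numpy as np

def pca_summary(traits):
    """PCA on a lines-by-traits matrix after standardising each trait."""
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
    # SVD of the standardised matrix; singular values come sorted, largest first.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)   # share of total variance carried by each axis
    scores = u * s                    # coordinates of each line on each axis
    return explained, scores

def trait_axis_correlations(traits, scores):
    """Pearson correlation between every trait (rows) and every PCA axis (columns)."""
    n_traits = traits.shape[1]
    full = np.corrcoef(traits, scores, rowvar=False)
    return full[:n_traits, n_traits:]

# Simulated example (assumption): a latent factor raises protein and lowers starch.
rng = np.random.default_rng(0)
latent = rng.normal(size=375)
prot = 13.0 + 1.5 * latent + rng.normal(scale=0.5, size=375)
star = 69.0 - 2.5 * latent + rng.normal(scale=1.0, size=375)
lip = 3.5 + rng.normal(scale=0.8, size=375)
traits = np.column_stack([prot, star, lip])

explained, scores = pca_summary(traits)
corr = trait_axis_correlations(traits, scores)
print("variance explained per axis:", np.round(explained, 3))
print("trait-axis correlations:\n", np.round(corr, 3))
```

In this toy setting the first axis carries the protein-versus-starch contrast, so the two traits load on it with opposite signs, which mirrors the pattern reported for the first PCA axis.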

Table II. Descriptive statistics for phenotypic traits and correlations between NIRS traits and the eight first PCA axes
Asterisks indicate P values as follows: * P < 0.05, ** P < 0.01, *** P < 0.001.

Trait: TKW KDM ASH PROT LIP STAR SGL WALL AMAMI SATUR SACOSE GLCFRU THREO METHI ISOLEU LEU PHENYL LYSIN SAA KW EW EMB END AVITRO VITRO ASP P/S P/L L/S
Mean: 226.06 89.64 1.59 13.16 3.51 68.95 1.23 11.5 31.57 0.11 87.33 0.96 0.5 0.24 0.53 1.83 0.76 0.34 4.22 337.62 282.26 9.71 83.64 55.33 65.23 3.46 0.19 3.93 0.05
SD: 59.1 0.24 0.1 1.52 0.79 2.59 0.38 1.21 3.73 0.02 2.71 0.1 0.05 0.03 0.06 0.23 0.1 0.03 0.5 23.35 19.72 0.98 1.44 6.6 7.65 0.4 0.02 0.99 0.01
h² (a): 77.9 85.1 92.1 89.8 88.2 92.8 84.4 91.3 94.4 90.6 83.1 92.0 92.0 92.0 91.9 91.9 86.9 91.9 85.0 83.8 91.0 87.2 93.1 93.8 92.1 90.5 88.2 86.4
r² group (b): 14.6*** 0.9 11.0*** 3.2*** 5.0*** 3.2** 7.2 0.9 6.9*** 23.1*** 12.2*** 13.6*** 3.5** 2.1* 1.4* 1.1 0.9 4.8*** 1.6* 25.9*** 18.8*** 18.4*** 14.0*** 27.4*** 29.9*** 1.9* 3.5*** 1.4 5.4***

Eight first PCA axes (c); correlations (r) are given for the NIRS traits in the order KDM ASH PROT LIP STAR SGL WALL AMAMI SATUR SACOSE GLCFRU THREO METHI ISOLEU LEU PHENYL LYSIN SAA KW EW EMB END AVITRO VITRO
Axis | r² (d) | r² group (b) | r
PCA1 | 39.0 | 6.5*** | 0.268 0.703 0.964 0.228 -0.860 0.334 0.376 0.176 0.228 -0.329 0.233 0.960 0.913 0.946 0.882 0.922 0.725 0.939 -0.496 -0.469 0.020 0.307 0.387 0.416
PCA2 | 16.3 | 1.0* | 0.045 -0.437 0.222 -0.474 0.361 -0.743 -0.512 0.237 -0.357 0.492 0.001 0.108 0.224 0.233 0.378 0.277 -0.516 0.254 0.395 0.509 -0.339 0.513 0.572 0.537
PCA3 | 14.1 | 34.8*** | 0.121 0.050 -0.083 0.510 0.189 -0.353 -0.438 0.251 0.739 0.300 0.126 -0.126 -0.152 -0.136 -0.136 -0.154 -0.055 -0.144 -0.531 -0.510 0.828 0.161 0.553 0.581
PCA4 | 7.2 | 4.8** | -0.301 -0.416 0.019 0.562 -0.096 0.164 -0.286 0.197 0.193 0.167 0.327 0.089 0.179 0.061 0.191 0.091 0.016 0.136 0.426 0.309 0.390 0.131 -0.367 -0.364
PCA5 | 6.8 | 17.8*** | 0.002 0.102 0.011 0.022 0.095 0.254 -0.236 -0.454 0.353 -0.528 -0.478 -0.064 -0.149 -0.010 0.010 0.035 0.067 -0.003 0.194 0.272 0.053 0.703 0.109 0.046
PCA6 | 4.5 | 11.2*** | 0.656 -0.195 0.026 0.200 0.017 -0.012 -0.157 0.119 -0.007 0.149 -0.623 0.007 0.023 0.062 0.081 0.072 0.047 0.066 0.053 -0.067 0.003 -0.222 -0.154 -0.133
PCA7 | 4.3 | 0.5 | 0.496 0.001 -0.086 0.199 -0.210 -0.115 0.371 -0.518 0.081 0.186 0.338 -0.059 0.083 -0.034 0.009 -0.056 -0.241 -0.030 0.127 0.146 0.022 0.116 0.055 0.027
PCA8 | 3.5 | 4.2*** | 0.331 -0.108 -0.048 -0.061 -0.046 0.163 0.164 0.552 0.115 -0.380 0.202 0.050 -0.017 -0.126 -0.079 -0.135 0.007 -0.076 0.189 0.181 -0.045 0.026 0.144 0.147

a Heritability. b Determination coefficient (%; i.e. part of phenotypic variance explained by population structure). c Eight first PCA axes obtained from kernel traits. d Percentage of kernel trait variance explained by PCA axes.

Table III. Average phenotypic value (SD) per genetic group as defined using neutral markers (Camus-Kulandaivelu et al., 2006) for phenotypic traits that show strong effects of genetic structure (P < 0.01)

Trait | Trop (a) | EF | NF | CBD | SS
N (b) | 76.5 | 66.6 | 54.8 | 148.0 | 23.1
FT | 1,201.8 (184.3) | 950.4 (106.5) | 931.6 (89.7) | 1,034.0 (118.9) | 1,065.7 (110.7)
TKW | 247.4 (63.4) | 207.6 (54.7) | 195.9 (53.1) | 233.0 (54.7) | 254.4 (46.2)
PROT | 13.29 (1.64) | 13.14 (1.57) | 13.71 (1.36) | 12.96 (1.44) | 13.04 (1.51)
LIP | 3.45 (0.76) | 3.60 (0.61) | 3.83 (1.04) | 3.42 (0.76) | 3.25 (0.66)
STAR | 68.54 (2.74) | 69.29 (2.81) | 68.18 (2.70) | 69.20 (2.31) | 69.44 (2.21)
KW | 342.5 (21.7) | 325.7 (201) | 321.4 (24.6) | 344.6 (20.2) | 350.9 (17.7)
EMB | 9.56 (0.95) | 10.12 (0.80) | 10.30 (1.16) | 9.46 (0.87) | 9.17 (0.75)
END | 84.4 (1.4) | 83.9 (1.6) | 83.7 (1.5) | 83.2 (1.2) | 83.2 (1.0)
EW | 287.7 (18.8) | 275.0 (18.0) | 269.5 (20.2) | 286.1 (18.0) | 292.1 (15.1)
AVITRO | 55.18 (7.00) | 59.60 (5.69) | 58.90 (6.17) | 52.51 (5.39) | 53.30 (5.03)
VITRO | 64.59 (8.11) | 70.33 (6.45) | 69.83 (7.01) | 61.95 (6.18) | 62.82 (5.84)
PCA1 | 0.14 (1.07) | 0.07 (1.02) | 0.49 (0.92) | -0.16 (0.94) | -0.17 (0.96)
P/S | 0.195 (0.031) | 0.191 (0.029) | 0.202 (0.026) | 0.188 (0.026) | 0.188 (0.027)
L/S | 0.051 (0.012) | 0.052 (0.010) | 0.057 (0.017) | 0.050 (0.012) | 0.047 (0.010)

a Genetic groups defined by Camus-Kulandaivelu et al. (2006) as follows: Trop, tropical origins; EF, European Flints; NF, Northern Flints; CBD, Corn Belt Dent; SS, Stiff Stalk. b Size of each group, as the sum of individual memberships across the 375 inbred lines.

Single Nucleotide Polymorphism and Insertion/Deletion Polymorphism Genotyping

Both CyPPDK1 and Opaque2 genes show a sizeable number of polymorphisms, with an average proportion of pairwise nucleotide differences among sequences varying from 0.006 to 0.015 (Table IV). Along the sequences, one polymorphism was found every 23 to 37 bp on average, with about half of them being singletons. LD among informative polymorphisms, often high among closely spaced sites, decreased significantly with physical distance (data not shown), leading to a large number of haplotypes and high haplotypic diversity. No evidence of selective events during the history of these genes was found on the basis of allelic frequency distribution among single nucleotide polymorphisms (SNPs; nonsignificant Tajima's D; Table IV). However, the high number of haplotypes given the level of nucleotide diversity led to significant Strobeck's statistics and Fu's Fs for the middle region of CyPPDK1 and close to significant Strobeck's statistics for the Opaque2 coding region (Table IV).

Fourteen polymorphisms from CyPPDK1 and 10 from Opaque2 were chosen from these sequence data, based on their position, frequency, LD, and potential function, and were then genotyped on the 375 inbred lines (Fig. 1). Polymorphism C4879 in CyPPDK1 leads to a Leu/Phe amino acid replacement in the protein sequence, and polymorphism O1606 in Opaque2 leads to a Pro/Ala replacement. Other polymorphisms in both genes are either synonymous or located in noncoding regions. Most polymorphisms show balanced allele frequency, except two polymorphisms in Opaque2, where one allele is present in fewer than 30 inbred lines (OP979 with 26 G alleles and O3243 with 12 T alleles; Fig. 1), suggesting that statistical power may be reduced for association studies involving those sites. Although significant LD is observed for almost all pairs of polymorphisms within CyPPDK1 (P < 0.0001) and many pairs of polymorphisms in Opaque2, r² values higher than 0.50 are observed only for SNPs in the CyPPDK1 promoter and some SNPs in the Opaque2 promoter (Fig. 2). In contrast, LD between genes is very low (r² < 0.10, P > 0.01) and, given the large number of pairs of polymorphisms that were tested, could be considered as nonsignificant.

For both genes, numerous haplotypes are observed among the 375 inbred lines. For CyPPDK1, 57 different haplotypes are observed out of 334 inbred lines genotyped for the 14 chosen polymorphisms, 30 of these haplotypes being observed for at least two inbred lines. For Opaque2, 19 different haplotypes are observed out of 202 inbred lines genotyped for the 10 chosen polymorphisms, 12 of these haplotypes being observed for at least two inbred lines. The number of minimum recombination events necessary to obtain these haplotypes considering no recurrent mutation is 11 and two for CyPPDK1 and Opaque2, respectively.

All polymorphisms in CyPPDK1 show a high variation among groups (P < 0.0001), population structure explaining pseudo-r² = 12.9% to 40.8% of allele frequency variation. In Opaque2, the effect of genotypic group is significant for all SNPs in the promoter (P < 0.0003, pseudo-r² = 10.3%–32.0%). In contrast, only one SNP in the coding region shows a strong association with population structure (O3243; P = 0.0026, pseudo-r² = 17.5%), while the two other SNPs show low (O3988; P = 0.0205, pseudo-r² = 6.9%) or no (O1866; P = 0.2336; pseudo-r² = 2.7%) allelic variation among groups.

Table IV. CyPPDK1 and Opaque2 nucleotide diversity and neutrality tests
Asterisks indicate P values as follows: * P < 0.10, ** P < 0.05.

Statistic | CyPPDK1, Promoter to intron 2 | CyPPDK1, Exon 3 to exon 9 | CyPPDK1, Exon 9 to 3′ UTR | Opaque2, Promoter | Opaque2, 5′ UTR to 3′ UTR (n)
Size (a) | 2,024 | 1,638 | 2,264 | 887 | 1,941
N (b) | 18 | 24 | 16 | 19 | 18
SNPi (c) | 42 | 21 | 37 | 23 | 43
SNPs (d) | 27 | 23 | 46 | 6 | 28
SNPr (e) | 3 | 14 | 9 | | 25
bp/SNP (f) | 29.3 | 37.2 | 27.3 | 26.2 | 23.4
π (g) | 0.0129 | 0.0062 | 0.0090 | 0.0154 | 0.0114
θw (h) | 0.0138 | 0.0075 | 0.0115 | 0.0109 | 0.0124
D (i) | -0.384 | -0.717 | -0.923 | 1.632 | -0.353
H (j) | 15 | 18 | 13 | 10 | 16
divH (k) | 0.961 | 0.942 | 0.950 | 0.854 | 0.987
S (l) | 0.933 | 0.997** | 0.827 | 0.286 | 0.984*
Fs (m) | -1.479 | -4.370 | -0.490 | 1.791 | -2.608

a Size of the sequenced region in base pairs. b Sample size. c Number of informative substitution polymorphisms. d Number of singleton substitution polymorphisms. e Number of replacement substitution polymorphisms. f Average number of base pairs between SNPs. g Average number of nucleotide differences per base pair between sequences. h Number of polymorphic sites per base pair. i Tajima's neutrality index. j Haplotype number. k Haplotype diversity. l Strobeck's statistic (Strobeck, 1987). m Fu's haplotype statistic (Fu, 1997). n Data from Henry et al. (2005).

Figure 1. Allelic frequency at CyPPDK1 and Opaque2 polymorphisms among 375 inbred lines. Black, gray, and white bars indicate different alleles. Polymorphism positions are indicated along genes (solid arrows, polymorphisms in exons; dashed arrows, polymorphisms in introns; SNP names starting with OP or CP are located upstream of the ATG start codon). For each gene, one SNP leads to a change in an amino acid.

Phenotype-Genotype Associations

Models of association mapping that control for different levels of population structure and individual kinship were compared (Supplemental Fig. S1), and two models were retained: model Q, which controls for population structure, and model Q+KL, which controls for both population structure and individual kinship estimated following Loiselle et al. (1995).

Associations between Kernel Phenotypes and Candidate Genes

Significant associations between phenotypic traits and polymorphisms in CyPPDK1 and Opaque2 are reported in Table V. In CyPPDK1, the major associations concern SNP C817, with many kernel traits related to protein and amino acid content (i.e. PROT, THREO, METHI, ISOLEU, LEU, PHENYL, ASP, and SAA contents) and protein-starch (P/S) ratio as well as PCA1 but excluding Lys content (LYSIN). In Opaque2, SNP O3988 is associated with many kernel traits, such as kernel and endosperm weight (KW, EW), Lys (LYSIN), Thr (THREO), and starch (STAR) contents, protein-starch ratio (P/S), and PCA1. OP1496, OP1539-2, and OP1600, which show strong LD among themselves, associate with END and PCA5. These results indicate that associations between phenotypic traits and polymorphisms at CyPPDK1 and Opaque2 are significant, independently from population structure and individual kinship.

Epistatic Interactions between CyPPDK1 and Opaque2

Since Opaque2 and CyPPDK1 were shown to be functionally related (Maddaloni et al., 1996), we investigated whether pairs of SNPs from CyPPDK1 and Opaque2 have complementary or synergistic effects on kernel phenotypes. In order to do so, we tested for the effect of each CyPPDK1-Opaque2 combination either under an additive model or with interaction on each phenotypic trait. Combinations of polymorphisms that are significantly associated with phenotypic traits are presented in Table VI. While O3988 SNP individually explains 1.2% of protein-starch ratio (either P/S or PCA1) variation, Table VI shows that this SNP in combination with a SNP in the CyPPDK1 promoter (CP125, CP161, CP509, or CP515) explains up to 7.9% of this variation. For instance, allele CP509-C combined with allele O3988-T induces a strong decrease in starch content and correlatively a strong increase in protein content and protein-starch ratio, as compared with any other allelic combination (Fig. 3A). The interaction between these SNPs also affects many protein- and amino acid-related traits (PROT, METHI, PHENYL, ISOLEU, LEU, SAA) and starch content (STAR). To a lower extent, the same phenotypic traits are associated with an epistatic combination of the promoter CyPPDK1 SNPs with O1866, the latter being in very slight LD with O3988 (r² < 0.1, P < 0.01; Fig. 2).

Combined effects of Opaque2 and CyPPDK1 SNPs are also found on Lys content. The additive combination of O3988 and C2252, a CyPPDK1 SNP located in intron 5, explains a significant part of Lys content variation (Fig. 3B). Finally, LYSIN is associated with a nonadditive combination of SNPs O1866 in the Opaque2 coding sequence and CP125 in the CyPPDK1 promoter (Fig. 3C).

Figure 2. Half-matrix of LD among Opaque2 and CyPPDK1 polymorphisms. Multiallelic polymorphisms shown in Figure 1 were converted into biallelic data (see "Materials and Methods"). [See online article for color version of this figure.]

DISCUSSION

Phenotypic Variation of Kernel Quality among Maize Inbred Lines

We report phenotypic variation in kernel quality traits among an extended collection of maize inbred lines encompassing material from tropical, North American, and European origins. Most traits we measured can be summarized into three main PCA axes that together explain a major part (70%) of kernel phenotypic variation among our collection. The main PCA axis (39%) accounts for protein-versus-starch balance, consistent with the well-known negative correlation between protein and starch content in maize endosperm (Goldman et al., 1993). The second PCA axis (16%) accounts for endosperm texture, or vitreousness, and correlates negatively to Lys, soluble carbohydrate, and wall contents. The negative correlation between Lys content and vitreousness likely results from the impact of zeins, storage proteins devoid of Lys, on starch granule cohesion and thus endosperm texture (Landry et al., 2004). Finally, the third PCA axis (14%) accounts for lipid content. Overall, this allows us to describe three major and independent sources of kernel quality variation, which characterize our collection of 375 maize inbred lines.

Using genome-wide neutral markers, Camus-Kulandaivelu et al. (2006) showed that this collection is structured into five genotypic groups that may be assigned, through independent knowledge of inbred origins and pedigrees, to Tropical, Northern Flint, European Flint, Corn Belt Dent, and Stiff Stalk origins. Most kernel quality traits we measured show significant variation among these groups. Based on both the variation in kernel traits and neutral marker-derived genetic structure, we can characterize three groups of contrasting kernel phenotypes: (1) materials of tropical origin with heavy grains, large endosperm, and no specific kernel composition; (2) early Northern Flint and European Flint materials with the smallest grains, small endosperm and relatively large embryo, high protein and lipid contents and low starch content, as

Table V. Significant associations between kernel quality traits and CyPPDK1 or Opaque2 polymorphisms in a collection of 375 maize inbred lines
Asterisks indicate FDR values as follows: * FDR < 0.05, ** FDR < 0.01, *** FDR < 0.001. n.a., Nonavailable value due to the absence of convergence of the mixed model.

Gene      Trait    SNP        P adjust Q(a)  P Q+K(b)    r2 (%)(c)  EffN(d)   EffQ(e)   EffQ+K(f)
CyPPDK1   PROT     C817       0.0007**       0.0008*     1.94       0.7665    0.7333    0.9332
CyPPDK1   THREO    C817       0.0033**       0.0011*     1.66       0.0273    0.0264    0.0328
CyPPDK1            CIDP33     0.0452*        0.0031*     1.36       0.0185    0.0187    0.0220
CyPPDK1   METHI    C817       0.0041**       0.0008*     2.19       0.0190    0.0158    0.0208
CyPPDK1   ISOLEU   C817       0.0005**       0.0006**    2.14       0.0366    0.0339    0.0422
CyPPDK1   LEU      C817       0.0066**       0.0014*     2.04       0.1194    0.1137    0.1369
CyPPDK1   PHENYL   C817       0.0013**       0.0008*     2.15       0.0564    0.0526    0.0647
CyPPDK1   ASP      C817       0.0037**       0.0010*     1.97       0.2119    0.1999    0.2443
CyPPDK1   SAA      C817       0.0022**       0.0008*     2.08       0.2696    0.2540    0.3103
CyPPDK1   PC1      C817       0.0155*        0.0036      1.13       0.3862    0.3752    0.5311
CyPPDK1   PC2      CP161      0.0358*        0.0016*     2.01       -0.3450   -0.3318   -0.3416
CyPPDK1   PC6      C1060      0.0396*        0.0071      0.63       -0.3601   -0.2288   -0.2777
CyPPDK1   P/S      C817       0.0016**       0.0009*     1.75       0.0135    0.0127    0.0170
Opaque2   ASH      O3988      0.0001***      <10^-4***   1.48       -0.0725   -0.0265   -0.0671
Opaque2   STAR     O3988      0.0013**       0.0006**    1.43       1.5226    0.4294    n.a.
Opaque2   THREO    O3988      0.0381*        0.0024*     1.17       -0.0261   -0.0062   -0.0272
Opaque2   LYSIN    O3988      0.0001***      <10^-4***   1.48       -0.0289   -0.0111   -0.0293
Opaque2   KW       O3988      0.0110*        n.a.        n.a.       8.559     5.257     10.91
Opaque2   EW       O3988      0.0057*        n.a.        n.a.       7.138     4.559     9.922
Opaque2   END      OP1496     0.0007**       n.a.        n.a.       1.0054    0.6026    0.8726
Opaque2   END      OP1539-2   0.0037**       0.0006**    2.90       1.0012    0.7072    0.8225
Opaque2   END      OP1600     0.0144**       n.a.        n.a.       0.9328    0.5153    0.7699
Opaque2   PC1      O3988      0.0026**       0.0004**    1.22       -0.5434   -0.1550   -0.5662
Opaque2   PC5      OP1496     0.0100**       0.0007**    2.04       0.7899    0.3672    0.5513
Opaque2   PC5      OP1539-2   0.0131**       0.0005**    1.78       0.7946    0.4862    0.5427
Opaque2   P/S      O3988      0.0238*        0.0018*     1.23       -0.0142   -0.0035   -0.0143

a P for the Q model that controls for population structure, estimated after 10,000 permutations using TASSEL software.
b P for the Q+K model that controls for population structure and kinship.
c Part of the phenotypic variance explained by the SNP in the Q+K model.
d Allelic effect (i.e. difference between average phenotypic values for both allelic forms).
e Allelic effect corrected for population structure.
f Allelic effect corrected for population structure and individual kinship.

well as high vitreousness consistent with the classical designation of this group (flint meaning vitreous); and (3) materials from the large maize-producing North American regions (i.e. Corn Belt Dent and Stiff Stalk) with grains of very large weight and high starch content, low protein and oil content, and low vitreousness.

Because grains typical of cultivated forms have been found in archaeological sites dating as far back as 6,250 years ago (Benz and Long, 2000; Benz, 2001; Piperno and Flannery, 2001; Piperno et al., 2004), and because the cultivated form of domestication genes was found to be fixed among fossil maize soon after domestication (Jaenicke-Despres et al., 2003), kernel size and starch content are considered to have undergone human selection since the early stages of domestication. Later, selection may have diverged during local adaptation to diverse climates and grain usage. Northern Flint materials have been cultivated in northeastern America by Native Americans since approximately the 10th century (Smith, 1989) and gave rise to European Flint after being introduced to Europe in the early 16th century (Rebourg et al., 2003). These materials thus resulted from several centuries of selection for their ability to yield under temperate conditions (i.e. with short cycles). Although such selection may have mainly concerned the length of the biological cycle and flowering date, it may have had side effects on plant height and yield, leading to smaller grains and lower starch accumulation. Additionally, their specifically high vitreousness may be due to either a specific genetic origin or specific selection processes for adaptation to culinary practices and possibly prevention of postharvest losses.

Finally, the high average kernel weight, endosperm weight, and starch content observed for the Corn Belt Dent and Stiff Stalk materials are consistent with the well-known intense selection for yield and starch content that was applied to these materials by plant breeders during the 20th century (Duvick and Cassman, 1999). The inbred lines derived from the Iowa Stiff Stalk synthetic group cluster separately from the Corn Belt Dent materials based on neutral markers and display particularly high kernel weight and starch content, which is consistent with the major role played by these materials, in particular through B14, B73, and their derivatives, in the genetic gain that has been achieved in the U.S. Corn Belt. The trend toward higher kernel size of Stiff Stalk materials may have been enhanced by their general use as females in hybrid combinations with male "non-Stiff Stalk" materials, such as Mo17, as illustrated by Pioneer breeding programs (Tracy and Chandler, 2006).

Table VI. Significant associations between kernel phenotypic traits and combined Opaque2 and CyPPDK1 SNPs using the mixed Q+K model
Boldface entries are P values for the genetic part of the model with FDR lower than 0.001, FDR being lower than 0.01 in all reported cases.

Traits: PROT, STAR, THREO, ISOLEU, ASP, SAA, LYSIN, P/S, PCA1, ASH
Values for each trait and SNP pair (groups of five): P-O(a), P-C(b), P-I(c), P gen(d), r2 gen(e)
Opaque2 SNP: CyPPDK1 SNP:

O1866 CP125
CP161
0.02199 0.00798 0.00023 0.00054 4.9; 0.29950 0.11632 0.00012 0.00098 4.5; 0.03409 0.00825 0.00010 0.00039 5.0; 0.03110 0.01717 0.00038 0.00113 4.6
0.12472 0.02548 0.00040 0.00096 4.7
0.19722 0.14640 0.00001 0.00004 6.6; 0.03448 0.00748 0.00006 0.00024 5.3; 0.03200 0.00807 0.00002 0.00011 5.4; 0.04503 0.01827 0.00002 0.00012 5.5
CP509
0.16577 0.03850 0.00057 0.00173 4.4
C2042
O3988 C5098
CP125
CP161
CP509
CP515
0.04227 0.80667 0.00739 0.00048 4.9
C4879
0.02104 0.20828 0.00013 0.00016 5.5
0.05945 0.86261 0.00731 0.00074 4.8; 0.08277 0.84162 0.00648 0.00078 4.5; 0.07375 0.88764 0.00608 0.00072 4.8
0.04128 0.18436 0.00020 0.00043 5.0; 0.02875 0.24445 0.00016 0.00021 5.4; 0.04059 0.30403 0.00015 0.00019 5.5; 0.03453 0.27023 0.00011 0.00014 5.6
0.00030 0.00130 0.00032 0.00010 6.8; 0.00002 0.03623 0.00195 0.00007 7.0; 0.00011 0.00118 0.00030 0.00005 7.2; 0.00112 0.00241 0.00046 0.00035 6.1; 0.00102 0.00357 0.00060 0.00041 7.0; 0.00118 0.00310 0.00061 0.00044 5.9
0.00404 0.01985 0.00076 0.00037 5.9; 0.00020 0.33573 0.00474 0.00005 7.0; 0.00192 0.01997 0.00178 0.00041 5.8; 0.01074 0.03998 0.00140 0.00125 5.2
0.00029 0.00314 0.00011 0.00006 7.1; 0.00003 0.03250 0.00109 0.00005 7.1; 0.00012 0.00141 0.00021 0.00004 7.2; 0.00114 0.00594 0.00014 0.00017 6.5; 0.00101 0.00729 0.00023 0.00022 7.3; 0.00116 0.00714 0.00020 0.00022 6.3
0.00025 0.00079 0.00091 0.00010 6.7; 0.00001 0.02982 0.00543 0.00006 7.0; 0.00007 0.00049 0.00140 0.00004 7.2; 0.00103 0.00138 0.00118 0.00039 6.0; 0.00089 0.00187 0.00178 0.00047 7.0; 0.00103 0.00174 0.00164 0.00048 5.8
0.12247 0.74040 0.00328 0.00206 4.3
C2252
<10^-5 0.00564 n.s. <10^-5 9.5; 0.19067 0.05768 0.00037 0.00124 4.5; 0.19590 0.07553 0.00035 0.00113 4.3
0.03678 0.15728 0.00009 0.00019 5.3; 0.03492 0.12793 0.00006 0.00013 5.2; 0.07068 0.02650 0.00014 0.00068 4.4
0.00006 0.00109 0.00019 0.00002 7.6; 0.00002 0.00292 0.00040 0.00002 7.5; <10^-5 0.02753 0.00817 0.00003 7.4
0.00108 0.02932 0.00058 0.00010 6.7; 0.00028 0.05955 0.00081 0.00004 7.0; 0.00002 0.25891 0.00173 <10^-5 8.7
0.00006 0.00223 0.00006 0.00001 7.9; 0.00001 0.00578 0.00015 0.00001 7.8; <10^-5 0.03305 0.00511 0.00003 7.4
0.00004 0.00068 0.00058 0.00002 7.5; 0.00001 0.00147 0.00108 0.00002 7.4

a P for the Opaque2 SNP.
b P for the CyPPDK1 SNP.
c P for the interaction between SNPs.
d P for the genetic part of the model (all three previous sources of variation, excluding the interaction if not significant).
e Coefficient of determination for the genetic part of the model.

Genetic Structure of Polymorphism at CyPPDK1 and Opaque2 Candidate Genes

Both CyPPDK1 and Opaque2 genes show a high level of polymorphism among the collection of inbred lines, as compared with other genes studied in the same species (Tenaillon et al., 2001), with one SNP every 30 and 24 bp for each gene, respectively. For both the whole region studied in CyPPDK1 and the coding region in Opaque2, the data show a significant excess of haplotypes (Strobeck statistic; Table IV) with negative and significant (for CyPPDK1) or close to significant

Figure 3. Average protein-starch ratio (P/S) and Lys (LYSIN) content depending on combined genotypes at CyPPDK1 and Opaque2 SNPs. The letters a and b indicate group classification of the four phenotypic means through Duncan’s test at a 0.05 threshold.

(for Opaque2) Fs values, indicating an excess of rare haplotypes as compared with the expectation under neutrality (Fu, 1997). This suggests that both genes may have evolved under purifying selection, probably due to their important functions in maize. For all SNPs we genotyped in the 375 inbred lines, we observed a strong differentiation in allele frequency among genetic groups, except for two SNPs in the Opaque2 coding region. Since population structure determined by neutral markers is clearly linked to the geographical origin of inbred lines (Camus-Kulandaivelu et al., 2006), variation in allelic frequency in candidate genes could result from either local adaptation, due to natural or human selection of traits determined by these genes, or genetic drift, due to reproductive isolation among groups for several tens or hundreds of generations. We found no correlation between allelic frequency and average phenotypic values among groups (data not shown), suggesting that no CyPPDK1 or Opaque2 SNPs are involved in local adaptation for kernel quality. Because the two causes of correlation between genetic and phenotypic variations may not be easily distinguished, the population structure should be taken into account in association analyses on the whole collection in order to avoid spurious associations (i.e. those due to genetic drift).

Epistatic Interaction between Opaque2 and CyPPDK1 Promoters Modifies Kernel Protein-Starch Balance

We have shown that CyPPDK1 is significantly associated with diverse phenotypic traits linked to amino acid and protein contents as well as with the protein-starch ratio. These associations all involve SNP C817, a polymorphism in the CyPPDK1 coding region, suggesting that this gene has a direct effect on amino acid synthesis. In the endosperm, CyPPDK1 reversibly catalyzes the conversion of pyruvate into PEP. Cytosolic PEP is directly involved in the synthesis of aromatic amino acids, including Phe, and is indirectly involved via oxaloacetate in the synthesis of Asp-derived amino acids, such as Thr, Ile, Met, and Lys. Interestingly, SNP C817 is associated with contents of all of these amino acids, except Lys. Consistently, all of these amino acid contents are strongly correlated with each other (r > 0.9) among our 375 inbred lines, while LYSIN shows a significant although much lower correlation (0.43 < r < 0.66) with all of them. Together, these results indicate that CyPPDK1 plays an important role in aromatic and Asp-derived amino acid synthesis, except for Lys accumulation, which is probably subject to a more complex regulation. Finally, C817 is a synonymous SNP in exon 2 that shows low (r < 0.5) LD with the other SNPs in the CyPPDK1 coding region and no LD among the 17 fully sequenced inbred lines with nonsynonymous SNPs. Thus, although the associations between C817 and amino acid contents are strongly significant, the causative polymorphism is difficult to pinpoint and may not be C817 itself.

Additionally, we found very strong associations between many phenotypic traits related to kernel starch and protein content and the combination of Opaque2 SNP O3988 and one of the SNPs in the CyPPDK1 promoter, CP125, CP161, CP509, or CP515. All of these phenotypic traits are strongly correlated with each other, and all four SNPs in the CyPPDK1 promoter are in strong LD, suggesting that all of these associations are due to the same cause. Opaque2 SNP O3988 alone is slightly associated with some of these traits, while none of the CyPPDK1 promoter SNPs is individually associated with kernel quality. In addition, the interactions between these SNP pairs are all significant (Table VI), indicating that kernel quality is modified by specific combinations of Opaque2 and CyPPDK1 alleles. More specifically, low starch content, high protein content, and high protein-starch ratio are only obtained for the simultaneous change of allele A to T at O3988 and allele T to C at CP509 (or G to A at CP125, A to G at CP161, or TG to a 2-bp deletion at CP515). These observations substantiate the hypothesis that the CyPPDK protein has a critical role in protein-starch balance in the kernel, as suggested by the recent proteomic study of maize endosperm development (Méchin et al., 2007). It was proposed that the pyruvate-to-PEP+PPi conversion by CyPPDK1 could both favor aromatic and Asp-derived amino acid synthesis, since PEP is a precursor in these pathways, and reduce starch synthesis through a PPi-induced decrease in ADP-Glc, the starch precursor. This study shows that the natural polymorphism of the CyPPDK1 gene itself only slightly affects endosperm protein content and is not associated with starch content, whereas we found that protein and starch contents as well as protein-starch balance are affected by epistatic interaction between the Opaque2 coding sequence and the CyPPDK1 promoter. This suggests that an increase in the protein-starch ratio is unlikely to be achieved through selection of more efficient CyPPDK1 alleles but rather through specific combinations of compatible alleles that allow an increased activation of CyPPDK1 by its transcriptional activator Opaque2.

From the study of the opaque2 mutant, it was shown that Opaque2 up-regulates CyPPDK1 transcription through DNA binding of the OPAQUE2 protein on two specific domains located at positions 163 to 172 and 295 to 304, following the same notation as that used to name CyPPDK1 SNPs in this study (Maddaloni et al., 1996). These two domains show no polymorphism among the 30 inbred lines initially sequenced; thus, no SNP has been defined in these regions. The CyPPDK1 promoter SNPs that we studied are in complete or very strong LD both between each other and with many other polymorphisms observed in the initial sequencing all along the promoter. We may thus assume that sequences other than the two domains described by Maddaloni et al. (1996) are also involved in the Opaque2-CyPPDK1 interaction, modifying the ability of Opaque2 to regulate CyPPDK1 transcription. These unknown sequences may be either close to the described domains (such as CP161, located 2 bp upstream of the most 5′ domain), indicating that the domains may be a little longer than has been assumed, or at a different position, supporting the hypothesis of an additional Opaque2 binding domain within the CyPPDK1 promoter. O3988, the SNP involved in the Opaque2 interaction with CyPPDK1, is located in the 3′ untranslated region (UTR). This suggests that either the 3′ UTR is involved in a regulatory function that affects transcription or translation of Opaque2 or that the causal site lies upstream in the coding sequence or the promoter and shows strong LD with O3988.

Complex Control of Lys Content by Opaque2 and CyPPDK1

The strongest association involving a single SNP that we found in this study is between Opaque2 polymorphism O3988 and Lys content (false discovery rate [FDR] < 10^-4 with Q and Q+K models). The recessive allele in the opaque2 mutant induces an increase in Lys content through a 50% to 70% reduction in Lys-free zeins, the main endosperm storage proteins (Landry et al., 2002), and an increase in Lys-rich proteins (Habben et al., 1993). Because of agronomical interest in essential amino acid contents for human and animal nutrition, this mutant allele has been extensively studied. However, so far no evidence has been reported that natural diversity in Opaque2 plays a role in Lys and storage protein content, since no QTL for any related phenotypic trait was found in the 7.01 region. This study shows that, within a large collection of maize inbred lines, Opaque2 natural polymorphism is strongly related to Lys content. The potential role of CyPPDK1 in Lys content is probably too weak to be detected through single SNP associations but is revealed here in a more complete model that involves both C2252 and O3988 polymorphisms (Table VI). The strong association of O3988 with LYSIN and the absence of interaction with C2252 suggest an effect of Opaque2 on Lys content independent from CyPPDK1. This may be achieved through Opaque2 transcriptional control of Lys degradation by the LKR/SDH enzyme (Arruda et al., 2000). Indeed, in maize kernel, very little Lys is required for protein synthesis, since the main storage proteins (zeins) do not contain this amino acid. As a result, the kernel accumulates more Lys than is required, from both in situ synthesis and translocation from vegetative tissues, and Lys is thus continuously catabolized through the saccharopine pathway that involves both LKR and SDH activities. Carbon skeletons from Lys can then be directed toward zein synthesis. Our study suggests that natural diversity in Opaque2 affects Lys content through its degradation in the endosperm, thus playing a central role in storage protein synthesis and grain nutritional value.

Another relevant association of Lys content with combined polymorphisms from both genes showed a very different pattern. Although neither O1866 nor CP125 is individually associated with Lys content, their interaction strongly correlates with its variation. Two combinations of alleles show high LYSIN (i.e. allele A at CP125 and allele T at O1866 or allele T at CP125 and allele A at O1866), while the two remaining combinations lead to low LYSIN. As for the protein-starch balance association discussed above, this interaction involves a SNP in the coding region of Opaque2 and one in the CyPPDK1 promoter, suggesting that LYSIN may be increased by specific and efficient allelic interaction between the transcriptional activator Opaque2 and its target gene CyPPDK1.
The two different SNP combinations found to be associated with Lys content (i.e. the additive O3988-C2252 and the epistatic O1866-CP125 combinations) suggest a complex regulation of this essential amino acid in the maize kernel.

This paper reveals that natural variation in several kernel quality traits, such as Lys content and protein-versus-starch balance, depends on epistatic interactions between Opaque2 and the CyPPDK1 promoter. Epistatic interactions between loci have been shown to make a substantial contribution to complex trait variation in humans and animals (Carlborg and Haley, 2004; Marchini et al., 2005) as well as in plants (Doebley et al., 1995; Mackay, 2001; Rowe et al., 2008; Jannink et al., 2009). In maize, many epistatic interactions between loci involved in yield components, through ear development or resistance to diseases, have been revealed from biparental mapping populations of contrasting parents (Doebley et al., 1995; McMullen et al., 2001; Ding et al., 2008) or from more complex populations, such as connected mapping populations, built to increase the power to detect QTLs and epistatic interactions (Blanc et al., 2006). Metabolic pathways that underlie most complex traits are expected to involve multiple enzymatic and regulatory genes as well as interactions between them that could generate epistasis (McMullen et al., 1998). A clear example is the epistatic control of the pericarp color (p) locus, which encodes a transcription factor, on genes a1 and whp1, which determine maysin and chlorogenic acid accumulation in silks (Szalma et al., 2005). A priori knowledge of the epistatic effect of the p locus allowed these authors to control for the p genotype (functional versus nonfunctional) and detect a significant association between a1 or whp1 loci and silk maysin content. Kernel quality traits measured in this study, such as kernel weight, endosperm weight, starch, protein, and lipid contents, are important yield components that presumably rely on complex metabolic pathways. Although it is not surprising that epistatic interactions underlie the variation of such phenotypic traits, this work emphasizes the power of association genetics to detect the epistatic interaction between a transcriptional activator and one of its target genes in a large plant diversity collection.
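The non-additive pattern discussed above for Lys content, where two of the four two-locus genotypes are high and the other two are low, can be illustrated with a small numerical sketch. This is not the mixed Q+K model used in the study: it ignores population structure and kinship and simply compares the four genotype means. The function name, the 0/1 allele coding, and the toy phenotype values are all hypothetical.

```python
import numpy as np

def two_locus_summary(y, a, b):
    """Cell means, marginal effects and interaction contrast for two biallelic loci.

    y: phenotype values; a, b: 0/1 allele codes at the two loci (hypothetical
    inputs). All four genotype classes must be present, otherwise a mean is NaN.
    """
    y, a, b = (np.asarray(v, dtype=float) for v in (y, a, b))
    # Mean phenotype of each of the four two-locus genotypes.
    m = {(i, j): y[(a == i) & (b == j)].mean() for i in (0, 1) for j in (0, 1)}
    # Marginal (additive) effect of each locus, averaged over the other locus.
    effect_a = ((m[1, 0] + m[1, 1]) - (m[0, 0] + m[0, 1])) / 2
    effect_b = ((m[0, 1] + m[1, 1]) - (m[0, 0] + m[1, 0])) / 2
    # Interaction: change in the effect of locus a when the allele at b changes.
    interaction = (m[1, 1] - m[0, 1]) - (m[1, 0] - m[0, 0])
    return m, effect_a, effect_b, interaction

# Toy crossover pattern: genotypes (0,1) and (1,0) are high, (0,0) and (1,1) low.
y = [1.0, 1.2, 3.0, 3.2, 3.1, 2.9, 1.1, 0.9]
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
means, eff_a, eff_b, inter = two_locus_summary(y, a, b)
print(means, eff_a, eff_b, inter)
```

In this toy example both marginal effects are close to zero while the interaction contrast is large, which is the situation in which single-SNP tests detect nothing and only the two-locus model reveals an association.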

MATERIALS AND METHODS

Plant Material

The association population we used consists of 375 inbred lines representative of American, European, and tropical maize (Zea mays), including both first generation lines (obtained by selfing from landraces) and recent elite lines. This collection includes the 102 inbred lines studied by Remington et al. (2001) and Thornsberry et al. (2001) and is fully described by Camus-Kulandaivelu et al. (2006, 2007). The complete list is available in Supplemental Table S1.

Phenotypic Data

Kernel weight (TKW) and kernel composition traits (see abbreviations from to VITRO; Table I) were evaluated from an experimental trial including two locations and two replicates per location for each genotype. Because of large differences in flowering date within the collection, early materials were evaluated at Le Moulon and St. Martin de Hinx, while late materials were evaluated at Montpellier and St. Martin de Hinx. At each location, two main blocks were subdivided into four subblocks of comparable flowering time. Subblocks were organized into lines of 15 plants and sown at a density of six plants per square meter. Plants were self-pollinated in order to avoid xenia effects. Approximately eight ears were harvested per line when the subblock reached maturity and submitted to complementary mild drying with pulsed air at ambient temperature. For each line, kernel traits were predicted based on NIRS calibrated on entire kernels (Limagrain Society). Although the experimental protocol, using maturity blocks and ear drying, aimed at reducing variation in the percentage of dry matter among samples, thus avoiding bias in NIRS phenotypic estimations, KDM was introduced as a covariate in association studies in order to remove any potential residual effect of genetic differences in percentage dry matter.

Population Structure and Individual Kinship

Population structure, which generates genome-wide LD, is a major bias leading to false-positive associations (Flint-Garcia et al., 2003), especially in world-wide collections of structured species such as maize (Thornsberry et al., 2001). To control for population structure, all 375 inbred lines were genotyped for 55 genome-wide microsatellites (SSR) with motifs of three or more nucleotides, and five groups were defined (Camus-Kulandaivelu et al., 2006) using Structure version 2 software (Pritchard et al., 2000a). This provided us with four independent group memberships that were used as covariates in the genotype-phenotype association analyses. Recent studies suggested that such measurements of population structure may be insufficient to limit false-positive associations and that individual kinship coefficients should also be taken into account (Yu et al., 2006). We thus used the same 55 SSR markers to build kinship coefficient matrices following three different estimates referred to as KL (Loiselle et al., 1995), KR (Ritland, 1996), and KZ (Zhao et al., 2007). We estimated KL and KR using SPAGeDi version 1.2 software (Hardy and Vekemans, 2002) and KZ as the proportion of SSR alleles common to each pair of inbred lines.
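As an illustration of the simplest of these three estimates, the sketch below computes a KZ-like matrix as the proportion of markers at which two lines carry the same allele. It assumes fully homozygous inbred lines (one allele per marker) and uses -1 as a hypothetical missing-data code; the function name and the toy genotypes are illustrative only, and the KL and KR estimators computed with SPAGeDi are not reproduced here.

```python
import numpy as np

def shared_allele_kinship(genotypes):
    """Pairwise proportion of markers at which two inbred lines share the same allele.

    genotypes: (n_lines, n_markers) integer array of allele labels, one allele
    per marker (lines assumed homozygous). Negative values mark missing data.
    """
    g = np.asarray(genotypes)
    n = g.shape[0]
    k = np.eye(n)  # a line shares all alleles with itself
    for i in range(n):
        for j in range(i + 1, n):
            scored = (g[i] >= 0) & (g[j] >= 0)  # markers typed in both lines
            if scored.any():
                k[i, j] = k[j, i] = np.mean(g[i, scored] == g[j, scored])
    return k

# Three hypothetical lines typed at four SSR loci (allele labels are arbitrary).
toy = [[1, 2, 3, 4],
       [1, 2, 5, 4],
       [7, 8, 9, -1]]
print(shared_allele_kinship(toy))
```

Pairs with no marker typed in both lines keep a coefficient of 0 in this sketch; a real analysis would need an explicit rule for such cases.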

SNP Genotyping

In order to discover polymorphisms, we sequenced the entire CyPPDK1 gene (a total of 5.9 kb, including a 500-bp promoter region, all 18 exons, 17 introns, and 167 bp of the 3′ UTR) on 30 inbred lines and the Opaque2 partial promoter on 18 inbred lines (884 bp). Sequences of 17 inbred lines for an Opaque2 2.7-kb coding fragment were available from Henry et al. (2005). We scored SNPs and insertion/deletion polymorphisms (IDPs) including singletons, informative, and nonsynonymous sites. We calculated the nucleotide diversity as the average number of pairwise differences among sequences per nucleotide site (Tajima, 1983) and the number of polymorphic sites (Watterson, 1975). We calculated haplotype diversity for each gene region (Nei, 1987). We tested for selective neutrality both from polymorphism frequency distribution (Tajima's D; Tajima, 1989) and from haplotype number conditional to nucleotide diversity (S [Strobeck, 1987] and Fs [Fu, 1997]).

We then characterized a subset of these polymorphisms for the 375 inbred lines described above, based on their position (favoring those in exons rather than introns), potential functional role (favoring nonsynonymous rather than synonymous changes), frequency (favoring balanced allele frequencies rather than rare alleles), and complementarity (avoiding redundancy among polymorphisms and favoring those that allow characterization of the highest number of haplotypes observed among the sequenced inbred lines). We genotyped 10 SNPs and four IDPs in CyPPDK1 and four SNPs and five IDPs in Opaque2 among the 375 inbred lines (Fig. 1). Large IDPs in CyPPDK1 (377-bp-long IDP377 and 33-bp-long IDP33) were characterized by PCR/agarose gel electrophoresis and scoring of the fragment size at UMR le Moulon. SNPs and short IDPs were scored using the single-base primer-extension method at either Biogemma or UMR le Moulon. Primer sequences and complementary information on genotyping are available upon request.
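The two diversity measures mentioned above can be sketched on a toy alignment as follows. The code computes nucleotide diversity as the mean number of pairwise differences per site and counts polymorphic sites; it also derives the per-site Watterson estimator from that count, which is a standard quantity but is our addition here. Gaps and missing bases are not handled, and the function name and sequences are hypothetical.

```python
from itertools import combinations

def diversity_stats(seqs):
    """Return (pi, S, theta_w) for aligned sequences of equal length.

    pi: average number of pairwise differences per site.
    S: number of polymorphic (segregating) sites.
    theta_w: S / a_n per site, with a_n = sum_{i=1}^{n-1} 1/i.
    """
    n, length = len(seqs), len(seqs[0])
    pairs = list(combinations(seqs, 2))
    # Total number of mismatching positions over all pairs of sequences.
    diffs = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    pi = diffs / len(pairs) / length
    # A site is polymorphic if more than one base is observed in its column.
    s_sites = sum(len(set(col)) > 1 for col in zip(*seqs))
    a_n = sum(1 / i for i in range(1, n))
    theta_w = s_sites / a_n / length
    return pi, s_sites, theta_w

toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
print(diversity_stats(toy_alignment))
```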
For LD and association studies, IDPs in Opaque2 that show three (OP1539) or four (OP904) alleles were coded as biallelic polymorphisms (i.e. presence/absence of each allele), leading to polymorphisms OP1539-0 to OP1539-2 and OP904-0 to OP904-3.
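The presence/absence recoding described above can be sketched as follows. The allele labels and the order in which alleles are numbered are assumptions made for illustration; the source only states that each allele of a multi-allelic IDP becomes its own biallelic marker.

```python
def recode_presence_absence(name, alleles):
    """Split a multi-allelic marker into one 0/1 indicator per allele.

    name: marker name used as prefix (e.g. "OP1539").
    alleles: one allele label per inbred line; None marks missing data.
    Returns a dict mapping "<name>-<k>" to a list of 1 (allele present),
    0 (another allele present) or None (missing).
    """
    levels = sorted({a for a in alleles if a is not None})
    return {
        f"{name}-{k}": [None if a is None else int(a == level) for a in alleles]
        for k, level in enumerate(levels)
    }

# Hypothetical calls for five lines at a three-allele IDP.
coded = recode_presence_absence("OP1539", ["a", "b", "c", "a", None])
for marker, values in coded.items():
    print(marker, values)
```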

Statistical Analyses

Determination of Individual Phenotypic Values

We evaluated the effects of genotype, location, replicate within location, and genotype-location interaction on kernel quality traits through ANOVA using the GLM procedure in SAS (SAS, 1989). We estimated the heritability (part of the genetic variance among phenotypic variance) of each trait as 1 - 1/F, with F being the Fisher value of the genotype effect in the ANOVA model. Since the genotype-location interaction showed a much lower effect (10^-22 < P < 0.01) than the genotype (10^-245 < P < 10^-117), we calculated adjusted mean values for each trait and each inbred line (genotype) using the LSMEAN option in a model considering genotype, location, and replicate within location effects. Since many of the NIRS traits were highly correlated, we performed PCA on the correlation matrix from adjusted means (FACTOR procedure in SAS). We analyzed the first eight PCA axes (more than 95% of the phenotypic variance) as summary phenotypic traits in association studies. We calculated four additional phenotypes from the NIRS adjusted means: total Asp-derived amino acids (ASP), protein-starch ratio (P/S), protein-lipid ratio (P/L), and lipid-starch ratio (L/S).
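The heritability estimate 1 - 1/F can be illustrated with a deliberately simplified sketch. It uses a one-way ANOVA on genotype only, whereas the model described above also includes location, replicate within location, and genotype-location interaction; the function name and the toy measurements are hypothetical.

```python
import numpy as np

def heritability_from_anova(values_by_genotype):
    """Return (F, 1 - 1/F) from a one-way ANOVA with genotype as the only factor.

    values_by_genotype: list of lists, one inner list of measurements per genotype.
    """
    groups = [np.asarray(g, dtype=float) for g in values_by_genotype]
    n_total = sum(len(g) for g in groups)
    grand_mean = np.concatenate(groups).mean()
    # Between-genotype and within-genotype sums of squares.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (n_total - len(groups))
    f_value = ms_between / ms_within
    return f_value, 1.0 - 1.0 / f_value

# Three hypothetical genotypes measured twice each.
f_value, h2 = heritability_from_anova([[10, 12], [20, 22], [30, 32]])
print(f_value, h2)
```

Because F is the ratio of the genotype mean square to the residual mean square, 1 - 1/F equals the share of the genotype mean square that is not attributable to residual variation.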

LD among SNPs

For the groups of sequenced inbred lines, we tested for LD using Fisher's exact tests among informative (i.e. nonsingleton) polymorphisms. Although polymorphisms characterized in the 375 inbred lines were chosen as not fully redundant among the sequenced lines, some significant LD may occur among the 375 inbred lines. In the large collection, we estimated LD either within or between genes as correlations (r2) among biallelic loci using the CORR procedure in SAS (SAS, 1989) and using TASSEL version 2.0 software (Bradbury et al., 2007) for graphical representation. In order to determine whether LDs among gene polymorphisms were mainly due to population structure or not, we estimated LD after removing the effect of population structure using logistic regression of SNPs against each other, including group memberships as covariates.
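For biallelic loci scored on inbred lines, the r2 measure of LD used above is the squared Pearson correlation between the two 0/1 allele codes. A minimal sketch, with a hypothetical function name and toy genotypes, and with NaN as an assumed missing-data code:

```python
import numpy as np

def ld_r2(x, y):
    """Squared Pearson correlation between two 0/1-coded biallelic loci.

    Lines with a missing call (NaN) at either locus are dropped. The result is
    NaN if one of the loci is monomorphic among the remaining lines.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~(np.isnan(x) | np.isnan(y))
    r = np.corrcoef(x[keep], y[keep])[0, 1]
    return r ** 2

print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))  # complete LD
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))  # no LD
```

This sketch does not remove the effect of population structure; the structure-corrected LD described above relies on logistic regression with group memberships as covariates.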

Among-Group Divergence for SNP Frequency and Phenotypes

In order to determine whether population structure is correlated to phenotypic values or to SNP or IDP allelic frequencies, we tested the effect of group membership (four independent variables) on (1) phenotypic individual values using the GLM procedure and (2) polymorphisms (SNPs or IDPs) using the LOGISTIC procedure (SAS, 1989). We estimated the average phenotypic (or allelic frequency) value for each group as the mean of individual phenotypic (or genotypic) values weighted by individual group membership. We quantified the effect of population structure on phenotypic and genotypic variation using r2 (linear regression) and Max-rescaled pseudo-r2 (logistic regression), respectively. In cases where phenotypic value and polymorphism allele frequency showed significant variation among groups, we tested whether they were significantly correlated among groups, using the REG procedure in SAS and weighting by group size (estimated as the sum of group membership over all inbred lines). We performed correction for multiple testing by estimating FDR (Storey and Tibshirani, 2003) over all correlation tests.
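The membership-weighted group means and group sizes described above can be sketched as follows, assuming a membership matrix whose rows sum to 1 (as produced by a clustering program such as Structure). The function name and the toy values are hypothetical.

```python
import numpy as np

def membership_weighted_means(values, memberships):
    """Group means weighted by individual group membership.

    values: (n_lines,) phenotypic values, or 0/1 allele codes for allele frequencies.
    memberships: (n_lines, n_groups) membership coefficients, rows summing to 1.
    Returns (group_means, group_sizes), where group size is the sum of
    memberships over all lines.
    """
    v = np.asarray(values, dtype=float)
    q = np.asarray(memberships, dtype=float)
    group_sizes = q.sum(axis=0)
    group_means = (q * v[:, None]).sum(axis=0) / group_sizes
    return group_means, group_sizes

# Three hypothetical lines and two groups; the second line is admixed.
means, sizes = membership_weighted_means(
    [1.0, 3.0, 5.0],
    [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]],
)
print(means, sizes)
```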

Association Genetics

Evaluation of Different Statistical Models in Terms of False-Positive Detection. Population structure and/or relatedness among individuals may generate numerous false-positive associations between phenotypic variation and genotypic diversity. Different models have been proposed in order to reduce such false positives, taking into account population structure (Pritchard et al., 2000b) and/or individual kinship (Yu et al., 2006). Although population structure and kinship are generally evaluated from the same genome-wide neutral markers, it has been shown that they may not capture the same part of the phenotype-genotype associations (Aranzana et al., 2005). The statistical models we used all include kernel dry matter (KDM) as a covariate and consider (1) neither population structure nor individual kinship (N), (2) population structure only (Q), (3) relatedness only (KL, Loiselle’s kinship coefficient; KR, Ritland’s coefficient; KZ, proportion of shared SSR alleles), or (4) both population structure and relatedness (Q+KL, Q+KR, and Q+KZ). We performed these analyses through ANOVA (GLM) for the N and Q models and mixed linear model (MLM) for models including kinship matrices (Yu et al., 2006) using TASSEL version 2 software (Bradbury et al., 2007). In order to evaluate the number of false positives detected with each statistical model in our data, we tested associations between phenotypes and SSR alleles (coded as presence/absence in order to be comparable to biallelic SNPs or IDPs). The null hypothesis assumes that SSRs are not involved in phenotypic variation and thus that P values over all SSR-phenotype associations should be uniformly distributed (i.e. the cumulative distribution of P values should follow the diagonal; Supplemental Fig. S1).

We showed that the naive model (ignoring genetic control, N curve) fails to correct for false-positive associations for almost all phenotypic traits, particularly for highly structured phenotypes (where population structure explains more than 10% of the phenotypic variance) such as TKW, KW, EW, EMB, or VITRO. For all of these traits, taking either population structure or individual kinship into account strongly reduces the excess of low P values, and controlling for both leads to an even flatter distribution. Supplemental Figure S1 shows that all three kinship coefficient estimates led to similar results for all phenotypes, although the KR and Q+KR models sometimes gave a better control of type I error rate than models involving KL or KZ (e.g. FT, LYSIN, EMB, LIP, TKW, KW, and EW). Finally, note that the KZ and Q+KZ models show many more cases of no convergence than other mixed models. The mixed model had been extensively used for the highly structured phenotype of flowering date (Yu et al., 2006; Gonzalez-Martinez et al., 2007; Zhao et al., 2007; Camus-Kulandaivelu et al., 2008). Our study confirms its efficiency in excluding false positives for kernel quality traits, and we consider that both population structure and individual kinship should be included in association studies of all phenotypic traits with candidate genes. Consequently, association studies presented here are based on both the Q model, which considers population structure only, and the Q+KR model, which is more complex but performs better for some phenotypic traits.

Associations between Kernel Quality Traits and Polymorphisms at CyPPDK1 and Opaque2 Genes. We first performed association analyses on the raw data, correcting for KDM and population structure, in order to test for SNP-location interaction and SNP-replicate interaction within each location. Since these analyses showed nonsignificant interactions and gave very similar SNP-phenotype associations to the ones performed per inbred line adjusted means, we do not present them here but rather present the results from Q and Q+KL models applied on adjusted means. For each gene and each phenotypic trait, we controlled for multiple testing using FDR (Benjamini and Hochberg, 1995). In these analyses, we could not use the improved FDR estimation method of Storey and Tibshirani (2003) since, for each phenotypic trait, only 14 and 15 associations were tested for CyPPDK1 and Opaque2, respectively. Finally, for each phenotypic trait separately, we tested the epistatic interactions of all 210 CyPPDK1-Opaque2 SNP combinations using both Q and Q+KL models and correcting for multiple testing (Storey and Tibshirani, 2003). In the mixed Q+KR model, we estimated genetic r² as the difference between (1) the squared correlation coefficient between observed and predicted values under the total model including covariates and SNPs and (2) the squared correlation coefficient between observed and predicted values under a reduced model including covariates only.

Sequence data from this article can be found in the GenBank/EMBL data libraries under accession numbers FJ935730 to FJ935747 for Opaque2 and FJ935748 to FJ935778 for CyPPDK1.

Supplemental Data
The following materials are available in the online version of this article.


Supplemental Figure S1. Cumulative distributions of P values for association tests between neutral SSR markers and maize phenotypes.
Supplemental Table S1. Adjusted mean values for phenotypic traits (FT–WALL; see Table I for definitions) and group memberships (G1–G5) used in association mapping on 375 maize inbred lines.
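
The check summarized in Supplemental Figure S1 can be illustrated with a short sketch. Under the null hypothesis, P values should be uniformly distributed, so their empirical cumulative distribution should follow the diagonal, and the largest vertical gap between the two gives a simple measure of the excess of low P values. The Python code below is only an illustration under stated assumptions: the function names `empirical_cdf` and `max_deviation_from_diagonal` are hypothetical, the P values are simulated rather than taken from this study, and nothing here reproduces the TASSEL analyses.

```python
import bisect
import random


def empirical_cdf(p_values, thresholds):
    """Return, for each threshold, the fraction of P values <= that threshold."""
    sorted_p = sorted(p_values)
    n = len(sorted_p)
    return [bisect.bisect_right(sorted_p, t) / n for t in thresholds]


def max_deviation_from_diagonal(p_values):
    """Largest vertical gap between the empirical cumulative distribution
    of the P values and the diagonal expected under a uniform distribution
    (the Kolmogorov-Smirnov distance to Uniform(0, 1))."""
    sorted_p = sorted(p_values)
    n = len(sorted_p)
    gap = 0.0
    for k, p in enumerate(sorted_p, start=1):
        # The empirical curve jumps from (k - 1) / n to k / n at p.
        gap = max(gap, k / n - p, p - (k - 1) / n)
    return gap


# Simulated P values (assumption: purely illustrative, not study data).
random.seed(1)
calibrated = [random.random() for _ in range(2000)]      # uniform under the null
inflated = [random.random() ** 3 for _ in range(2000)]   # excess of low P values

for label, pvals in (("calibrated", calibrated), ("inflated", inflated)):
    frac_below_005 = empirical_cdf(pvals, [0.05])[0]
    gap = max_deviation_from_diagonal(pvals)
    print(label, round(frac_below_005, 3), round(gap, 3))
```

A gap close to zero corresponds to a curve lying near the diagonal, whereas a large gap combined with a high fraction of P values below 0.05 corresponds to the behavior described for models that leave structure or kinship uncorrected.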

ACKNOWLEDGMENTS
We are grateful to J. Laborde, M. Dupin, P. Bertin, B. Gouesnard, D. Coubriche, S. Jouane, and P. Jamin for their contribution to seed management and field experiments and to L. Moreau and C. Dillmann for their advice for statistical analyses. We thank three anonymous reviewers for their relevant suggestions. We thank Prof. G. Noctor for his contribution to the English editing.

Received October 30, 2008; accepted March 23, 2009; published March 27, 2009.

LITERATURE CITED
Andersen JR, Schrag T, Melchinger AE, Zein I, Lubberstedt T (2005) Validation of Dwarf8 polymorphisms associated with flowering time in elite European inbred lines of maize (Zea mays L.). Theor Appl Genet 111: 206–217
Aranzana MJ, Kim S, Zhao K, Bakker E, Horton M, Jakob K, Lister C, Molitor J, Shindo C, Tang C, et al (2005) Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes. PLoS Genet 1: e60
Arruda P, Kemper EL, Papes F, Leite A (2000) Regulation of lysine catabolism in higher plants. Trends Plant Sci 5: 324–330
Austin DF, Lee M (1998) Detection of quantitative trait loci for grain yield and yield components in maize across generations in stress and nonstress environments. Crop Sci 38: 1296–1308
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Statist Soc Ser B Methodological 57: 289–300

Plant Physiol. Vol. 150, 2009


Benz BF (2001) Archaeological evidence of teosinte domestication from Guila Naquitz, Oaxaca. Proc Natl Acad Sci USA 98: 2104–2106
Benz BF, Long A (2000) Prehistoric maize evolution in the Tehuacan Valley. Curr Anthropol 41: 459–465
Blanc G, Charcosset A, Mangin B, Gallais A, Moreau L (2006) Connected populations for detecting quantitative trait loci and testing for epistasis: an application in maize. Theor Appl Genet 113: 206–224
Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23: 2633–2635
Burnell JN, Hatch MD (1986) Activation and inactivation of an enzyme catalyzed by a single, bifunctional protein: a new example and why. Arch Biochem Biophys 245: 297–304
Camus-Kulandaivelu L, Chevin LM, Tollon-Cordet C, Charcosset A, Manicacci D, Tenaillon MI (2008) Patterns of molecular evolution associated with two selective sweeps in the Tb1-Dwarf8 region in maize. Genetics 180: 1107–1121
Camus-Kulandaivelu L, Veyrieras JB, Gouesnard B, Charcosset A, Manicacci D (2007) Evaluating the reliability of structure outputs in case of relatedness between individuals. Crop Sci 47: 1–6
Camus-Kulandaivelu L, Veyrieras JB, Madur D, Combes V, Fourmann M, Barraud S, Dubreuil P, Gouesnard B, Manicacci D, Charcosset A (2006) Maize adaptation to temperate climate: relationship between population structure and polymorphism in the Dwarf8 gene. Genetics 172: 2449–2463
Carlborg O, Haley CS (2004) Epistasis: too often neglected in complex trait studies? Nat Rev Genet 5: 618–625
Cord Neto G, Yunes JA, Da Silva MJ, Vettore AL, Arruda P, Leite A (1995) The involvement of opaque-2 in beta-prolamine gene regulation in maize and Coix suggests a more general role of this transcriptional activator. Plant Mol Biol 27: 1015–1029
Crossa J, Burgueno J, Dreisigacker S, Vargas M, Herrera-Foessel SA, Lillemo M, Singh RP, Trethowan R, Warburton M, Franco J, et al (2007) Association analysis of historical bread wheat germplasm using additive genetic covariance of relatives and population structure. Genetics 177: 1889–1913
Ding JQ, Wang XM, Chander S, Li JS (2008) Identification of QTL for maize resistance to common smut by using recombinant inbred lines developed from the Chinese hybrid Yuyu22. J Appl Genet 49: 147–154
Doebley J, Stec A, Gustus C (1995) teosinte branched1 and the origin of maize: evidence for epistasis and the evolution of dominance. Genetics 141: 333–346
Ducrocq S, Madur D, Veyrieras JB, Camus-Kulandaivelu L, Kloiber-Maitz M, Presterl T, Ouzunova M, Manicacci D, Charcosset A (2008) Key impact of Vgt1 on flowering time adaptation in maize: evidence from association mapping and ecogeographical information. Genetics 178: 2433–2437
Duvick DN, Cassman KG (1999) Post-Green Revolution trends in yield potential of temperate maize in the north-central United States. Crop Sci 39: 1622–1630
Flint-Garcia SA, Thornsberry JM, Buckler ES IV (2003) Structure of linkage disequilibrium in plants. Annu Rev Plant Biol 54: 357–374
Flint-Garcia SA, Thuillet AC, Yu J, Pressoir G, Romero SM, Mitchell SE, Doebley J, Kresovich S, Goodman MM, Buckler ES (2005) Maize association population: a high-resolution platform for quantitative trait locus dissection. Plant J 44: 1054–1064
Fu YX (1997) Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147: 915–925
Gallusci P, Salamini F, Thompson RD (1994) Differences in cell type-specific expression of the gene Opaque 2 in maize and transgenic tobacco. Mol Gen Genet 244: 391–400
Gallusci P, Varotto S, Matsuoko M, Maddaloni M, Thompson RD (1996) Regulation of cytosolic pyruvate, orthophosphate dikinase expression in developing maize endosperm. Plant Mol Biol 31: 45–55
Goldman IL, Rocheford TR, Dudley JW (1993) Quantitative trait loci influencing protein and starch concentration in the Illinois long-term selection maize strains. Theor Appl Genet 87: 217–224
Gonzalez-Martinez SC, Wheeler NC, Ersoz E, Nelson CD, Neale DB (2007) Association genetics in Pinus taeda L. I. Wood property traits. Genetics 175: 399–409
Guillet-Claude C, Birolleau-Touchard C, Manicacci D, Fourmann M, Barraud S, Carret V, Martinant JP, Barriere Y (2004a) Genetic diversity associated with variation in silage corn digestibility for three O-methyltransferase genes involved in lignin biosynthesis. Theor Appl Genet 110: 126–135
Guillet-Claude C, Birolleau-Touchard C, Manicacci D, Rogowsky PM, Rigau J, Murigneux A, Martinant JP, Barriere Y (2004b) Nucleotide diversity of the ZmPox3 maize peroxidase gene: relationships between a MITE insertion in exon 2 and variation in forage maize digestibility. BMC Genet 5: 19
Gupta PK, Rustgi S, Kulwal PL (2005) Linkage disequilibrium and association studies in higher plants: present status and future prospects. Plant Mol Biol 57: 461–485
Habben JE, Kirleis AW, Larkins BA (1993) The origin of lysine-containing proteins in opaque-2 maize endosperm. Plant Mol Biol 23: 825–838
Hagenblad J, Tang C, Molitor J, Werner J, Zhao K, Zheng H, Marjoram P, Weigel D, Nordborg M (2004) Haplotype structure and phenotypic associations in the chromosomal regions surrounding two Arabidopsis thaliana flowering time loci. Genetics 168: 1627–1638
Hardy O, Vekemans X (2002) SPAGeDi: a versatile computer program to analyse spatial genetic structure at the individual or population levels. Mol Ecol Notes 2: 618–620
Harjes CE, Rocheford TR, Bai L, Brutnell TP, Kandianis CB, Sowinski SG, Stapleton AE, Vallabhaneni R, Williams M, Wurtzel ET, et al (2008) Natural genetic variation in lycopene epsilon cyclase tapped for maize biofortification. Science 319: 330–333
Hartings H, Maddaloni M, Lazzaroni N, Di Fonzo N, Motto M, Salamini F, Thompson R (1989) The O2 gene which regulates zein deposition in maize endosperm encodes a protein with structural homologies to transcriptional activators. EMBO J 8: 2795–2801
Henry AM, Manicacci D, Falque M, Damerval C (2005) Molecular evolution of the Opaque-2 gene in Zea mays L. J Mol Evol 61: 1–8
Hirel B, Bertin P, Quillere I, Bourdoncle W, Attagnant C, Dellay C, Gouy A, Cadiou S, Retailliau C, Falque M, et al (2001) Towards a better understanding of the genetic and physiological basis for nitrogen use efficiency in maize. Plant Physiol 125: 1258–1270
Ho C, McCouch R, Smith E (2002) Improvement of hybrid yield by advanced backcross QTL analysis in elite maize. Theor Appl Genet 105: 440–448
Jaenicke-Despres V, Buckler ES, Smith BD, Gilbert MTP, Cooper A, Doebley J, Paabo S (2003) Early allelic selection in maize as revealed by ancient DNA. Science 302: 1206–1208
Jannink JL, Moreau L, Charmet G, Charcosset A (2009) Overview of QTL detection in plants and tests for synergistic epistatic interactions. Genetica (in press)
Kemper EL, Neto GC, Papes F, Moraes KCM, Leite A, Arruda P (1999) The role of Opaque2 in the control of lysine-degrading activities in developing maize endosperm. Plant Cell 11: 1981–1994
Landry J, Delhaye S, Damerval C (2002) Comparative efficiencies of isopropyl and tert-butyl alcohols for extracting zeins from maize endosperm. J Agric Food Chem 50: 4131–4134
Landry J, Delhaye S, Damerval C (2004) Protein distribution pattern in floury and vitreous endosperm of maize grain. Cereal Chemistry 81: 153–158
Liu KJ, Goodman M, Muse S, Smith JS, Buckler E, Doebley J (2003) Genetic structure and diversity among maize inbred lines as inferred from DNA microsatellites. Genetics 165: 2117–2128
Liu X, Fu J, Gu D, Liu W, Liu T, Peng Y, Wang J, Wang G (2008) Genome-wide analysis of gene expression profiles during the kernel development of maize (Zea mays L.). Genomics 91: 378–387
Loiselle BA, Sork VL, Nason J, Graham C (1995) Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). Am J Bot 82: 1420–1425
Lou X, Zhu J, Zhang Q, Zang R, Chen Y, Yu Z, Zhao Y (2005) Genetic control of the opaque-2 gene and background polygenes over some kernel traits in maize (Zea mays L.). Genetica 124: 291–300
Mackay TF (2001) The genetic architecture of quantitative traits. Annu Rev Genet 35: 303–339
Maddaloni M, Donini G, Balconi C, Rizzi E, Gallusci P, Forlani F, Lohmer S, Thompson R, Salamini F, Motto M (1996) The transcriptional activator Opaque-2 controls the expression of a cytosolic form of pyruvate orthophosphate dikinase-1 in maize endosperms. Mol Gen Genet 250: 647–654
Marchini J, Donnelly P, Cardon LR (2005) Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet 37: 413–417



McMullen MD, Byrne PF, Snook ME, Wiseman BR, Lee EA, Widstrom NW, Coe EH (1998) Quantitative trait loci and metabolic pathways. Proc Natl Acad Sci USA 95: 1996–2000
McMullen MD, Snook M, Lee EA, Byrne PF, Kross H, Musket TA, Houchins K, Coe EH Jr (2001) The biological basis of epistasis between quantitative trait loci for flavone and 3-deoxyanthocyanin synthesis in maize (Zea mays L.). Genome 44: 667–676
Méchin V, Thevenot C, Le Guilloux M, Prioul JL, Damerval C (2007) Developmental analysis of maize endosperm proteome suggests a pivotal role for pyruvate orthophosphate dikinase. Plant Physiol 143: 1203–1219
Melchinger AE, Utz HF, Schon CC (1998) Quantitative trait locus (QTL) mapping using different testers and independent population samples in maize reveals low power of QTL detection and large bias in estimates of QTL effects. Genetics 149: 383–403
Nei M (1987) Molecular Evolutionary Genetics. Columbia University Press, New York
Neuffer MG, Sheridan WF (1980) Defective kernel mutants of maize. I. Genetic and lethality studies. Genetics 95: 929–944
Olsen KM, Halldorsdottir SS, Stinchcombe JR, Weinig C, Schmitt J, Purugganan MD (2004) Linkage disequilibrium mapping of Arabidopsis CRY2 flowering time alleles. Genetics 167: 1361–1369
Osterberg MK, Shavorskaya O, Lascoux M, Lagercrantz U (2002) Naturally occurring indel variation in the Brassica nigra COL1 gene is associated with variation in flowering time. Genetics 161: 299–306
Piperno DR, Flannery KV (2001) The earliest archaeological maize (Zea mays L.) from highland Mexico: new accelerator mass spectrometry dates and their implications. Proc Natl Acad Sci USA 98: 2101–2103
Piperno DR, Weiss E, Holst I, Nadel D (2004) Processing of wild cereal grains in the Upper Palaeolithic revealed by starch grain analysis. Nature 430: 670–673
Pritchard JK, Stephens M, Donnelly P (2000a) Inference of population structure using multilocus genotype data. Genetics 155: 945–959
Pritchard JK, Stephens M, Rosenberg NA, Donnelly P (2000b) Association mapping in structured populations. Am J Hum Genet 67: 170–181
Rami JF, Dufour P, Fliedel G, Mestres C, Davrieux F, Hamon P (1998) Quantitative trait loci for grain quality, productivity, morphological and agronomical traits in sorghum (Sorghum bicolor L. Moench). Theor Appl Genet 97: 605–616
Rebourg C, Chastanet M, Gouesnard B, Welcker C, Dubreuil P, Charcosset A (2003) Maize introduction into Europe: the history reviewed in the light of molecular data. Theor Appl Genet 106: 895–903
Remington DL, Thornsberry JM, Matsuoka Y, Wilson LM, Whitt SR, Doebley J, Kresovich S, Goodman MM, Buckler ES (2001) Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc Natl Acad Sci USA 98: 11479–11484
Ritland K (1996) Estimators for pairwise relatedness and individual inbreeding coefficients. Genet Res 67: 175–185
Rowe HC, Hansen BG, Halkier BA, Kliebenstein DJ (2008) Biochemical networks and epistasis shape the Arabidopsis thaliana metabolome. Plant Cell 20: 1199–1216
Salvi S, Sponza G, Morgante M, Tomes D, Niu X, Fengler KA, Meeley R, Ananiev EV, Svitashev S, Bruggemann E, et al (2007) Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize. Proc Natl Acad Sci USA 104: 11376–11381
SAS (1989) SAS/STAT User’s Guide. SAS Institute, Cary, NC
Scanlon MJ, Stinard PS, James MG, Myers AM, Robertson DS (1994) Genetic analysis of 63 mutations affecting maize kernel development isolated from Mutator stocks. Genetics 136: 281–294
Schmidt RJ, Burr FA, Aukerman MJ, Burr B (1990) Maize regulatory gene opaque-2 encodes a protein with a “leucine-zipper” motif that binds to zein DNA. Proc Natl Acad Sci USA 87: 46–50


Schmidt RJ, Ketudat M, Aukerman MJ, Hoschek G (1992) Opaque-2 is a transcriptional activator that recognizes a specific target site in 22-kD zein genes. Plant Cell 4: 689–700
Sheen J (1991) Molecular mechanisms underlying the differential expression of maize pyruvate, orthophosphate dikinase genes. Plant Cell 3: 225–245
Smith BD (1989) Origins of agriculture in eastern North America. Science 246: 1566–1571
Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proc Natl Acad Sci USA 100: 9440–9445
Strobeck C (1987) Average number of nucleotide differences in a sample from a single subpopulation: a test for population subdivision. Genetics 117: 149–153
Szalma SJ, Buckler ES IV, Snook ME, McMullen MD (2005) Association analysis of candidate genes for maysin and chlorogenic acid accumulation in maize silks. Theor Appl Genet 110: 1324–1333
Tajima F (1983) Evolutionary relationship of DNA sequences in finite populations. Genetics 105: 437–460
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585–595
Tenaillon MI, Sawkins MC, Long AD, Gaut RL, Doebley JF, Gaut BS (2001) Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.). Proc Natl Acad Sci USA 98: 9161–9166
Thevenot C, Simond-Cote E, Reyss A, Manicacci D, Trouverie J, Le Guilloux M, Ginhoux V, Sidicina F, Prioul JL (2005) QTLs for enzyme activities and soluble carbohydrates involved in starch accumulation during grain filling in maize. J Exp Bot 56: 945–958
Thornsberry JM, Goodman MM, Doebley J, Kresovich S, Nielsen D, Buckler ES (2001) Dwarf8 polymorphisms associate with variation in flowering time. Nat Genet 28: 286–289
Tracy WF, Chandler MA (2006) The historical and biological basis of the concept of heterotic patterns in Corn Belt Dent maize. In KR Lamkey, M Lee, eds, Plant Breeding: The Arnel Hallauer International Symposium. Blackwell, Ames, IA, pp 219–233
Veldboom LR, Lee M (1994) Molecular-marker-facilitated studies of morphological traits in maize. II. Determination of QTLs for grain yield components. Theor Appl Genet 89: 451–458
Verza NC, E Silva TR, Neto GC, Nogueira FT, Fisch PH, de Rosa VE Jr, Rebello MM, Vettore AL, da Silva FR, Arruda P (2005) Endosperm-preferred expression of maize genes as revealed by transcriptome-wide analysis of expressed sequence tags. Plant Mol Biol 59: 363–374
Watterson GA (1975) On the number of segregating sites in genetical models without recombination. Theor Popul Biol 7: 256–276
Wilson LM, Whitt SR, Ibanez AM, Rocheford TR, Goodman MM, Buckler ES IV (2004) Dissection of maize kernel composition and starch production by candidate gene association. Plant Cell 16: 2719–2733
Young VR, Scrimshaw NS, Pellett PL (1998) Significance of dietary protein source in human nutrition: animal and/or plant proteins? In JC Waterlow, DG Armstrong, L Fowden, R Riley, eds, Feeding a World Population of More Than Eight Billion People. Oxford University Press, Oxford, pp 205–221
Yu J, Buckler ES (2006) Genetic association mapping and genome organization of maize. Curr Opin Biotechnol 17: 155–160
Yu J, Pressoir G, Briggs WH, Vroh Bi I, Yamasaki M, Doebley JF, McMullen MD, Gaut BS, Nielsen DM, Holland JB, et al (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet 38: 203–208
Zhao K, Aranzana MJ, Kim S, Lister C, Shindo C, Tang C, Toomajian C, Zheng H, Dean C, Marjoram P, et al (2007) An Arabidopsis example of association mapping in structured samples. PLoS Genet 3: e4
