Global RNA sequencing reveals that genotype-dependent allele-specific expression contributes to differential expression in rice F1 hybrids

Background Extensive studies on heterosis in plants using transcriptome analysis have identified differentially expressed genes (DEGs) in F1 hybrids. However, it is not clear why yield in heterozygotes is superior to that of the homozygous parents or how DEGs are produced. Global allele-specific expression analysis in hybrid rice has the potential to answer these questions. Results We report a genome-wide allele-specific expression analysis using RNA-sequencing technology of 3,637–3,824 genes from three rice F1 hybrids. Of the expressed genes, 3.7% exhibited an unexpected type of monoallelic expression and 23.8% showed preferential allelic expression that was genotype-dependent in reciprocal crosses. Those genes exhibiting allele-specific expression comprised 42.4% of the genes differentially expressed between F1 hybrids and their parents. Allele-specific expression accounted for 79.8% of the genes displaying more than a 10-fold expression level difference between an F1 and its parents, and almost all (97.3%) of the genes expressed in F1, but non-expressed in one parent. Significant allelic complementary effects were detected in the F1 hybrids of rice. Conclusions Analysis of the allelic expression profiles of genes at the critical stage for highest biomass production from the leaves of three different rice F1 hybrids identified genotype-dependent allele-specific expression genes. A cis-regulatory mechanism was identified that contributes to allele-specific expression, leading to differential gene expression and allelic complementary effects in F1 hybrids.


Background
Heterosis, or hybrid vigor, refers to the superior biological functions of F 1 hybrids compared with their parental homozygous or inbred lines. This phenomenon was first described by Charles Darwin and was later independently rediscovered by George H. Shull and Edward M. East in 1908 [1][2][3]. Although it is not well understood at the molecular level, heterosis has been exploited over the past half-century in plant and animal breeding. Two classic hypotheses, "dominance" and "overdominance", have been proposed to explain hybrid vigor. The "dominance" hypothesis proposes that the detrimental allele from one parent is complemented by the superior allele from the other parent, and that the accumulated superior alleles in the F 1 hybrids give rise to heterosis. By comparison, the "overdominance" hypothesis signifies that hybrid vigor results from the interaction between alleles brought together in the hybrid [4].
To attempt to discriminate between these hypotheses, extensive studies of gene effects and transcriptomics have been conducted [5][6][7][8][9][10][11]. Genetic analyses have revealed the genetic effects of additive, overdominance, dominance, and epistasis, and that interactions between different loci are associated with heterosis in different varieties [5][6][7][8][9][10][11]. Several studies analyzing transcription have indicated that DEGs commonly occur in inbred lines as well as between F 1 hybrids and their parents. For example, 4-18% of maize genes are differentially expressed in different tissues of the maize inbred lines B73 and Mo17 according to microarray-based analyses [12]. In Arabidopsis seedlings, high-density single nucleotide polymorphism (SNP) analysis showed that 31% of all analyzed genes were differentially expressed between the parental inbred lines [13], while, in another study, 10.6% of genes were differentially expressed in different tissues of the hybrid rice LYP9 and its parents [14].
Recently, high-throughput RNA-sequencing technology revealed that 4-week-old shoots of the 93-11 and Nipponbare rice varieties had 24.0% DEGs, as did their reciprocal hybrids [15]. Moreover, a newly published global survey based on RNA sequencing technology found that approximately 70% of all expressed genes were differentially expressed between the two maize inbred parents B73 and Mo17, and that 42-55% were differentially expressed between the reciprocal F 1 and its parents [16]. Although the molecular basis of heterosis has been attributed to the DEGs in the above hybrids, the underlying mechanism(s) causing differential expression remain unknown.
Several studies have shown that only one allele is expressed in heterozygotes [17][18][19][20], and monoallelic expression or an imbalance in heterozygote allelic expression has been extensively studied in humans and other mammals [21][22][23]. Transcription profile analyses have indicated that monoallelic expression could be caused by X-chromosome silencing, autosomal imprinting, or random events. Some studies with vegetative tissues from maize F 1 hybrids identified several genes exhibiting allelespecific expression (ASE) [12,24,25], which differed markedly between the different F 1 hybrids and was altered in response to environmental stress. This could contribute to heterosis.
The objectives of the present study were to explore global ASE in hybrid rice and to reveal the mechanism of differential expression in F 1 hybrids using RNA sequencing. Three elite rice varieties were chosen that met the breeding objectives from different periods in China, Guangluai #4 (GL, 1970, Teqing (TQ, 1980, and 93-11 (1990s to present), plus their F 1 hybrids, which we show have different levels of heterosis. Two F 1 hybrids, GL × TQ and GL × 93-11, exhibited high heterosis, and the third, 93-11 × TQ, low heterosis. To obtain sufficient SNPs to distinguish the maternal and paternal alleles in F 1 hybrids, the genomes of the three parents were re-sequenced. To identify more SNPs for further ASE analysis, nuclear RNA was extracted from leaves of the three F 1 hybrids and their parents and subjected to Illumina RNA-Seq technology. We identify a global ASE profile that reveals a potential mechanism for an increased biomass-based, grain-yield heterosis.

Global ASE analysis by RNA sequencing
To obtain sufficient SNPs for ASE analysis, we achieved a rice genome coverage of 17.7-27.7 fold to satisfy the minimum requirement of obtaining more than 90% SNPs [26]. A total of 76.2-119.0 million reads (100 bp per read) from three rice varieties serving as parents for the F 1 s were obtained using Illumina DNA-Seq technology (Additional file 1: Table S1), and 375,744-411,571 SNPs were detected between each parent with 98.0% accuracy (Additional file 2: Figure S1A, Additional file 3: Table S2). A total of 89.5-114.1 million reads (90 bp per read) were obtained from analogous tissue from the three F 1 hybrids. Analysis disclosed that 29,064-29,928 genes at the secondary branch differentiation stage were expressed in the three F 1 hybrids and their parents (Additional file 1: Table S1). The expression levels of 30 genes in the F 1 hybrids and their parents were also determined, and quantitative real time polymerase chain reaction (qRT-PCR) and RNA-sequencing were shown to be highly correlated in terms of their analyses (expression level correlation coefficient r = 0.92-0.99 (Additional file 4: Table S3)). We found that 41,416-43,685 of the SNPs located in gene bodies had an average read coverage of 8.6-10.9 in the F 1 hybrids. These SNPs were available for allelic expression analysis (Additional file 2: Figure S1B and Additional file 5: Table S4). Of these, 9,752-10,818 genes accounting for 33.5-36.1% of the total number of expressed genes containing SNPs could be used for distinguishing the alleles (Additional file 6: Figure S2). A total of 3,627-3,824 genes met the criterion of at least 10 SNP reads, so could be used for further ASE profile analyses.
Through these ASE profile analyses, the F 1 hybrids were classified into three categories: monoallelic expression, in which only one allele from either the maternal or paternal parent is expressed; preferential allelic expression, in which expression levels differ by more than two-fold between alleles; and biallelic expression, in which expression levels vary by less than two-fold between alleles (Additional file 7: Figure S3). We found that 3.4-3.9% of the genes were monoallelic expression, 23.5-24.2% were preferential allelic expression, and 72.0-73.0% were biallelic expression ( Figure 1A-C). Further analysis of paternally and maternally derived read coverage in each heterozygote showed that the two parents contributed equally to the F 1 hybrids. Our results are consistent with those from Arabidopsis embryos [27], and suggest that no obvious parent-of-origin effect occurred in the vegetative tissues of the rice hybrids ( Figure 1A-C and Additional file 8: Figure S4).

A cis-regulation mechanism and genotype-dependent monoallelic expression
To understand the relationship between gene expression in the parents and ASE in the F 1 hybrids, the correlation between these parameters was determined for 3,627-3,824 genes, yielding correlation coefficients in the range of 0.70-0.76 (Figure 2A, P < 2.2E-16). A higher correlation coefficient was obtained from the same analysis with 2,026 ASE genes in the F 1 hybrids (Figure 2A, Pearson's r > 0.80, P < 2.2E-16). These results suggest that a cisregulatory mechanism is occurring, that is, if the allele is transcribed in the parent it is also transcribed in its F 1 hybrids, and if not in the parent then not in its F 1 hybrids. Our data also indicate that only 5.4-19.3% of gene expression in F 1 hybrids is non-additive. Trans-acting regulation may thus also contribute to the regulation of gene expression in rice hybrids, but would not have a major effect.
We identified 413 monoallelic expression genes in the three F 1 hybrids (143 from GL × TQ, 129 from GL × 93-11, and 141 from 93-11 × TQ) (Additional file 9: Table  S5). Of these, 108 were common to two F 1 hybrids, but none were common to all three ( Figure 2B). We randomly chose 134 monoallelic expression genes, accounting for 32.4% of the total, from the three hybrids and used reverse-transcription (RT)-PCR sequencing to verify a reliability of 91.8% (123/134) ( Figure 2C, Additional file 10: Table S6). Through further investigation of allelic expression patterns in three reciprocal crosses, GL × 93-11 and 93-11 × GL, GL × TQ and TQ × GL, and 9311 × TQ and TQ × 93-11, we detected 109 monoallelic expression genes and 14 preferential allelic expression genes and found that all monoallelic expression and preferential allelic expression genes tested exhibited a genotype-dependent expression pattern, while 17 biallelic expression genes showed no difference between reciprocal crosses ( Figure 2C-F, Additional file 11: Table S7). This shows that, regardless of the paternal or maternal origin in the reciprocal crossings, monoallelic expression and preferential allelic expression genes always express the allele from a given parent ( Figure 2C and E). Combining our results with previous observations from maize [24], we suggest that a hitherto overlooked type of monoallelic expression occurs in eukaryotic organisms.

ASE results in transcriptome divergence in the F 1 hybrids
Previous studies have demonstrated that higher levels of heterosis are associated with greater differences between the agronomic and/or metabolic traits of parents [28][29][30], and that DEGs (fold change >2.0, false discovery rate (FDR) <0.05) between F 1 hybrids and their parents are among the major factors leading to heterosis [14,31,32]. To ascertain the extent to which monoallelic expression and preferential allelic expression genes contribute to heterosis, we dissected the global DEGs between F 1 hybrids and their parents. The total number of DEGs in the three hybrids was correlated with heterosis level of both fresh and dry mass (r > 0.99, P < 0.03; Figure 3A-B).
To explore which mechanism(s) create DEGs in heterozygotes, we found that 95.1% of monoallelic expression genes, 50.3% of preferential allelic expression genes, and 30.4% of biallelic expression genes are DEGs ( Figure 3C). Surprisingly, we discovered that the monoallelic expression and preferential allelic expression genes, comprising 27.5% of the total analyzed genes, amounted to 42.7% of the DEGs ( Figure 3D). More specifically, the monoallelic expression genes, accounting for 3.7% of the total analyzed genes, gave rise to 10.0% of the DEGs, the preferential allelic expression genes (23.8%) amounted to 32.4%, and the biallelic expression genes (72.3%) were composed of only 57.6% of the DEGs ( Figure 3D). By determining contributions to DEGs between F 1 hybrids and their parents, we found that 57.2% of DEGs from monoallelic expression, 11.4% from preferential allelic expression, and 3.9% from biallelic expression exhibited a considerably greater than 10-fold difference. By contrast, 79.7% of DEGs from biallelic expression, 68.6% from preferential allelic expression, and 32.1% from monoallelic expression genes displayed a less than four-fold difference between the F 1 and their parents ( Figure 3E). The 71.6-83.5% of the genes that were expressed in the F 1 s but silenced in one of the two parents showed monoallelic expression patterns ( Figure 3F). These results demonstrate that the majority of DEGs (>10 fold difference) are attributable to ASE, and that monoallelic expression genes in particular play an important role in gene expression divergence between F 1 hybrids and their parents.

Complementary effects are mainly contributed by ASE genes
Because the presence of a cis-regulatory mechanism of allelic expression would be in accordance with the "dominance" hypothesis, we analyzed the transcriptomic profiles of all ASE genes in the three F 1 hybrids and their parents. A large number of genotype-dependent monoallelic expression genes in F 1 hybrids would also be in accordance with the "dominance" hypothesis. The results showed that an average of 51.6% (53.1% in GL × TQ, 55.0% in GL × 93-11, and 46.8% in 93-11 × TQ) of genes expressed in one parent and non-expressed in the other were categorized as monoallelic expression genes in F 1 hybrids ( Figure 4A class IV and Additional file 12: Table S8), whereas 30.2% (29.4, 27.1, and 34.0% in GL × TQ, GL × 93-11, and 93-11 × TQ, respectively) of genes expressed at low levels in either parent, but enhanced in F 1 hybrids, were categorized as monoallelic expression genes ( Figure 4A class  III and Additional file 12: Table S8). Most of both types of monoallelic expression genes exhibited a mid-parent expression level (Additional file 12: Table S8). Therefore, the alleles only expressed in the F 1 were those expressed in the inbred parent, while the alleles silenced in the inbred parent were also silenced in F 1 . This means that the total expression level of the gene in the F 1 is one half that in the parent, which is equivalent to a mid-parent expression level ( Figure 4A class IV and Additional file 12: Table S8). This dosage effect implies that a cis-regulatory mechanism is acting.
Moreover, we found that more monoallelic expression genes were downregulated than upregulated in F 1 (Additional file 13: Table S9), and that 9.7% of preferential allelic expression genes exhibited the same patterns as monoallelic expression genes in the F 1 hybrids ( Figure 4B classes III and IV). The analysis of genes within the categories of activated and enhanced expression in F 1 hybrids found complementary effects of superior alleles from both parents with average values of 73.8% and 93.6% for class III and class IV genes, respectively. By contrast, a biallelic expression pattern was exhibited by only 26.2% and 6.4% in class III and class IV genes, respectively ( Figure 4C and Additional file 14: Table S10). The proportion of genes with complementary effects in GL × TQ and GL × 93-11 was higher than that in 93-11 × TQ (7.8% and 7.5% versus 4.7%, respectively), which was consistent with the heterosis level of these F 1 hybrids (Additional file 14: Table S10). Our data imply that ASE genes, most notably monoallelic expression genes, are the main contributors to allelic complementary effects in hybrid rice.

ASE genes have diverse biological functions in F 1 hybrids
To ascertain the molecular and biological functions of monoallelic expression and preferential allelic expression genes, we performed gene ontology (GO) analysis. Four GO terms for molecular functions, nucleotide binding, receptor activity, protein binding, and kinase activity, were commonly found among both monoallelic expression and preferential allelic expression genes (Table 1  and Additional file 15: Tables S11 and Additional file 16:  Table S12). These functions indicate that monoallelic expression genes are mainly involved in protein modification, signal transduction, and response to endogenous stimulus pathways. A wider diversity of functions, including biosynthesis, morphogenesis, and carbohydrate metabolism, was found for preferential allelic expression genes (Table 1 and Additional file 17: Tables S13 and  Additional file18: Table S14), suggesting that ASE genes have important roles in biosynthesis, development, and their regulation. To distinguish between the enrichment of monoallelic expression and preferential allelic expression gene functions and that of all genes for ASE analysis, we performed GO analysis of 3,627-3,824 genes in three F 1 hybrids. Limited GO terms were common to monoallelic expression and preferential allelic expression genes in different F 1 hybrids (Additional files 19: Figures  S5 and Additional file 20: Figure S6), which might be a consequence of specific monoallelic expression and preferential allelic expression genes occurring in different F 1 hybrids, suggesting that allele-specific expression genes have different roles in different genomic backgrounds.

Discussion
Previous studies of transcriptome analyses related to heterosis have mainly been conducted using microarray analyses, and RT-PCR and RNA-sequencing technology [15,31,33]. Here, we combined RNA-sequencing with DNA re-sequencing technology to establish ASE assays and achieved a 20-fold coverage of rice genome resequencing data to identify SNPs. A strict statistical cutoff for SNP calling enabled fully quantitative analyses of both overall and allele-specific gene expression profiles to be obtained for rice leaves at the stage of secondary branch differentiation. These data were verified by PCRsequencing and RT-PCR sequencing using analogous materials planted in the following year. This developing stage is important as grain yield is directly correlated with the biomass established early in vegetative growth [34]. SNP accuracies of 98.0% and 91.8% were confirmed, indicating the reliability of both genome and transcriptome data, respectively.
We identified 413 monoallelic expression and 2,659 preferential allelic expression genes in the three F 1 hybrids via a global transcriptome analysis. Of the total genes analyzed, 3.4-3.9% exhibited monoallelic expression and 23.5-24.2% exhibited preferential allelic expression patterns. The proportion of ASE genes, moreover, did not differ significantly in the three F 1 hybrids, which is similar to a previous report in maize hybrids [24]. Importantly, our results indicated that all monoallelic expression and preferential allelic expression genes tested exhibited genotype-dependent expression patterns in reciprocal crosses, which contrasts with the findings of monoallelic expression genes in humans and other mammals [21,45]. The observed genotype-dependent ASE in the vegetative tissue of hybrid rice could represent a common mechanism of allelic complementary effects in hybrids, and show the importance of parental genotype in both crossbreeding and hybrid breeding. Given the number of genes with genotype-dependent monoallelic expression involved in a wide range of GO categories that play important biological functions, many potential opportunities exist for them to contribute to DEGs and to produce the diverse phenotypes of the F 1 hybrids.
In the present study, the genotype-dependent ASE genes might confer a fitness advantage to the heterozygous state relative to either of their homozygous parents. As most monoallelic expression genes were silenced or suppressed in one of the parents, we speculate that genotype-dependent monoallelic expression might be the consequence of artificial parent selection by breeders. To meet the breeding objectives, "superior alleles" are accumulated in the elite parent for long-term selection ( Figure 4A). Such alleles could maintain allelic diversity in different varieties and enlarge the allele difference in the rice germplasm. Therefore, our findings of allelic complementary effects in F 1 hybrid rice could guide the selection of elite parents through complementing a variety of the superior alleles.
Previous studies have indicated that differential gene expression is common between F 1 and parents, and that it is a major contributor to heterosis at the transcription level [14,31]. For example, hundreds of differentially expressed genes were detected at different developmental stages between the elite hybrid rice LYP9 and its parents [14], and analogous results were obtained from studies with Arabidopsis and maize [28,32,46,47]. Our study also found that the percentage of DEGs between F 1 hybrids and parents exactly correlated with the heterosis level of each F 1 hybrid ( Figure 3A and B).
Such results do not divulge, nonetheless, how the alleles from inbred lines become the DEGs in heterozygotes, although the mechanism behind this is revealed to some extent by genome-wide, ASE analysis using highthroughput RNA sequencing. Moreover, in the present study, the most significant DEGs between the F 1 and parent were expressed in the F 1 but silenced in one of the parents, accounting for 93.7% of the DEGs associated with ASE. Of these, 15.8% showed preferential allelic expression and 77.8% were monoallelic expression. Altogether, 27.5% of the ASE genes contributed to 42.4% of the total number of DEGs. Although the proportion of ASE genes was similar in the three F 1 hybrids, the total numbers of DEGs differed, with fewer detected in 93-11 × TQ ( Figure 3B) with its lower heterosis level ( Figure 3A) and more found in GL × TQ and GL × 93-11 with their higher heterosis levels. The same results were also found between their parents, suggesting that transcriptome divergence in F 1 hybrids is attributed to transcriptome divergence between parents. Crucially, increased parent allelic variation could be an important strategy for maintaining higher heterosis levels in hybrid rice breeding programs through enlarging the transcriptome diversity between parents, which have accumulated different sets of superior alleles.
A recent study has provided molecular evidence for a single gene model to support the "overdominance" hypothesis of heterosis in tomato hybrids [48]. Because of technological limitations, however, the maternal and paternal alleles in the F 1 hybrids could not be distinguished effectively. The recent development of high-throughput sequencing technology provides the opportunity to study ASE in heterozygotes. Many genes have exhibited additive expression patterns in F 1 hybrids in previous studies, and complementary effects at the gene expression level have been reported in hybrid maize [16]. Our data derived from global allelic expression profiles extend this result to hybrid rice, and also reveal the mechanism of DEG formation in heterozygotes. Our observed high correlation (>0.7) between allelic expression in F 1 hybrids and gene expression in parental lines indicates that a cis-regulated mechanism plays an essential role in allelic expression. Allelic complementation effects, moreover, can be the outcome of a cis-regulatory mechanism mainly contributed to by ASE genes.
The findings of the present study support the "dominance" hypothesis for indica hybrid rice varieties and reveal that the consequences of complementation, primarily by genotype-dependent monoallelic expression and preferential allelic expression alleles, are an accumulation of "superior" alleles that confer monoallelic expression and preferential allelic expression genes in F 1 hybrids. The expression of these "superior" alleles offers an opportunity to optimize the transcriptomes that give rise to heterosis in F 1 hybrids. Although our results revealed that allelic complementary effects played a major contribution to gene expression in hybrid rice and support the "dominance" hypothesis, they do not exclude the contribution of different mechanisms to heterosis in hybrid rice and other crops.

Conclusions
Allelic expression profiles in hybrid rice determined by RNA-sequencing technology demonstrated a type of genotype-dependent monoallelic expression genes in plants. DEGs between parents and F 1 hybrids were mainly attributable to ASE genes, which gave rise to the observed allelic complementary effects in F 1 hybrids.

Plant material and phenotypic analysis
Reciprocal crosses were made between the three indica varieties, Guangluai #4 (GL), 93-11, and Teqing (TQ), at Hainan Island, China in the spring of 2009. The six reciprocal F 1 hybrids were planted together with the three parents in Wuhan, China in the summer of 2009. Triplicate plots, each containing 30 individuals, were planted for all nine genotypes. Heterosis levels were evaluated by measuring the fresh and dry biomass of the above-ground parts of the plants at the secondary branch differentiation stage, which is critical for determining the highest biomass and maximum grain yield. The mid-parent heterosis level was calculated as described by Ma et al. [49]. Analogous leaves from the F 1 hybrids and parents at the same development stages in 2010 were used to verify gene and allelic expression and genotype-dependent monoallelic expression.

Nuclear RNA extraction
The second fully expanded leaves were harvested at the secondary branch differentiation stage, immediately frozen in liquid nitrogen and stored at −80°C. Material from the triplicate plots was pooled at harvest. Nuclei were isolated from~10 g of frozen leaves using the Plant Nuclei Isolation/Extraction Kit (Sigma, St. Louis, MO, USA). Total hnRNA was extracted from nuclei using Trizol (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions, and treated with RNase-free DNase I (New England Biolabs, Ipswich, MA, USA) to remove any contaminating genomic DNA.

Library construction
The Illumina mRNA-Seq Sample Prep Kit (Illumina, San Diego, CA USA) was used to prepare the sequencing library with 3 μg of nuclear RNA. Fragmentation buffer in the kit was added directly to hnRNA to produce short fragments of 200-700 bp, which served as the templates for first-strand cDNA synthesis using random hexamers. Second-strand cDNA was synthesized followed the protocol described in the kit and was purified using a QIAquick PCR Extraction Kit (Qiagen, Valencia,CA USA) and eluted in elution buffer (EB). The short fragments were then ligated to sequencing adapters. Suitable fragments of approximately 200 bp were selected as templates for amplification in a MyCycler PCR instrument (Bio-Rad, Hercules, CA USA) with the following program: denaturation at 98°C for 30 s followed by 15 cycles of 98°C for 10 s, 65°C for 30 s, and 72°C for 30 s plus a terminal hold at 72°C for 5 min. The samples were then purified using the QIAquick PCR Purification Kit according to the manufacturer's protocol and eluted in 30 μL of EB. A 1-μL aliquot of the construct was loaded onto an Agilent Technologies 2100 Bioanalyzer using the Agilent DNA 1000 Chip Kit (Agilent, Santa Clara, CA USA). After verifying the size and purity of the DNA fragments, the library was sequenced using an Illumina GA II x platform by BGI (Shenzhen, China).
The DNA Illumina TruSeq DNA sample preparation kit was used to prepare a genomic DNA library according to the manufacturer's protocol. Contaminating RNA was removed by treating with RNase A. The DNA samples were sent to the Beijing Novo Gene Company (Beijing, China) for sequencing using an Illumina Hiseq 2000 platform.

Read alignment
The raw reads were filtered before data analysis by removing reads consisting of adaptors only, those with greater than 10% unknown bases, and those in which more than half of the bases gave a quality score of less than 5.0. The remaining (clean) reads were mapped to the reference genome of Japonica variety Nipponbare (http://www.gramene.org/) using SOAP2 software [50]. Mismatches of no more than two bases were allowed in the alignment. We used the reads per kb per million reads (RPKM) method to calculate unique gene expression levels [51].

SNP identification
SOAP2 was used to align each read to the Nipponbare reference genome (http://www.gramene.org/), with no more than three mismatches to the candidate SNPs permitted for each read [50]. This method of SNP calling has been previously described [46]. A statistical model based on Bayesian theory and the Illumina quality system was used to calculate the probability of each possible genotype at every position from the alignment of reads to the reference genome. Six criteria were set to filter out unreliable SNPs: 1) the read quality value must be no lower than 20, 2) the SNPs must be at least 5 bp from each other, 3) the SNP must be represented by at least four reads, 4) the sequencing depth must be less than 10,000, 5) the SNP must be more than 5 bp distant from an intron-exon junction and 6) the approximate copy number of the flanking sequences must be less than four. SNP sets from each biological replicate of GL versus TQ, GL versus 93-11, and 93-11 versus TQ were obtained and used for further allele-specific analysis.

ASE analysis
The paternal and maternal alleles expressed in the F 1 hybrid transcriptomes were distinguished by their SNP nucleotype. The expression level was calculated based on at least 10 reads of single genes. The expression level from a paternal or maternal allele was calculated based on the number of reads for the given allele divided by the total number of reads for the SNP. When only one allele was expressed, the gene was categorized as showing monoallelic expression. When the allele expression level was biased toward one parent by more than twofold, the gene was categorized as showing preferential allelic expression. When the two alleles were expressed equally (less than two-fold difference), the gene was categorized as showing biallelic expression. In our computer analysis, the relative allele expression value of "0", which occurred when an allele was not expressed, could not be computed. Therefore, to enable calculations, the "0" was replaced by 0.001.

SNP confirmation and validation of allelic expression
For SNP validation, primers flanking the SNP sites were designed to amplify 300-800 bp fragments. PCR amplification was performed using reaction mixtures containing 10 ng of cDNA template and 5 μM primer. PCR was performed using the MyCycler PCR system with the following parameters: denaturation at 95°C for 5 min followed by 30 cycles of 95°C for 30 s, 50-55°C for 30 s, and 72°C for 30 s plus a terminal hold at 72°C for 5 min. PCR products were purified using the Axy PCR Cleanup Kit (AxyGEN Bioscience, Union city, CA USA) and sequenced using an ABI 3730xI machine. The resulting sequences were compared with reference SNPs detected by genome sequencing.
For monoallelic expression and preferential allelic expression validation, cDNA and gDNA from each F 1 sample were used as PCR templates, and the sequences derived from the genome and transcriptome were compared. For preferential allelic expression and biallelic expression validation, cDNA from reciprocal F 1 hybrids was used as PCR templates. The sequencing results for cDNA from reciprocal F 1 hybrids were compared to detect parent-of-origin effects.

GO and statistical analyses
GO analysis was performed using the open-source MAS3 database (http://bioinfo.capitalbio.com/mas3/). A threshold of a two-fold change in gene expression levels and a FDR of <0.05 were used to identify differentially expressed genes. The P-values and the FDRs of differentially expressed genes were calculated as described [52].

Gene expression level validation
To validate the gene expression level, 30 randomly selected genes with different expression levels were verified by quantitative RT-PCR as described by Wang et al. [53].