Validating internal controls for quantitative plant gene expression studies
© Brunner et al. 2004
Received: 20 December 2003
Accepted: 18 August 2004
Published: 18 August 2004
Skip to main content
© Brunner et al. 2004
Received: 20 December 2003
Accepted: 18 August 2004
Published: 18 August 2004
Real-time reverse transcription PCR (RT-PCR) has greatly improved the ease and sensitivity of quantitative gene expression studies. However, accurate measurement of gene expression with this method relies on the choice of a valid reference for data normalization. Studies rarely verify that gene expression levels for reference genes are adequately consistent among the samples used, nor compare alternative genes to assess which are most reliable for the experimental conditions analyzed.
Using real-time RT-PCR to study the expression of 10 poplar (genus Populus) housekeeping genes, we demonstrate a simple method for determining the degree of stability of gene expression over a set of experimental conditions. Based on a traditional method for analyzing the stability of varieties in plant breeding, it defines measures of gene expression stability from analysis of variance (ANOVA) and linear regression. We found that the potential internal control genes differed widely in their expression stability over the different tissues, developmental stages and environmental conditions studied.
Our results support that quantitative comparisons of candidate reference genes are an important part of real-time RT-PCR studies that seek to precisely evaluate variation in gene expression. The method we demonstrated facilitates statistical and graphical evaluation of gene expression stability. Selection of the best reference gene for a given set of experimental conditions should enable detection of biologically significant changes in gene expression that are too small to be revealed by less precise methods, or when highly variable reference genes are unknowingly used in real-time RT-PCR experiments.
For many years the vast majority of gene expression studies have employed non-quantitative or semi-quantitative RNA gel blots and RT-PCR analysis. Real-time PCR technology has removed many of the difficulties associated with quantitative gene expression studies , and real-time quantitative RT-PCR (qRT-PCR) is rapidly being adopted as a standard method for in-depth expression studies, including studies of alternative splicing, verification of microarray expression results, and molecular diagnostics [2–5].
Real-time qRT-PCR offers a robust means for precisely quantifying changes in gene expression over a wide dynamic range. It is also applicable to experiments where RNA amounts are limiting, such as for micro-dissected tissues. However, selection of an appropriate normalization method is crucial for reliable quantitative gene expression results [1, 6]. The purpose of normalization is to correct for non-specific variation, such as differences in RNA quantity and quality, which can affect efficiencies of the RT and PCR reactions.
Normalization to total RNA content poses a number of problems. It is difficult to quantify small amounts of RNA, and variation in RT and PCR reaction efficiencies are not accounted for by this method. Similarly, normalization to an external RNA standard is problematic due to RNA instability. The most commonly used method is relative quantitation, whereby gene expression level is normalized to that of an internal reference gene. While this avoids the problems and limitations of absolute quantitation, selection of a proper internal control–gene expressed at a nearly constant level in all tissue samples being investigated–is required. Failure to use an appropriate control gene may result in biased gene expression profiles, as well as low precision. The consequences may be that only gross changes in expression level are declared statistically significant, or that patterns of expression are erroneously characterized.
Until recently, internal controls (often referred to as housekeeping or maintenance genes), were selected based on stability of expression in qualitative studies (e.g., visual examination of RNA gel-blots), via low-sensitivity assays such as densitometry of hybridized blots, or via semi-quantitative RT-PCR. None of these will be adequate for identifying reliable internal controls for real-time qRT-PCR. For example, expression profiling via real-time qRT-PCR of 10 commonly used human internal control genes revealed different degrees and patterns of expression among 13 tissue types, and no single gene was a suitable universal control for all tissue types . Although 18S rRNA is frequently used as an internal control, it is far from ideal. It requires the use of total RNA and random primers for the RT reaction, and is expressed at very high levels; some means for attenuating 18S expression might be needed when weakly expressed genes are studied. In addition, there can be imbalances in rRNA and mRNA fractions between different samples, and 18S is not always expressed at a constant level in all conditions . Finally, 18S expression levels appear to be affected to a lesser extent by partial RNA degradation than are mRNA expression levels .
Studies in mammalian and microbial systems, where real-time qRT-PCR has been most extensively applied to date, have begun to include evaluations of various housekeeping genes for normalization [7–11]. Vandesompele et al.  recognized the importance of using statistical approaches to selecting the best internal controls for a given set of samples, and developed a procedure to select internal controls based on the mean pairwise variation of a gene from all other tested control genes. The adoption of real-time qRT-PCR methodology is somewhat reminiscent of the introduction of cDNA expression microarrays in that initial microarray studies did not identify differentially expressed genes by a statistical method, but by an arbitrary cut-off value of fold-change . Similarly, the first real-time qRT-PCR studies have generally normalized expression levels to an internal control that is assumed to be valid rather than one that has been shown to be valid by statistical analysis of data. More rigorous methods will be needed as qRT-PCR is increasingly applied to study of regulatory genes, and for verifying patterns observed in microarray experiments.
In this study, we used real-time qRT-PCR to examine the expression of 10 housekeeping genes in a diversity of poplar (Populus trichocarpa × P. deltoides, cottonwood hybrid) tissues collected at different developmental stages, and at different times of the year. The goal of our studies was to detect changes associated with seasonal development and tree aging for several regulatory genes. We therefore undertook a study to compare the stability of several potential control genes. We found that the genes tested exhibited very different degrees of variation in expression among tissue samples, and that a statistical and graphical method helped us to select the genes best suited for the developmental studies we were conducting. This approach, which is very similar to a classical method used by plant breeders to assess the relative stability in yield of different varieties , can be applied to any gene or set of tissues to identify the most stable internal controls.
Description of poplar genes and primers for real-time PCR.
GenBank accession numberb
Arabidopsis homolog locusc
Arabidopsis locus description
BLASTX score/ E value
Primer sequences (forward/reverse)
CACACTGGAGTGATGGTTGG / ATTGGCCTTGGGGTTAAGAG
CCCATTGAGCACGGTATTGT / TACGACCACTGGCATACAGG
GGCTAATTTTGCCGATGAGA / ACGTCCATCCCTTCAACAAC
tubulin alpha-3/alpha-5 chain
AGGTTCTGGTTTGGGGTCTT / TTGTCCAAAAGCACAGCAAC
tubulin beta-9 chain
GCACCAACTTGTTGAGAATGC / TTTCAACTGACCAGGGAACC
GTTGATTTTTGCTGGGAAGC / GATCTTGGCCTTCACGTTGT
TGAGGCTTAGGGGAGGAACT / TGTAGTCGCGAGCTGTCTTG
similarity to eukaryotic translation initiation factor 4B
AAAAAGGGGATTTGGGATTG / AACTTCGTCCTCGGTAGCAA
elongation factor 1-beta, putative
AAGAGGACAAGAAGGCAGCA / CTAACCGCCTTCTCCAACAC
18S ribosomal RNA
Poplar tissues used for gene expression studies.
Collection date (2001)
Overwintered terminal vegetative buds
Overwintered terminal vegetative buds approximately 1 week prior to bud flush
Newly expanding shoots (average shoot elongation = 38 mm)
Shoot tips, including unexpanded leaves
Shoot tips, including unexpanded leaves
Terminal vegetative buds
Terminal vegetative buds
Within a single experiment, aliquots of the same cDNA synthesis reaction were used for real-time PCR amplification of each of the 10 genes and all gene primer and cDNA combinations were amplified in triplicate in a single PCR run. The entire experiment was then repeated a second time and results combined for statistical analysis. Quantitation via real-time PCR is based on cycle threshold (CT). CT is the cycle at which a significant increase in amount of PCR product (measured by increase in fluorescence) occurs, generally the middle of the exponential phase of amplification.
Summary of statistics measuring stability of gene expression
In addition to constancy of expression level, the expression level of an internal control compared to that of the genes being analyzed might be important to consider in certain cases. In our study, two of the most stably expressed genes represented opposite ends of the spectrum. UBQ is highly expressed (mean CT = 15.8), whereas TUA is expressed at a much lower level (mean CT = 28.9) (Table 3). For the samples we tested, the high stability of UBQ and TUA expression indicate that use of either as a single internal control gene is appropriate. However, for some studies, no single gene may be adequate. In these cases, a method for normalization to two or more of the most stable internal control genes identified might be necessary. For example, normalizing to the geometric mean of selected internal control genes . A potential strategy to avoid the additional expense and labor of using multiple internal control genes is to design a PCR primer pair that will amplify two or more members of a control gene family, whose combined expression level may exhibit the desired expression level and stability.
Our primers were designed based on a limited EST set that likely did not include all family members, and ESTs vary in sequence quality. Thus, primers could have amplified more than one family member or primer mismatches due to EST sequence errors could have lowered PCR efficiency. Although gel and real-time PCR dissociation curve analyses did not indicate that multiple genes were amplified with our primer sets, these analyses might not detect multiple amplicons from different family members that are the same size and have the same PCR efficiency. As discussed above this is not necessarily a detriment–amplification of multiple family members might result in a more stable internal control than single gene amplification. In addition, the upcoming release of a large poplar unigene set and annotated genome sequence  will improve gene selection and primer design capabilities.
Using ANOVA and linear regression analysis, we demonstrated that levels of expression stability among a number of potential control genes can vary widely, and that it is not difficult, costly or labor-intensive to test a number of genes. Moreover, such validation tests might have the additional benefit of revealing technical problems, such as excessive variability in RT and PCR efficiency due to RNA quality or inconsistent pipetting.
For some experiments, choice of an internal control is straightforward. For example, a number of housekeeping genes should be satisfactory controls for comparisons of transgene expression level in the same tissue type from different transgenic lines grown under identical conditions. However, this is not the case for studies that compare gene expression among different tissue or cell types, at different developmental stages, or under different environmental conditions, as were represented in our study of trees in field environments over a period of seven months. For such studies, internal controls should be carefully tested and validated. Statistical confirmation of internal controls for qRT-PCR should enable previously indiscernible small changes in expression level to be reliability detected.
Tissues were collected from five or six year-old ramets (genetically identical trees) of a single female poplar hybrid clone (P. trichocarpa × P. deltoidies) over a seven-month period in 2001 (Table 1). The trees had been growing in commercial plantations in the Columbia River basin northwest of Portland, Oregon USA. Bud scales were removed and tissues were frozen in liquid N2 and stored at -80°C until RNA extraction. Total RNA was isolated using the RNeasy mini kit (Qiagen, Valencia, CA, USA) with modifications. Tissues (0.2 g) were ground to a fine powder with mortar and pestle in liquid N2. The powder was added to a tube containing 1 ml of RNeasy RLT buffer and 0.01 g soluble polyvinylpyrrolidone (PVP-40; Sigma, St. Louis, MO, USA), and homogenized using a polytron for approximately 30 sec. Four volumes of 5 M Potassium acetate, pH 6.5 was added to the homogenate, the mixture was incubated on ice for 15 min, and the precipitate removed by a 15 min centrifugation (12,000 rpm) at 4°C. Supernatant was transferred to two 1.5 ml microcentrifuge tubes and 0.5 volume of 100% EtOH was added. Samples were transferred to RNeasy mini columns and the remaining steps were as directed by the manufacturer's instructions for plant RNA isolation (steps 6–11). RNA was quantified using spectrophotometric OD260 measurements and quality was assesed by OD260/ OD 280 ratios and by electrophoresis on 1% formaldehyde agarose gels followed by ethidium bromide staining. RNAs were stored at -80°C.
To identify poplar homologs of genes commonly used as controls for plant gene expression studies, we queried poplar EST databases with Arabidopsis protein sequences using TBLASTN . Selected poplar ESTs were then used to query the Arabidopsis protein database using BLASTX (Table 2). Primers were designed using Primer3 software  or Primer Express (Applied Biosystems, Foster City, CA, USA) with melting temperatures of 59–60°C. By comparison to related poplar EST sequences, primers were designed to be as specific as possible for the selected gene family member. All primer pairs were initially tested via standard RT-PCR using the same conditions as described below for real-time RT-PCR. Amplification of single products of expected size was verified by electrophoresis on 3% agarose-1000 (Invitrogen, Carlsbad, CA, USA) and ethidium bromide staining.
Contaminating DNA was removed from RNA samples using the DNA-Free kit (Ambion, Austin, TX, USA) according to the manufacturer's protocol, and two-step real-time RT-PCR performed. cDNA was synthesized from 5 μg of RNA using the SuperScript first-strand synthesis system for RT-PCR (Invitrogen) with random hexamer primers according to the manufacturer's instructions, except that the initial 65°C denaturation step was omitted. The cDNAs were diluted 1:5 with nuclease-free water. Aliquots of the same cDNA sample were used with all primer sets for real-time PCR, and amplification reactions with all primer sets were performed in the same PCR run. Reactions were done in a 25 μl volume containing 600 nM of each primer, 6.5 μl of cDNA sample (≈320 ng of input RNA) and 1X SYBR Green PCR master mix (Applied Biosystems). For 18S amplification, primer concentration was 50 nm. Real-time PCR was performed on the ABI Prism 7700 Sequence Detection System (Applied Biosystems) in a 96-well reaction plate using the parameters recommended by the manufacturer (2 min. at 50°C, 10 min. at 95°C and 40 cycles of 95°C for 15 sec and 60°C for 1 min.). Each PCR reaction was performed in triplicate and no-template controls were included. Specificity of the amplifications was verified at the end of the PCR run using ABI Prism Dissociation Curve Analysis Software. The entire experiment, including both the RT and real-time PCR steps, was repeated, giving a total of two experimental replications.
Results (CT values) from the ABI PRISM 7700 Sequence Detection System were analyzed in Microsoft Excel. Single factor ANOVA and regression analysis using the least squares method were performed using the Excel Analysis ToolPak. Assumptions concerning homogeneity of variance and normality were evaluated from inspection of residuals (the difference between an observed value and overall mean for all genes) from the ANOVA (Fig. 1). The level and significance of the difference between gene expression levels in different samples were evaluated by Fisher's F statistic [F = between-tissue-sample mean square / error mean square] assuming the three replicate PCR reactions approximated variance between fully independent observations. Other statistics are as defined in Table 3.
The general procedure for data analysis to compare genes for use as internal controls was:
1) Generate data from multiple analyses of gene expression via quantitative RT PCR that can be assumed to be statistically independent (or nearly so), including from multiple independent samples that bracket the experimental conditions of interest.
2) Conduct ANOVA to examine the extent of variation among samples, and (optionally) test their significance using appropriate F-ratios. Examine plot of residuals vs. mean expression level, or use a statistical test, to check normality of data.
3) Genes showing high variance among tissues in ANOVA, especially if accompanied by large mean square errors (or coefficients of variation), are to be avoided as controls.
4) Calculate mean expression level over all genes studied for each sample type as an index of both experimental and biological conditions that promote high levels of measured expression. Use of a large number of genes and tissue samples (e.g., at least five, and preferably many more) are desirable where estimates of stability are to be compared between studies.
5) Regress mean expression level for each gene in each sample type over the mean for the sample type. The estimated slope and mean square residual (deviation from regression prediction) provide estimates of the degree to which the gene is sensitive to general expression-promoting conditions (slope) and whose expression continues to be difficult to predict (residual).
6) Assuming that both constancy over sample types (low slope) and high predictability (low coefficient of variation) are desirable, a stability index can be created as their product (or via other mathematical means), and the gene with the lowest value chosen.
7) Alternatively, visually inspect regression and residual plots to select genes that would be most suitable as controls for specific sets of experimental conditions.
We thank Olga Shevchenko for isolating the RNAs, Dr. Brian Stanton and Greenwood Resources (formerly Fort James Corporation) for permission to study their plantation trees, and Ove Nilsson and Frances Martin for providing poplar EST sequences prior to public release. I.A.Y. was supported by a Fullbright Fellowship. Partial funding for this work came from the Tree Biosafety and Genomics Research Cooperative at Oregon State University and the USDA NRI Competitive Grants Research Program (No. 2002-35301-12173).
This article is published under license to BioMed Central Ltd. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.