The detection of differentially expressed genes has contributed to our understanding of how developmental processes are carried out in biological systems such as the tomato plant. In the field of gene-expression analysis, real-time reverse transcription PCR (RT-PCR) has become the method of choice for accurate expression profiling of selected genes [1–4]. Correct sample normalization is an absolute prerequisite for reliable and accurate measurement of gene expression, especially when studying the biological relevance of small differences or when handling samples from different organs or tissues. The current gold standard for controlling inter-sample variations, both in the amount and quality of cDNA inputs, is the use of suitable genes as endogenous controls. However, since there are no universal control genes, a set of potential references must first be validated in each particular experimental background. Recently, an exhaustive analysis of microarray expression profiles in Arabidopsis allowed the identification of hundreds of potential reference genes, which show exceptional expression stability throughout development and under a wide range of environmental conditions. Despite the relevance of Arabidopsis as a model organism, certain biological processes are not tractable in it, such as the ripening of fleshy fruits, which has received considerable attention in tomato. In addition, the conclusions derived from studies in Arabidopsis cannot be directly extrapolated to every vascular plant species. For example, the UBQ10 gene shows highly stable expression in Arabidopsis, whereas it seems unsuitable for normalization in different tissues at different developmental stages in rice and soybean [24, 25]. This emphasizes the importance of preliminary evaluation studies aimed at identifying the most stable housekeeping genes in different organisms.
Taking the above-mentioned arguments into account, we carried out a systematic study of the expression stability of 11 housekeeping genes in Solanum lycopersicum, across a series of 27 samples from different tissues/organs at different developmental stages.
In an effort to minimize bias introduced by the validation approach, three different, yet complementary, statistical strategies were used to select the best internal controls for normalization of gene expression studies in tomato. The pairwise comparison strategy, accessible through the geNorm software, is a very popular option for verifying the expression stability of candidate genes. It relies on the principle that variation in the expression ratio of two housekeeping genes indicates that at least one of the two genes is not constantly expressed. Its main advantage is that expression ratios allow fine control of variations in the amount of cDNA input, because these fluctuations, which are associated with technical variability, affect both paired genes equally. It has been argued that the major weakness of the pairwise comparison approach is its sensitivity to co-regulation, that is, it apparently tends to select those genes with the highest degree of similarity in their expression profiles. However, it should be noted that the stability measure provided by geNorm (M) is the mean pairwise variation between a gene and all other tested candidates, and thus a pair of highly co-regulated genes could soon be eliminated during the selection process if they show high inter-sample variability. In addition, the advantage of two co-regulated genes is inversely proportional to the number of candidate genes being validated. An obvious prediction about the behaviour of two co-regulated genes in the pairwise variation approach is that they will be scored with similar M values. Indeed, there are numerous examples in the literature of genes belonging to the same functional class (typically different subunits of the same multiprotein complex) that are not top-ranked by the geNorm software, but which occupy close positions in the ranking.
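The behaviour of co-regulated genes under the pairwise approach can be illustrated with a minimal sketch. It assumes that the pairwise variation of two genes is the standard deviation of their log2-transformed expression ratios across samples and that M is the mean of these values against all other candidates, as described above. The function name `genorm_m`, the gene labels and the input quantities are hypothetical and are not taken from the geNorm software or from our data.

```python
import math
import statistics


def genorm_m(rel_quantities):
    """Return an M-like stability value per gene (lower = more stable).

    rel_quantities: dict mapping gene name -> list of relative
    quantities, one per sample, in the same sample order for all genes.
    """
    genes = list(rel_quantities)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in every sample
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(rel_quantities[j], rel_quantities[k])
            ]
            # pairwise variation: spread of the ratio across samples
            pairwise_sd.append(statistics.stdev(log_ratios))
        # M: mean pairwise variation against all other candidates
        m_values[j] = statistics.mean(pairwise_sd)
    return m_values


# Hypothetical quantities: gene_a and gene_b are perfectly co-regulated
# (constant ratio) but vary strongly; gene_c is constant across samples.
example = {
    "gene_a": [1.0, 2.0, 4.0, 8.0],
    "gene_b": [2.0, 4.0, 8.0, 16.0],
    "gene_c": [1.0, 1.0, 1.0, 1.0],
}
m_values = genorm_m(example)
for gene, m in sorted(m_values.items(), key=lambda item: item[1]):
    print(gene, round(m, 3))
```

With only three candidates, the two co-regulated genes receive identical M values and outrank the truly constant gene. This shows both the predicted similarity of their scores and why the advantage of a co-regulated pair shrinks as more candidates are included.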
Be that as it may, and since it is very difficult to foresee common expression patterns, the threshold cycle data were analyzed with two other statistical strategies that are less sensitive to co-regulation of the candidate genes. On the one hand, the "model-based approach" implemented in the NormFinder software examines variation within and between sample groups, which must be defined by the user. On the other hand, the overall expression variation of each candidate gene was measured as the coefficient of variation (CV) of the normalized relative quantities (NRQ). The NormFinder approach stands out because it balances two sources of variation, but it does not account for systematic errors during sample preparation. The CV strategy overcomes this drawback by handling normalized quantities, and may be a good alternative when the sample set cannot be appropriately subdivided. Although other valid statistical strategies have been successfully applied to control gene selection, the above-mentioned approaches are usually preferred because they are supported by user-friendly software.
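As an illustration of the CV strategy, the following minimal sketch computes the CV of the NRQs of one candidate gene. It rests on an assumption that is not stated in the text: each sample is normalized by the geometric mean of the relative quantities of the other candidates. The function names and the input values are hypothetical.

```python
import math
import statistics


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def cv_of_nrq(rel_quantities, gene):
    """CV (%) of the normalized relative quantities of `gene`.

    Assumption: the per-sample normalization factor is the geometric
    mean of all other candidate genes in rel_quantities.
    """
    others = [g for g in rel_quantities if g != gene]
    nrq = []
    for i, quantity in enumerate(rel_quantities[gene]):
        norm_factor = geometric_mean([rel_quantities[g][i] for g in others])
        nrq.append(quantity / norm_factor)
    return 100.0 * statistics.stdev(nrq) / statistics.mean(nrq)


# Hypothetical data set 1: the genes differ only by a sample-wide
# loading factor (1x, 2x, 4x), as produced by unequal cDNA input.
loading_only = {
    "gene_x": [1.0, 2.0, 4.0],
    "gene_y": [2.0, 4.0, 8.0],
    "gene_z": [3.0, 6.0, 12.0],
}
cv_stable = cv_of_nrq(loading_only, "gene_x")

# Hypothetical data set 2: gene_u does not follow the loading factor.
with_unstable = {
    "gene_x": [1.0, 2.0, 4.0],
    "gene_y": [2.0, 4.0, 8.0],
    "gene_u": [1.0, 8.0, 1.0],
}
cv_unstable = cv_of_nrq(with_unstable, "gene_u")
print(cv_stable, cv_unstable)
```

Because the quantities are normalized before the CV is calculated, differences in the amount of input material cancel out in the first data set, whereas the gene that departs from the common trend in the second data set retains a large CV.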
Since the 3 statistical approaches complement one another, their outcomes were equally weighted and combined into a consensus ranking. As the main result of this analysis, based on real-time PCR data, we propose a tool-kit of control genes suitable for normalization of gene expression measurements in a wide variety of tomato samples. This tool-kit is composed of 4 housekeeping genes (CAC, TIP41, Expressed and SAND), which are recommended in different combinations depending on the sample origin (tables 3 and 4). Our analysis suggests that studies involving different tomato organs require at least 3 control genes for reliable and accurate normalization, while two control genes are sufficient for experiments within particular organs. The method of calculating a sample-specific normalization factor as the geometric mean of multiple carefully selected housekeeping genes is currently the gold standard [3, 12]. This approach has been adopted by many researchers and has been empirically and statistically validated [26, 35–37]. Although a minimum of three control genes has been proposed for the correct normalization of RT-PCR data, the optimal number of control genes should arise from a balance between economic considerations and accuracy, keeping in mind that normalization with multiple genes is less error-prone than single-gene normalization [26, 35–37].
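The equally weighted combination of the three rankings can be sketched as follows. This is only one plausible reading of "equally weighted": it assumes that the consensus is obtained by averaging the rank position of each gene across the methods. The function name, the method labels and the gene labels are hypothetical placeholders and do not reproduce the rankings reported in tables 3 and 4.

```python
def consensus_ranking(rankings):
    """Combine several rankings into one by mean rank position.

    rankings: dict mapping method name -> list of genes ordered from
    most stable to least stable. Every method must rank the same genes.
    Assumption: all methods carry equal weight.
    """
    rank_sum = {}
    for ordered_genes in rankings.values():
        for position, gene in enumerate(ordered_genes, start=1):
            rank_sum[gene] = rank_sum.get(gene, 0) + position
    n_methods = len(rankings)
    mean_rank = {gene: total / n_methods for gene, total in rank_sum.items()}
    # Sort by mean rank; ties are broken alphabetically for determinism
    return sorted(mean_rank, key=lambda gene: (mean_rank[gene], gene))


# Hypothetical rankings from three methods (placeholder gene labels)
example_rankings = {
    "pairwise": ["gene_a", "gene_b", "gene_c"],
    "model_based": ["gene_b", "gene_a", "gene_c"],
    "cv": ["gene_a", "gene_c", "gene_b"],
}
consensus = consensus_ranking(example_rankings)
print(consensus)
```

Averaging rank positions rather than raw stability values avoids having to rescale M values, NormFinder stability values and CVs onto a common scale, at the cost of discarding the size of the differences between consecutive genes.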
Among the housekeeping genes evaluated in the present study, the DNAJ, GAPDH and TUA genes have been previously described as "candidate controls" in tomato plants. These genes were selected after the expression analysis of 127 transcripts in 27 expressed sequence tag libraries, but none of them was described as a suitable control gene for all tissues. Our results, based on data obtained with a more accurate and precise technique, lead to the conclusion that the DNAJ gene may be useful for normalization in inflorescence samples (table 4) and, to a lesser extent, in leaves, fruits or a leaf/inflorescence developmental series (additional file 1). This is in accordance with the results of Cocker and Davis. However, we suggest that GAPDH and, especially, TUA should be avoided as control genes because their expression stability is far from acceptable. For instance, the NRQs of the TUA gene showed CVs higher than 180% in leaf and fruit samples. As another contribution of the present report, our results indicate that reliable normalization of the whole tomato developmental series is possible with the CAC, TIP41 and Expressed genes. Finally, the results reported herein are in good agreement with those described in Arabidopsis by Czechowski et al., which were guided by microarray expression data. In fact, the 4 control genes that we recommend for normalization in tomato are among the 5 top-ranked genes in Arabidopsis, although with different relative positions in the respective rankings. These novel control genes, as in Arabidopsis, are superior to traditional ones in terms of expression stability.