Skip to main content

Variation in morpho-physiological and metabolic responses to low nitrogen stress across the sorghum association panel



Access to biologically available nitrogen is a key constraint on plant growth in both natural and agricultural settings. Variation in tolerance to nitrogen deficit stress and productivity in nitrogen limited conditions exists both within and between plant species. However, our understanding of changes in different phenotypes under long term low nitrogen stress and their impact on important agronomic traits, such as yield, is still limited.


Here we quantified variation in the metabolic, physiological, and morphological responses of a sorghum association panel assembled to represent global genetic diversity to long term, nitrogen deficit stress and the relationship of these responses to grain yield under both conditions. Grain yield exhibits substantial genotype by environment interaction while many other morphological and physiological traits exhibited consistent responses to nitrogen stress across the population. Large scale nontargeted metabolic profiling for a subset of lines in both conditions identified a range of metabolic responses to long term nitrogen deficit stress. Several metabolites were associated with yield under high and low nitrogen conditions.


Our results highlight that grain yield in sorghum, unlike many morpho-physiological traits, exhibits substantial variability of genotype specific responses to long term low severity nitrogen deficit stress. Metabolic response to long term nitrogen stress shown higher proportion of variability explained by genotype specific responses than did morpho-pysiological traits and several metabolites were correlated with yield. This suggest, that it might be possible to build predictive models using metabolite abundance to estimate which sorghum genotypes will exhibit greater or lesser decreases in yield in response to nitrogen deficit, however further research needs to be done to evaluate such model.

Peer Review reports


Malthus predicted that exponential population growth would always surpass linear increases in food production resulting in constant famine [1]. Both dramatic increases in total agricultural land and technological innovations have staved off Malthusian catastrophy in the 20th and early 21st century. One of the key technological innovations was invention and widespread adoption of the Haber-Bosch process, which reduces atmospheric nitrogen gas (N2) to reactive forms of N, to provide an abundance and reliable source of nitrogen fertilizer for agriculture [2]. The widespread adoption of synthetic nitrogen fertilizers has dramatically increased crop yields but these increases have not come without some negative externalities, including increased greenhouse gas emissions and decreases in rural water quality [3]. In addition, for many non-irrigated agricultural systems the cost of fertilizer is either the single biggest variable input cost of production, or the second biggest after the cost of seed [4]. As the human population continues to grow and populations around the world shift to more calorie intensive diets, incentives and pressure on agricultural productivity will increase as well [5, 6]. It has been estimated that only 30-40% of nitrogen fertilizer is taken up and utilized by crops [7]. Increasing the nitrogen use efficiency of major agricultural crops would enable farmers to meet these growing requirements for food production with stable or decreasing applications of nitrogen fertilizer, increasing farmer profitability, and decrease the environmental and energy footprint of agriculture [8].

Substantial genetic variation in nitrogen use efficiency exists within crop plants [9, 10]. Between 1969 and 2010 European wheat breeders increased the nitrogen use efficiency of wheat by an estimated one third of one percent per year [11]. The global impact of a 1% increase in nitrogen use efficiency is estimated to be $1 billion dollars per year [12]. Understanding the genes controlling variation in nitrogen use efficiency and the other phenotypes associated with these differences would aid in both evaluating the feasibility of increasing nitrogen use efficiency in different crops – while sustaining the high yields necessary to meet global demand for food – and, where feasible, designing breeding strategies to achieve such an increase. However, nitrogen use efficiency is a complex trait and multiple morpho-physiological and metabolic mechanisms likely play roles in determining how well or poorly a given plant genotype can compensate for limited N availability in different environments and at different life stages.

Understanding the morpho-physiological and metabolic mechanisms associated with differences in tolerance for nitrogen deficit stress in agriculturally relevant environments represents a stepping stone to the subsequent identification of genetic loci associated with response to low nitrogen stress and finally to crop improvement via breeding or engineering. To date the majority of research on the morpho-physiological and metabolic responses of plants to nitrogen deficit stress has been conducted in controlled environment conditions, particularly emphasizing severe stress applied early in development [13,14,15]. The nitrogen deficit stress experienced by crops in agricultural settings is typically less extreme and may not produce obvious visual effects, but is sufficient to result in substantial grain or biomass decrease over the course of a growing season. Collecting phenotypic and metabolic data from large sets of genotypes experiencing agriculturally relevant degrees of stress under field conditions can provide substantial insight into natural variations in stress response and tolerance within individual crop species [16].

Here we quantified crop yield and eight morpho-physiological traits from a large and diverse sorghum population (Sorghum bicolor L.) grown to maturity in field conditions under both nitrogen limiting and non nitrogen limiting conditions. For a subset of 24 replicated genotypes, large scale metabolic profiling was conducted from leaf tissue collected at the flowering stage. Significant plasticity and genotype x environment interactions were observed for both yield and a subset of metabolic traits, while substantially less genotype x environment interaction was observed for morpho-physiological traits. The abundance of several metabolites at flowering exhibited significant correlations with plant performance (e.g. yield) at maturity.


Genetic variability of sorghum’s response to differential nitrogen application

A population of 347 sorghum genotypes drawn from the Sorghum Association Panel (SAP) [17] were grown under two nitrogen treatments with replication in Lincoln, Nebraska: low nitrogen (LN; no supplemental nitrogen) and high nitrogen (HN; 90 kg/ha, following local agronomic recommendations to avoid nitrogen limitations on yield in sorghum). A mixture of manually scored – leaf number, flag leaf length, flag leaf width, plant height, days to flowering– and phenotypes estimated from hyperspectral reflectance data – specific leaf area (SLA), chlorophyll content (CHL) and nitrogen (N), phosphorus (P) and potassium (K) content following previous workflow [18] – were collected from plants grown under both conditions (Table S1).The overall hyperspectral reflectance profile of sorghum leaves collected from plants grown in HN and LN treatments was similar (Fig. S1a), and neither of the first two principle components clearly separated the two treatments (Fig. S1b). Ground truth data were obtained for five traits: CHL, SLA, N, P, and K content from 265 samples, and partial least squares regression (PLSR) were used to predict values for scored six traits for whole panel based on hyperspectral data. Raw spectral data used for PLSR model building, as well as prediction traits were provided in Table S2. Employing five-fold cross validation with ground truth samples the accuracy (R2) of phenotypes estimated from hyperspectral reflectance data varied from 0.18 for P to 0.82 for CHL (Table S3). Similar performance was observed in the validation set (\(n=80\), Table S3 and Fig. S2) indicating models were not overfit. Prediction accuracy for K and P were low (R2 in validation set 0.22 and 0.25; Table S3), and these traits were excluded from downstream analyses.

The effect of N treatment was statistically significant for all traits evaluated, except for plant height (p < 0.05; likelihood ratio test (LRT); Fig. 1). Flag leaf width and flag leaf length were reduced by approximately 3.5% and 6.5% respectively under LN treatment. Plants grown under LN took 4% more time to flower. Larger differences were observed in chlorophyll and nitrogen content, with reductions of 15.3% and 13.8% respectively under LN treatment. However, the single largest impact of low nitrogen stress was observed on grain product, with a 48% reduction in grain yield under LN treatment.

Fig. 1
figure 1

Phenotypic difference of morpho-physiological traits across two treatment conditions. Statistical significance of N treatment were determined by likelihood ratio test (LRT) on mix model with treatment denote as fix effect and genotype as random. Asterisks indicate p-value < 0.05. Red dots indicated values for genotypes selected for metabolomics analysis. HN - high nitrogen, LN - low nitrogen. a - i Comparison of distribution of nine traits under HN and LN conditions

While overall population level responses to nitrogen deficit treatment were statistically robust, individual genotypes often exhibited different degrees of response to nitrogen treatment. A mixed model, considering genotype, treatment, and the interaction between genotype and treatment (genotype-by-environment, GxE) effects, was fit to each individual phenotypic dataset. A majority of the total variation in plant height and flowering time was explained by differences between genotypes (\(\sim\)91% and \(\sim\)85% respectively together on HN and LN; Fig. 2a). In the case of plant height none of variance was attributed to treatment or genotype by environment interaction. For flowering time, only \(\sim\)4% of variance were explain by treatment effect and \(\sim\)3% by GxE. The high degree of genetic control and low GxE effect is reflected in the high degree genetic correlation across treatment conditions for these traits: 0.86 for plant height and 0.8 for flowering time (Fig. S3). Variance in traits related to leaf (leaf number, leaf width, and leaf length) were also mostly explained by genetic factors (> 60% for each of these three traits). However, proportion of variance not explained by any of the factors in the model (e.g. the residual) was substantially greater for each of the three leaf related traits compared to plant height and flowering time, leading to lower correlations across treatment (\(\sim\)0.6; Fig. S3). Traits estimated from hyperspectral data (CHL, N, SLA) were comparatively much more plastic across environments (Fig. 2a), but only modest amounts of variance was attributed to GxE for each of these traits. One explanation for this, is that although our PLSR models were accurate (R2 > 0.6), it might be still not sufficient to precisely capture GxE.

Fig. 2
figure 2

Components of traits variation. a shows the proportion of variance attributed to each component for each trait. b shows the magnitude of this variance relative to each trait’s mean, using the coefficient of variation (CV; the estimated variance divided by the squared mean of the respective trait). HN - high nitrogen, LN - low nitrogen

Extensive plasticity of grain yield in response to nitrogen deficit stress was observed across the study population. Among analyzed traits, grain yield exhibited by far the largest proportion of variance attributable to GxE (Fig. 2a), resulting in only moderate genetic correlation between treatment (0.42, Fig. S2). Genotypes with high grain yield in the HN treatment tended to be somewhat more sensitive to low nitrogen stress than genotypes with low grain yield, even under the HN treatment (Fig. S4). However, the correlation between the responses of grain yield to nitrogen deficit stress and grain yield under HN was modest (\(\sim\)-0.3; Fig. S4). This reflects the relatively larger GxE effect of nitrogen treatment on yield. Although highly yielding genotypes on HN are more sensitive to LN stress, this reaction is not consistent and yield of some genotypes are less affected by LN stress.

Coefficients of variation were calculated for each variance component for each trait, following the approach described in [19]. Plant height exhibited the largest relative variation, particularly variation attributed to genetic factors (Fig. 2b), likely reflecting the effects of multiple large effect dwarfing genes segregating for functionally distinct alleles among the lines of the sorghum association panel [20]. The second largest relative variance was observed for grain yield, in particular for genetic factor under HN. However, relative variance from treatment conditions and GxE were also large, and in fact larger than the variance for any component among the remaining traits.

Metabolomic changes in sorghum leaves under long-term low nitrogen stress

As morpho-physiological traits scored in this study did not appear to explain the plasticity of sorghum grain yield across different nitrogen availability treatments, we next sought to characterize the responses of a large suite of metabolic phenotypes to differential nitrogen availability in the adult leaves across a subset of sorghum genotypes of the SAP. A set of 24 genotypes were selected to represent the phenotypic and genetic diversity of the SAP (Fig. 1, S5; [21]). Sampling was timed to coincide with anthesis, with a total of 96 leaf samples collected from two independent plots per genotype per treatment. Each sample was quantified via liquid chromatography - high-resolution mass spectrometry (LC-HRMS) analysis. In order to maximize the number of metabolites detected and quantified each sample was analyzed using both RP (reverse phase) and HILIC (hydrophilic interaction liquid chromatography) separations in both positive and negative ion mode, resulting in the detection and quantification of 115,782 mass spectral features. After filtering out features that were detected in less than 80% of samples, and further manual quality control (as described in the methods section), the number of features was reduced to 3,496, of which 145 could be assigned high confidence annotations (Table S4).

No obvious differences were observed in the distribution of estimated abundance values for high confidence metabolites (n = 145) between HN and LN conditions (Fig. 3a). Samples collected from plants grown in HN or LN were not clearly separated by either of the first two principal components of variation for the abundance of this set of high confidence annotated metabolites although samples collected from plants grown in LN exhibited a tighter distribution of PC1 values than did samples collected from plants grown in HN (Fig. 3b). After correcting for multiple testing via false discovery rate (FDR, [22]), the abundance of 62 metabolites changed significantly between samples collected from plants grown in HN or LN (FDR < 0.05). Thirty-four metabolites were more abundant in samples collected from plants grown under HN and 28 more abundant in samples collected from plants grown under LN (Fig. 3c). The vast majority of statistically significant changes in metabolite abundance were modest in size, defined as a less than two fold change in abundance between treatments. The majority of observed free amino acids (17/33) were significantly more abundant in samples collected from plants grown under HN but the amino acids acetylcarnitine and L-carnitine were significantly more abundant in samples collected from plants grown under LN (Fig. 3d, Table S4). In contrast, half of the phenolic compounds confidently identified in this dataset, such as the eight flavonoids, were significantly more abundant in samples collected from plants grown under LN (Table S4).

Fig. 3
figure 3

Metabolomics profiling in 24 sorghum genotypes across two nitrogen conditions based on 145 confidently annotated metabolites. a Distribution of the 145 confidently annotated metabolites across two treatment conditions. b First two principle component (PC) from PCA. Values in bracket indicate amount of variance explained by each component. c Volcano plot showing the down regulated (yellow), up regulated (green), and unhanded (grey) metabolites under low nitrogen (LN) conditions compare to high nitrogen (HN). d Proportions of the metabolites with know structures more abundant in samples collected from plants grown under HN (green), more abundant in samples collected from plants grown under LN (yellow), and unchanged (gray)

Similar results to those observed with the set of annotated metabolites were observed when analysed all 3,496 identified mass features. Overall abundance of those compounds was similar across treatment (Fig. S6a-b). Although 337 compounds were significantly different across two treatment condition (FDR < 0.05; Fig. S6c), those changes were rather small, with only 28 compounds being changed larger than two fold between treatments. Finally, PCA based on 3,496 mass features did not separate two treatment conditions (Fig. S6d). Interestingly, many of these unidentified metabolites showed relatively high heritability, with mean value 0.6 under HN and 0.68 under LN (Fig. S7). This suggests that natural variation in the contents of these compounds is genetically controlled, which makes a good prospect for furthering their identification and uncovering their biological meaning through genetic studies. In case of known metabolites, variation in the abundance of individual flavonoid and flavonoid glycosides compounds tended to be the most heritable across independent field plots of the same genotype grown in the same environment (Fig. S8).

A similar variance partitioning strategy to that employed for morpho-physiological traits was used to partition variance for each annotated metabolite. For each metabolite a mixed model was fit, including terms for genotypes (genetic effects), differences between N treatments (environmental effects), and genetic differences in the degree of response to N supply (genotype-by-environment, GxE). Likely as a result of the much smaller overall number of datapoints for each metabolic trait relative to each morpho-physiological traits, this model could only be successfully fit for 46 of 145 metabolites. Differences between genotypes typically explained around half of the variance for different metabolites (\(\sim\)28% on HN and \(\sim\)31% on LN), while the variance explained by environmental factor was much lower \(\sim\)2% (Fig. 4). The GxE effect explained on average of \(\sim\)7% of variance across the 46 metabolites where a mixed model was successfully fit. Despite the fact, that this value was not very high, it was higher than the average variance explained by the GxE effect for morpho-physiological traits (\(\sim\)1%). The coefficient of variation for each variable for each metabolite vary, but no clear pattern can be observed across different classes of metabolites (Fig. S9). A wide range of different patterns are exhibited by individual metabolites in response to LN stress across different genotypes. Glucose and sucrose both belong to the set of 83 metabolites which did not show any statistically significant differences in abundance between samples collected from plants grown in high nitrogen and plants grown in low nitrogen but which do exhibit consistent patterns of difference in abundance between genotypes across treatments (Fig. 5a-b). Serine, one the amino acids with a statistically significant difference in abundance between samples collected from plants grown in HN and LN exhibits a consistent decreased in abundance across genotypes with \(\sim\)15% of variance explained by the environmental factor (Fig. 5c). Glutamic acid and allantonin both exhibited large GxE effects of \(\sim\)15% and \(\sim\)30% variance for these two metabolites explained by GxE respectively (Fig. 5d-e). Genotypes with comparatively high glutamic acid content in HN saw larger reductions in glutamic acid content in LN. Genotypes with comparatively lower glutamic acid content in HN saw smaller reductions in LN. Previous study found a decrease in salicylic acid content under low nitrogen stress in sorghum root [23]. Here we found increase in salicylic acid content of sorghum leaves under LN (Fig. 5f). This response is consistent across majority of genotypes, although strength of this reaction slightly vary, with genotypes with low salicylic acid content under HN indicating a higher increase under LN.

Fig. 4
figure 4

Proportion of variance attributed to each component for each metabolite. HN - high nitrogen, LN - low nitrogen

Fig. 5
figure 5

Examples of unchanged a-b, changed but non-plastic c and plastic d-f metabolites. Each dot indicate genotypic mean and lines connect the same genotype across two treatment conditions. HN - high nitrogen, LN - low nitrogen

Correlation between metabolites and yield

The abundance of metabolites was correlated to some degree with observed grain yield values from the same plots (Fig. 6; Table S5). The correlation coefficients between metabolite abundance and grain yield in individual environments (HN or LN) were positively correlated with each other (r = 0.36, p < 0.05; Fig. 6). However, in only a modest number of cases where the correlations between the abundance of individual metabolites and grain yield statistically significant including six metabolites in HN and eleven in LN. Five metabolites were statistically significantly correlated with grain yield in both environments: 4-hydroxymandelonitrile, aconitic acid, ascorbic acid, benzamide and glucose. The strength of the correlations between grain yield and metabolite abundance where relatively modest (r < 0.6) even for those metabolites where statistically significant relationships were observed.

Fig. 6
figure 6

Scatter plot of correlation values of each metabolite and yield in given conditions. Marked metabolites indicate significantly correlated metabolites from Pearson analysis (\(p < 0.05\)). HN - high nitrogen, LN - low nitrogen

Two machine learning approaches, elastic-net regression (GLMNET) and random forest (RF), were evaluated for their potential to predict variation in plot level grain yield from combined metabolite abundance data. Three sets of input data were evaluated with each of the two machine learning approaches. First, the set of all detected metabolites (\(n=3,496\)). Second, the set of 145 metabolites with confident annotations. Finally, a set of 145 metabolites selected randomly from the complete set of 3,496 detected metabolites. Both algorithms achieved moderate prediction accuracy however, the accuracy of their predictions was either equivalent to or only modestly exceeded, the prediction accuracy of a simple linear model fit to only the treatment effect, which was able to predict 29% of the total variance in sorghum grain yield data (Fig. S10). Ascorbic acid showed the greatest contribution to the accuracy of the GLMNET model (Fig. S10b) and the third largest contribution with RF (Fig. S10c) in permutation based estimates of feature importance, consistent with the significant correlation between the abundance of this metabolite and grain yield in both conditions (Fig. 6).


Natural variation in tolerance to nitrogen deficient growing conditions has been widely studied in both crops and other plant species. However, the majority of these studies occurred in controlled environments and imposed substantial nutrient deficits that produced visible phenotypic responses even at seedling stages. Here we examined natural variation in both the morpho-physiological and metabolomic impact of long term low intensity nitrogen deficit at a level sufficient to alter grain yield and fitness in sorghum but which does not produce obvious visible stress symptoms.

Grain yield is a complex phenotype that is determined by a number of different component phenotypes (e.g. yield component traits). In Arabidopsis branching number is correlated with yield and previous study found large plasticity of this trait in response to low nitrogen stress [19]. In case of rice, various traits such as tiller number, grain number per penile, or 1,000 - grain weight are associated with yield. Interestingly, only tiller number were affected by low nitrogen stress [10]. Finally, in case of maize 1,000 - kernel weight were also not affected by low nitrogen, but substantial decreases in kernel number per cob were observed [13]. This observation highlights the complexity of how plant yield can be affected by low nitrogen stress. In this study grain weight per panicle was used to represent sorghum yield, and consistently with research done on Arabidopsis [19], large plasticity in response to low nitrogen stress was observed in this trait.

While grain yield decreased substantially under nitrogen limited conditions for the vast majority of sorghum genotypes, rank order grain yield under high nitrogen conditions was only modestly correlated with rank order grain yield under low nitrogen conditions (Spearman correlation = 0.44; Fig. S3). This suggests efforts to increase grain yield under nitrogen limited conditions will require separate field trials, evaluations and selections from breeding efforts to increase grain yield under non-nitrogen limited conditions. Yield under non-nitrogen limited conditions was negatively correlated with the size of the decrease in yield observed when nitrogen was limited (\(\sim\)-0.3; Fig. S4). However, because of large GxE effect, this reduction is not consistent across highly yielding genotypes. Some of the reductions are characterized by relatively low loss in yield under low nitrogen conditions. This indicates that it should be possible to produce varieties not only with high yield under high nitrogen condition but also more robust to low nitrogen stress. In contrast to grain yield, the morpho-physiological traits did not exhibit significant degrees of change in response to the degree of nitrogen limitation applied in this study, and of the traits which did exhibit significant effects – such as leaf nitrogen content and specific leaf area – the effects of treatment and genotype were largely independent of each other (Fig. 2). While changes in chlorophyll concentration were quantifiable using both handheld chlorophyll concentration meter and hyperspectral reflectance data, plants in the nitrogen limited field were not visibly chlorotic (personal observation). Metabolite abundance was characterized for a subset of sorghum genotypes in both conditions in an attempt to identify other phenotypes with potential value to predict how the grain yield of different sorghum varieties will respond to nitrogen limitation. The overall pattern of metabolite abundances did not exhibit substantial differences between nitrogen limited and non-nitrogen limited conditions (Fig. 3a-c). This is consistent with both the limited degree of change observed for morpho-physiological traits and the goal of imposing a degree of nitrogen limitation sufficient to alter fitness/grain yield but not so severe that it dramatically altered plant growth.

While overall differences in metabolite abundance between conditions were modest, the metabolites that did exhibit significant differences between treatments were consistent with expectations for nitrogen limited grown plants. Decreases in the abundance of many amino acids were observed (Fig. 3d; Table S4). Consistent with reports from studies of nitrogen deficit experiments in seedlings and adult maize leaves [13], sorghum roots [23], and maize, sorghum, and Paspalum vaginatum seedlings [24]. Disturbance in serine metabolism was previously found to play key role in limiting maize yield under low nitrogen conditions [9]. In addition, serine plays an important role in photorespiration [25], although in plants utilizing the C4 photosynthetic pathway, including both maize and sorghum this pathway is much less active than in plants utilizing the C3 photosynthetic pathway. We did not observe significant variation in the degree of decreased serine abundance observed among sorghum genotypes (Fig. 5c). Genetically controlled diversity for this trait may still be discovered in profiling of a larger panel of diverse sorghum lines under nitrogen limited and non nitrogen limited conditions. If no such diversity is found it may prove impossible to reduce this response to nitrogen constrained growth via conventional breeding and selection strategies. In contrast, while the abundance of glutamic acid also declined in nitrogen limited conditions, the degree of decline varied significantly among sorghum genotypes (Fig. 5d). In previous field studies of maize, the abundance of glutamic acid was negatively correlated with yield under heat and water stress but not under control conditions [16]. We observed a similar negative correlation between glutamic acid abundance and yield under both control and nitrogen limited conditions (Fig. 6). One potential explanation for this relationship might be, that genotypes with higher yield remobilise more nitrogen resources, including those coming from glutamic acid to fill kernels, and therefore they have lower level of glutamic acid. This result highlight potential importance of glutamic acid metabolism on yield in C4 crops under stress conditions. Genes involved in glutamic acid metabolism were enriched among those exhibiting differential mRNA expression between older maize inbreds (pre-1960s) and maize inbreds developed and selected by breeders in the modern era [26]. These observations suggest that glutamic acid metabolism may already have been an indirect target of selection during crop improvement in maize. If so, the data presented here suggest that glutamic acid may also represent an interesting metabolic marker when selecting for better performing sorghum genotypes although further validation is certainly needed.

Overall, our results highlight that grain yield in sorghum, unlike many morpho-physiological traits, exhibits substantial variability of genotype specific responses to long term low severity nitrogen deficit stress. Differences in the eight morpho-physiological traits scored in this study explained only \(\sim\)9% of variance in yield. Metabolic responses to long term low severity nitrogen deficit stress exhibited a higher proportion of variability explained by genotype specific responses than did morpho-pysiological traits and a number of individual metabolites were associated with yield variation under one or both nitrogen treatments. It may be possible to build predictive models using metabolite abundance to estimate which sorghum genotypes will exhibit greater or lesser decreases in yield in response to nitrogen deficit, however data from a larger number of genotypes grown across multiple sites will be necessary to train and evaluate such models. Large scale metabolic profiling will likely require targeted metabolomics using feature selection approaches to identify an informative subset of the metabolites profiled in this study.

Materials and methods

Field experiment, and phenotypic data and tissue collection

A replicated field trial was planted at the University of Nebraska-Lincoln’s Havelock Farm Location (N 40.861, W 96.598) on June 08, 2020. The experiment was laid out in a RBCD design, initially with two blocks each under high nitrogen (80 lbs/acre) and low nitrogen (no supplemental nitrogen) treatment conditions and 416 plots per block, including 347 genotypes from the sorghum association panel [17], and BTx623 as a repeated check. Each plot consisted of a single 2.3 meter row of plants from a single genotype, with 0.76 meter spacing between parallel and sequential rows. Before supplementing soil with nitrogen, soil sample from each block were taken and sent to a commercial lab for nitrate content analysis (Ward Laboratories, Inc., Kearney, NE). The nitrate content in blocks which was later supplement with nitrogen (HN) were 19.5, and 19.9 ppm respectively. In blocks without fertilizer (LN) the nitrate content were 13.7 and 9.45 ppm.

A mixture of hand measured traits and traits predicted from hyperspectral data (see below) were employed to assess the response of sorghum to nitrogen deficit stress. The date of flowering for each plot was scored when 50% of surviving plants reached anthesis. Plant height was measured after flowering and was defined as the distance from the soil surface to the collar of the flag leaf. The number of panicles produced by each plot (panicles per plot) were counted by direct observation. For each plot, one to three panicles were hand harvested, dried, threshed, and the resulting grain weighed. The weight of grain per harvested panicle was multiplied by the count of panicles per plot to estimate grain yield per plot. The difference in number of collected panicles primary come from difference in germination rate of different genotypes. A small number of sorghum genotypes had extremely poor germination rate and, as a number of panicles from this field project were reserved for another experiment that was not consistent with collecting total grain mass per panicle, it was not always possible to collect three panicles per plot.

Between 5 and 12 August 2020, hyperspectral reflectance data was collected from the second leaf from top of the plant from single plant per block using a FieldSpec4 (Malvern Panalytical Ltd., formerly Analytical Spectral Devices), following the protocol outlined in [18]. A set of 265 leaf samples (130 from HN and 135 from LN) were selected for ground truth measurements. Leaf chlorophyll concentration (CHL) was measured with a handheld chlorophyll concentration meter (MC-100, Apogee Instruments, Inc., Logan, UT), and leaf area (LA) was measured with a leaf area meter (LI-3100, LI-COR Biosciences, Lincoln, NE). Next, samples were placed in a oven set to 50\(^{\circ }\)C and dried over 72 h. Dry weight (DW) of the leaves was then recorded with digital balance. Specific Leaf Area (SLA, m2/kg) was calculated as LA/DW. Finally, dried plant leaves were sent to a commercial lab (Ward Laboratories, Inc., Kearney, NE) where the samples were ground, homogenized, and analyzed for analysis of nutrient content: nitrogen, potassium and phosphorus.

For 96 plots representing 24 genotypes replicated in two blocks each under sufficient and low nitrogen treatments, leaf tissue was collected for metabolomics analysis. For each plot, a single plant was selected, avoiding edge plants where possible. From this plant eight leaf punches of 0.33 cm2 in area were collected from the middle section of the leaf below the flag leaf (e.g the penultimate leaf) and immediately frozen in liquid nitrogen. Samples were collected between 9:00 AM and 1:00 PM on August 12 2020.

Modeling traits based on hyperspectral data

Five models were developed to predict chlorophyll, nitrogen, phosphorus, and potassium concentration as well as specific leaf area from hyperspectral reflectance data, following the approach described in [18]. Measured intensity values for each wavelength were zero centered and scaled to unit variance. Wavelengths below 450 nm and above 2400 nm were discarded. Predictive models were built separately for each trait using partial least squares regression implemented in the pls v.2.8.0 [27] and caret [28]. Prior to the modeling, data were split into training (\(n=185\)) and validation set (\(n=80\)). This was done to avoid the risk of misleadingly high prediction accuracy resulting from over fitting. Decisions regarding model tuning and performance evaluation were made based on root mean squared error (RMSE) of five-fold cross validation using training set (\(n=185\)). After final models were trained, their performance was evaluated using the validation set (\(n=80\)). Final models were applied to equivalently zero centered and scaled hyperspectral reflectance measurements collected from the remaining sorghum plots.

Untargeted metabolomics using LC-MS/MS

Samples were extracted using cold methanol:acetonitrile (50:50, v/v) spiked with 100 M of CUDA (12-[(cyclohexylcarbamoyl)amino]dodecanoic acid). The tissue samples were disrupted and homogenized by adding 2 stainless steel beads (SSB 32) using the TissueLyserII (Qiagen) at 20 Hz for 5 mins. After centrifugation at 16,000 g, the supernatants were collected and the same extraction was repeated on the pellet one more time. The supernatants were pooled and vacuum dried down using a SAVANT speed-vac. The pellets were re-dissolved in 100 \(\mu\)L of 30% methanol. Blank tubes were extracted alongside the samples to remove contaminant background from the data analysis. In addition, an aliquot of the samples was pooled to make a quality control (QC) sample which was run between every 10 samples in order to correct for batch effect. Two separate LC-MS/MS workflows running on a Thermo Vanquish LC system interfaced with a Thermo QE-HF mass spectrometer were used to profile the metabolites. For the hydrophobic compounds, a ACCQ-TAG ULTRA C18 column (1.7 \(\mu\)m , 2.1 mm × 100 mm, Waters) was used flowing at 0.3 mL/min at 40 \(^{\circ }\)C. The gradient of the mobile phases A (0.1% formic acid in water) and B (0.1% formic acid in acetonitrile) was as follow: 2% B for 2 min, to 50% B in 11 min, to 90% B in 2 min, hold at 90% B for 1 min, to 2% B in 0.5 min. The QE-HF was run in a data-dependent acquisition mode triggering on single charge peaks using a mass range of 67 to 1000 m/z at 60,000 resolution, with an AGC target of 3e6 and a maximum ion time of 100 ms for both positive and negative ion scans. The isolated ions were further fragmented by HCD using isolation window of 1.6 m/z and scanned at a resolution of 15,000. For the polar compounds, a XBridge Amide 3.5(4.6 x 100 mm, Waters) was used flowing at 0.4 mL/min at 45 \(^{\circ }\)C. The gradient of the mobile phases A (10 mM ammonium formate/0.125% formic acid in water) and B (10 mM ammonium formate/0.125 formic acid in 95% acetonitrile) was as follow: 100% B for 2 min, to 70% B in 5.7 min, to 40% B in 1.8 min, to 30% in 0.75 min, to 100% B in 2.5 min. The QE-HF was run in a data-dependent acquisition mode triggering on single charge peaks using a mass range of 60 to 900 m/z at 60,000 resolution, with an AGC target of 1e6 and a maximum ion time of 100 ms for both positive and negative ion scans. The isolated ions were further fragmented by HCD using isolation window of 1.6 m/z and scanned at a resolution of 15,000.

LC-MS/MS data analysis

Data from LC-MS/MS analysis were processed with MS-Dial software v4.70 for peak detection, deconvolution, alignment, quantification, normalization, and identification [29]. Background peaks detected in blank extracts were filtered out. Intensity drift was corrected using the local regression (LOESS) for QC batch normalization, and zero intensities were replaced by 10% of the minimum peak height. The identification was done using the curated mass spectral public libraries ( for MS/MS positive (290,915 entries, April 2021) and MS/MS negative (36,848 entries, April 2021). Metabolites missing in more than 80% of the total samples were removed. The remaining 3,496 metabolites from all four analytical conditions, that is from separation on HILIC and RP column with negative and positive ion mode, were manually checked for Gaussian chromatographic peak and, peak alignment and MS/MS profile. Identified metabolites were classified either as level I when peak matched to m/z and retention from an in-house library prepared from authentic standards, or as level II based on their spectral similarities with public/commercial spectral libraries in accordance with the Metabolomics Standards Initiative guidelines [30].

Phenotypic data analysis

All statistical analyses were conducted in R v.4.1.2 [31]. The meta-package tidyverse v.1.3.1 was employed for data processing and visualization [32]. In order to analyze the impact of the treatment effect on morpho-physiological traits and metabolites, mix-models were fit to each trait – after being transformed using the Box-Cox method – using the lmer function provided by the lme4 package [33]. The full model used in this study included treatment as a fix effect and genotype as a random effect. The reduced model included only genotype but not treatment. For each trait evaluated, the difference in fit between the full and reduced models were evaluated using the likelihood ratio test (LRT) to obtain p-values for the significance of treatment effects. P-values from metabolite data analysis were corrected for multiple tests using false discovery rate (FDR) [22], and values below 0.05 were considered to be statistically significant.

A more complex model which, in addition to treatment (nitrogen) as a fixed effect and genotype as a random effect, also included genotype by environment (GxE) interaction as random effects was fit for each metabolite in order to estimate total variance potentially explainable by each of these three factors [19, 34]. Metabolites for which variance estimated from the model for one or more parameters were zero, or close to zero, and therefore singular fit of the model was obtain, were excluded for analysis.

Broad-sense heritabilities were estimated from the following equation:

$$\begin{aligned} \textit{H}^2=\frac{\sigma ^2_g}{\sigma ^2_g + \frac{\sigma ^2_e}{n}} \end{aligned}$$

where \(\sigma\)2g is genetic variance, \(\sigma\)2e is residual variance, and n is the number of replicates. Variances were obtain from mixed model fitted separately to values from each experimental conditions with genotypes treated as random effect.

Principle component analysis for metabolite values across the 96 samples were calculated using the PCA function provided by the FactoMineR package [35]. Pearson correlation analysis between yield and metabolites were done with cor.test function in R.

Yield predictions were done based on three metabolite data sets: all identified metabolites (\(n=3,496\)), metabolites with confident annotation (\(n=145\)), and the same number of metabolites with unknown annotation (\(n=145\)). Analysis were done with caret framework [28]. Random forest were fitted with ranger package [36] and elastic-net regression with glmnet package [37]. Prior to the analysis, yield and metabolites values were Box-Cox transformed and scaled with preProcess function. Repeated 100x times five-fold cross validation were used to determinate optimal parameters for each model based on minimization root mean square error (RMSE). Importance value were calculated with varImp function from caret package based on permutation. The mean squared error is computed on the out-of-bag data for each model, and then the same computed after permuting a single variable. The differences are averaged and normalized by the standard error and scaled to values between 0 and 100.

Availability of data and materials

The hyperspectral datasets generated and analysed during the current study are available in the Ecosis repository, The metablomic datasets generated and analysed during the current study are available in the Metabolight repository,, under accession number: MTBLS4951.


  1. Malthus TR. An essay on the principle of population, as it affects the future imporvement of society, with remarks on the speculations of Mr. Godwin, M. Condorcet, and other writers. London: J. Johnson; 1798.

    Google Scholar 

  2. Erisman JW, Sutton MA, Galloway J, Klimont Z, Winiwarter W. How a century of ammonia synthesis changed the world. Nat Geosci. 2008;1(10):636–9.

    Article  CAS  Google Scholar 

  3. Zhang X, Davidson EA, Mauzerall DL, Searchinger TD, Dumas P, Shen Y. Managing nitrogen for sustainable development. Nature. 2015;528(7580):51–9.

    Article  CAS  Google Scholar 

  4. Rothstein SJ. Returning to our roots: making plant biology research relevant to future challenges in agriculture. Plant Cell. 2007;19(9):2695–9.

    Article  CAS  Google Scholar 

  5. Foley JA, Ramankutty N, Brauman KA, Cassidy ES, Gerber JS, Johnston M, et al. Solutions for a cultivated planet. Nature. 2011;478(7369):337–42.

    Article  CAS  Google Scholar 

  6. Ramankutty N, Mehrabi Z, Waha K, Jarvis L, Kremen C, Herrero M, et al. Trends in global agricultural land use: implications for environmental health and food security. Ann Rev Plant Biol. 2018;69:789–815.

    Article  CAS  Google Scholar 

  7. Raun WR, Johnson GV. Improving nitrogen use efficiency for cereal production. Agron J. 1999;91(3):357–63.

    Article  Google Scholar 

  8. Hakeem KR, Ahmad A, Iqbal M, Gucel S, Ozturk M. Nitrogen-efficient rice cultivars can reduce nitrate pollution. Environ Sci Pollut Res. 2011;18(7):1184–93.

    Article  CAS  Google Scholar 

  9. Cañas RA, Quilleré I, Gallais A, Hirel B. Can genetic variability for nitrogen metabolism in the developing ear of maize be exploited to improve yield? New Phytol. 2012;194(2):440–52.

    Article  Google Scholar 

  10. Liu Y, Wang H, Jiang Z, Wang W, Xu R, Wang Q, et al. Genomic basis of geographical adaptation to soil nitrogen in rice. Nature. 2021;590(7847):600–5.

    Article  CAS  Google Scholar 

  11. Cormier F, Faure S, Dubreuil P, Heumez E, Beauchêne K, Lafarge S, et al. A multi-environmental study of recent breeding progress on nitrogen use efficiency in wheat (Triticum aestivum L.). Theor Appl Genet. 2013;126(12):3035–48.

    Article  Google Scholar 

  12. Kant S, Bi YM, Rothstein SJ. Understanding plant response to nitrogen limitation for the improvement of crop nitrogen use efficiency. J Exp Bot. 2011;62(4):1499–509.

    Article  CAS  Google Scholar 

  13. Amiour N, Imbaud S, Clément G, Agier N, Zivy M, Valot B, et al. The use of metabolomics integrated with transcriptomic and proteomic studies for identifying key steps involved in the control of nitrogen metabolism in crops such as maize. J Exp Bot. 2012;63(14):5017–33.

    Article  CAS  Google Scholar 

  14. Gao K, Chen F, Yuan L, Zhang F, Mi G. A comprehensive analysis of root morphological changes and nitrogen allocation in maize in response to low nitrogen stress. Plant Cell Environ. 2015;38(4):740–50.

    Article  CAS  Google Scholar 

  15. Banerjee BP, Joshi S, Thoday-Kennedy E, Pasam RK, Tibbits J, Hayden M, et al. High-throughput phenotyping using digital and hyperspectral imaging-derived biomarkers for genotypic nitrogen response. J Exp Bot. 2020;71(15):4604–15.

    Article  CAS  Google Scholar 

  16. Obata T, Witt S, Lisec J, Palacios-Rojas N, Florez-Sarasa I, Yousfi S, et al. Metabolite profiles of maize leaves in drought, heat, and combined stress field trials reveal the relationship between metabolism and grain yield. Plant Biol. 2015;169(4):2665–83.

    CAS  Google Scholar 

  17. Casa AM, Pressoir G, Brown PJ, Mitchell SE, Rooney WL, Tuinstra MR, et al. Community Resources and Strategies for Association Mapping in Sorghum. Crop Sci. 2008;48(1):30–40.

    Article  Google Scholar 

  18. Ge Y, Atefi A, Zhang H, Miao C, Ramamurthy RK, Sigmon B, et al. High-throughput analysis of leaf physiological and chemical traits with VIS-NIR-SWIR spectroscopy: a case study with a maize diversity panel. Plant Methods. 2019;15(1):1–12.

    Article  CAS  Google Scholar 

  19. de Jong M, Tavares H, Pasam RK, Butler R, Ward S, George G, et al. Natural variation in Arabidopsis shoot branching plasticity in response to nitrate supply affects fitness. PLoS Genet. 2019;15(9): e1008366.

    Article  Google Scholar 

  20. Thurber CS, Ma JM, Higgins RH, Brown PJ. Retrospective genomic analysis of sorghum adaptation to temperate-zone grain production. Genome Biol. 2013;14(6):1–13.

    Article  Google Scholar 

  21. Miao C, Xu Y, Liu S, Schnable PS, Schnable JC. Increased power and accuracy of causal locus identification in time series genome-wide association in sorghum. Plant Physiol. 2020;183(4):1898–909.

    Article  CAS  Google Scholar 

  22. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 1995;57(1):289–300.

    Google Scholar 

  23. Sheflin AM, Chiniquy D, Yuan C, Goren E, Kumar I, Braud M, et al. Metabolomics of sorghum roots during nitrogen stress reveals compromised metabolic capacity for salicylic acid biosynthesis. Plant Direct. 2019;3(3): e00122.

    Article  Google Scholar 

  24. Sun G, Wase N, Shu S, Jenkins J, Zhou B, Chen C, et al. The genome of stress tolerant crop wild relative Paspalum vaginatum leads to increased biomass productivity in the crop Zea mays. bioRxiv. 2021.

  25. Maurino VG, Peterhansel C. Photorespiration: current status and approaches for metabolic engineering. Curr Opin Plant Biol. 2010;13(3):248–55.

    Article  Google Scholar 

  26. Xu G, Lyu J, Obata T, Liu S, Ge Y, Schnable JC, et al. A historically balanced locus under recent directional selection in responding to changed nitrogen conditions during modern maize breeding. bioRxiv. 2022.

  27. Liland KH, Mevik BH, Wehrens R. pls: Partial Least Squares and Principal Component Regression. 2021. R package version 2.8-0. Accessed 7 Oct 2022.

  28. Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28(1):1–26.

    Google Scholar 

  29. Tsugawa H, Cajka T, Kind T, Ma Y, Higgins B, Ikeda K, et al. MS-DIAL: data-independent MS/MS deconvolution for comprehensive metabolome analysis. Nat Methods. 2015;12(6):523–6.

    Article  CAS  Google Scholar 

  30. Sumner LW, Amberg A, Barrett D, Beale MH, Beger R, Daykin CA, et al. Proposed minimum reporting standards for chemical analysis. Metabolomics. 2007;3(3):211–21.

    Article  CAS  Google Scholar 

  31. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria; 2021. Accessed 7 Oct 2022.

  32. Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, et al. Welcome to the tidyverse. J Open Source Softw. 2019;4(43):1686.

    Article  Google Scholar 

  33. Bates D, Mächler M, Bolker B, Walker S. Fitting Linear Mixed-Effects Models Using lme4. J Stat Softw. 2015;67(1):1–48.

    Article  Google Scholar 

  34. Brommer JE. Variation in plasticity of personality traits implies that the ranking of personality measures changes between environmental contexts: calculating the cross-environmental correlation. Behav Ecol Sociobiol. 2013;67(10):1709–18.

    Article  Google Scholar 

  35. Lê S, Josse J, Husson F. FactoMineR: A Package for Multivariate Analysis. J Stat Softw. 2008;25(1):1–18.

    Article  Google Scholar 

  36. Wright MN, Ziegler A. ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. J Stat Softw. 2017;77(1):1–17.

    Article  Google Scholar 

  37. Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010;33(1):1–22 Accessed 7 Oct 2022.

Download references


Not applicable.


This project was supported by U.S. Department of Energy, Grant no. DE-SC0020355 to JCS, the National Science Foundation under grant OIA-1826781, USDA-NIFA under the AI Institute: for Resilient Agriculture, Award No. 2021-67021-35329 and the Foundation for Food and Agriculture Research Award No. 602757. This project was enabled by the work of the the Proteomics & Metabolomics Facility (RRID:SCR_021314) of Nebraska Center for Biotechnology at the University of Nebraska-Lincoln, utilizing instrumentation are supported by the Nebraska Research Initiative.

Author information

Authors and Affiliations



J. C. S., M. W. G., Y. G. and S.A. conceived and designed the study. M. W. G., M. Z., H. J., N. K. W., A. A., M. N. and S. A. collected data. M. W. G. analysed the data with guidance from J. C. S. and S. A. M. W. G and J. C. S. wrote initial draft of manuscript with input from S. A. All authors approved the final version of the manuscript.

Authors’ information

Not applicable.

Corresponding authors

Correspondence to Marcin W. Grzybowski or James C. Schnable.

Ethics declarations

Ethics approval and consent to participate

The collection of plant material complies with relevant institutional, national, and international guidelines and legislation. This study did not require ethical approval or consent, plant samples used in the study were not included in the list of national protected plants and not collected from national park or natural reserve. The permission was obtained from University of Nebraska-Lincoln for the collection of sorghum samples.

Consent for publication

Not applicable.

Competing interests

James C. Schnable has equity interests in Data2Bio, LLC; Dryland Genetics LLC; and EnGeniousAg LLC. He is a member of the scientific advisory board of GeneSeek and currently serves as a guest editor for The Plant Cell. The authors declare no other conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary file 1:

Table S3. and Figures S1-10.

Supplementary file 2:

Table S1. Plot-level values of morpho-physiological traits.

Supplementary file 3:

Table S2. Raw ground truth and spectral data used to build PLSR models and predict three traits.

Supplementary file 4:

Table S4. Summary statistics for 145 annotated metabolites.

Supplementary file 5:

Table S5. Correlation values between yield and 145 metabolites.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Grzybowski, M.W., Zwiener, M., Jin, H. et al. Variation in morpho-physiological and metabolic responses to low nitrogen stress across the sorghum association panel. BMC Plant Biol 22, 433 (2022).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: