Genomic prediction accuracy for switchgrass traits related to bioenergy within differentiated populations

Background Switchgrass breeders need to improve the rates of genetic gain in many bioenergy-related traits in order to create improved cultivars that are higher yielding and have optimal biomass composition. One way to achieve this is through genomic selection. However, the heritability of traits needs to be determined as well as the accuracy of prediction in order to determine if efficient selection is possible. Results Using five distinct switchgrass populations comprised of three lowland, one upland and one hybrid accession, the accuracy of genomic predictions under different cross-validation strategies and prediction methods was investigated. Individual genotypes were collected using GBS while kin-BLUP, partial least squares, sparse partial least squares, and BayesB methods were employed to predict yield, morphological, and NIRS-based compositional data collected in 2012–2013 from a replicated Nebraska field trial. Population structure was assessed by F statistics which ranged from 0.3952 between lowland and upland accessions to 0.0131 among the lowland accessions. Prediction accuracy ranged from 0.57–0.52 for cell wall soluble glucose and fructose respectively, to insignificant for traits with low repeatability. Ratios of heritability across to within-population ranged from 15 to 0.6. Conclusions Accuracy was significantly affected by both cross-validation strategy and trait. Accounting for population structure with a cross-validation strategy constrained by accession resulted in accuracies that were 69% lower than apparent accuracies using unconstrained cross-validation. Less accurate genomic selection is anticipated when most of the phenotypic variation exists between populations such as with spring regreening and yield phenotypes. Electronic supplementary material The online version of this article (10.1186/s12870-018-1360-z) contains supplementary material, which is available to authorized users.


Background
The ability to accurately predict performance and breeding values of crops and livestock based on large numbers of genetic markers in the absence of phenotypic data has been the focus of intense research efforts and is now being applied in advanced breeding programs [18,35,56]. This approach, known as genomic selection (GS), would be useful, in theory, for perennial grasses such as switchgrass that require 2-3 seasons to determine their full yield potential. In dairy cattle, the generation interval has been reduced from 4 to 7 y to approximately 2.5 y, allowing annual rates of genetic gain to increase from5 0-400% depending on the heritability of the trait [18]. In crops such as maize and wheat, predictions show that GS accuracies as low as 0.2 can achieve greater annual genetic gains than marker-assisted selection breeding schemes [24]. In perennial species, annual genetic gains and cost per unit of gain are also predicted to be higher with GS [20,48,64]. This can be the case with switchgrass (Panicum virgatum L.) if genomic estimates of breeding values (GEBV) are sufficiently accurate and the generation interval can be shortened.
Switchgrass is an attractive, perennial biomass crop for the US, due to its favorable net energy yield as well as several environmental and economic advantages [51]. Switchgrass is open pollinated and highly self-incompatible with ample genetic diversity [53]. The genetic variation between cultivars can differ 3-5 fold for biomass [54], 10% for lignin content [4], and 85% for water-use efficiency [28]. Some switchgrass lines reach heights of more than 3 m, and have vigorous root systems that may extend to depths of more than 3.5 m [41]. Phenotypic variation in many traits is sufficient for genetic analysis and breeding. Breeding programs have focused on recurrent selection of synthetic cultivars from wild switchgrass populations, targeting traits such as (1) high biomass yields, (2) improved seedling establishment, and (3) increased feedstock quality [3]. In the upper southeastern US, switchgrass biomass yields average more than 14 Mg ha − 1 [17], and in some locations more than 25 Mg ha − 1 , suggesting greater yield potential with further improvement by breeding.
Defining the type of selection scheme and relative advantage of implementing GS in forages, it was found that under certain scenarios GS could be advantageous over phenotypic selection. For allogamous species such as switchgrass the GS advantage can result in increases of up to 2.6 fold [52].
To date, genomic prediction in switchgrass has been attempted in panels of northern-adapted upland and lowland germplasm and in two populations of half-sib families, one upland and one derived from the Liberty cultivar [5,31,47]. These reports employed a variety of different modeling and genotyping methods and represented different population structures and unique environments. Reported accuracies have been from over 0.5 to negative ranges depending on the trait, environment, and composition of the calibration set (CS) and validation set (VS).
The composition of the CS has been shown to be critical for improving the accuracy of GS, particularly under situations where there is population structure [26]. Population structure can produce false associations in genome-wide association studies [45,66]. It can introduce upward bias to estimates of heritability [58], and can introduce bias into genomic prediction [32,49,65]. Capturing as much phenotypic variation as possible in the CS is desirable [26]. However, significant population structure should be accounted for when evaluating performance of the CS and genomic prediction accuracies decline as the relationship between CS and VS becomes weaker [23].
In this study, the impact of population structure on estimations of genomic heritability and on genomic prediction was assessed using both compositional and morphological traits in switchgrass related to bioenergy. The genetic trials were conducted with space planted populations with unknown pedigree relationships. These populations were derived from both upland and lowland ecotypes and therefore represent high diversity. Four different methods for genomic prediction were tested and their performance compared. Heritability was partitioned into across and within-population components that may have specific contributions to prediction accuracy.

Plant populations
Experiments were established from greenhouse-grown seedlings in 2009 in the field at the Eastern Nebraska Research and Extension Center (ENREC) near Mead, NE. One hundred twenty-five seedlings were planted for each of five populations in 5 rows of 25 seedlings on 1.1 m centers. Populations were randomly assigned to plots and genotypes within populations were randomly assigned to individual rows and positions within the plots. No fertilizers were applied in the establishment year while in post-establishment years 112 kg ha − 1 nitrogen was applied in the spring as NH 4 NO 3 . In establishment and post-establishment years herbicides and hand-weeding were used to control annual weeds. Atrazine (6-chloro-N-ethyl-N-isopropyl-1,3,5,-triazine-2,4-diamine), 2,4 -D (2,4-dichlorophenoxyacetic acid) and S-metolachlor [2-chloro-N-(2-ethyl-6-methylphenyl)-N-(2-methoxy-1-methyl) acetamide] were applied at rates of 1.96, 1.48, and 1.26 kg ha − 1 respectively once in late spring or early summer. In 2011 two ramets were dug from each genotype in the original plots and used to establish two additional randomized plots per population, keeping the populations separate blocks. Residual plot stubble was burned in April 2011 and again in April 2012 prior to the onset of spring regrowth.
The five populations used in this study were either obtained from NRCS Plant Material Centers or were harvested from isolated polycross blocks near Mead, NE which were maintained at least 500 m from other switchgrass fields. All populations were assumed to be tetraploid based on breeding history. A summary of the populations is presented in Table 1. Kanlow (K) is a released tetraploid lowland cultivar with high yield. Kanlow N1 (N1) has been previously described [59] and represents a population that had survived 4 years without winter kill in Nebraska. Summer (S) is a tetraploid, upland cultivar adapted to more northern latitudes with good winter survival. The seed lot used to establish our Summer population represented a bulk harvest of a selection nursery comprised of 1000 individuals with high yield and good vigor. The Kanlow N1-EM (N1EM) population is an early flowering population of plants selected from the Kanlow N1 base. The Kanlow x Summer F2 (KxS) population represents a population two generations removed from the initial randomly intercrossed parental populations in which Kanlow was used as the male parent.

Phenotypic analysis Spring emergence
Spring emergence (GRN) was determined in 2013 as the day of year (DOY) when green shoots visibly emerged from winter dormancy.

Yield
All plots were harvested on a single clone basis after a killing frost between November 2 and 7, 2012 using a Carter Harvester (Carter Manufacturing Co., Brookston, IN) set at a cutting height of 10 cm. Fresh weights of individual clones were determined and subsamples were taken to determine percent moisture and provide an estimate of dry weight. Yield (YLD) was reported as dry weight corrected for the weight of the subsample.

Anthesis
The anthesis date (ANT) was determined as the day of year when 50% of the mature tillers of each clone reached the anther emergence/anthesis R4 stage [39].

Near infrared spectroscopy data
Harvested samples were dried in a forced air oven at 60°C. These samples were then ground in a Wiley mill to pass a 2 mm test sieve and later in a Cyclone Mill to pass a 1 mm test sieve. Ground samples were scanned using a Model 6500 near-infrared spectrometer (FOSS NIRSystems, Inc., Laurel, MD). Estimates of over 20 biomass components, including theoretical yields of simple sugars and ethanol, were obtained from broad-based near infrared spectroscopy (NIRS) calibrations. Descriptions of the traits can be found in Table 2 and in more detail in [61].

Phenotypic data analysis
Estimates of best linear unbiased predictors (BLUP) values for each trait across replicates, genotypic variance (σ G 2 ), error variance (σ e 2 ), and repeatability (H 2 ) were obtained using the restricted maximum likelihood method in the "lme4" package [2] in R (R Core Team). Broad-sense heritability, equivalent to line-based repeatability across plots, was estimated from the variance components obtained with the BLUP model: where n refers to the average number of replicates per genotype. Data was centered and scaled by the standard deviation prior to analysis to allow simpler comparisons among methods and traits and some outliers were removed.

Genotyping
Leaf tissue samples from the original plots prior to establishing replicates were collected and genomic DNA was extracted using a CTAB method [6]. Genotypingby-sequencing library production followed a previously established protocol [13] with a few modifications. Briefly, gDNA was digested with PstI and ligated to Illumina adapters with a custom set of sequence index barcode tags. Barcoded DNA was subsequently adapted with Illumina sequencing primers and purified with a Qiagen PCR cleanup kit. DNA libraries were pooled (96 samples per flow-cell land) and single-end reads were generated on an Illumina HiSeq 2000 at the Vincent J. Coates Genomics Sequencing Laboratory, University of California, Berkeley. Sequencing generated 1.25 × 10 9 100-bp single-end reads that were fed to the TASSEL-GBSv2 pipeline [19] and generated a total of 810 million 64-bp sequence tags of which 721 million could be matched to the master tag database aligned to version 4.0 of the switchgrass genome assembly (https://www.phytozome.jgi.doe.gov) using BWA [30]. For the alignment, only the 18 main chromosomes were used. Prediction and annotation of single nucleotide polymorphism (SNP) effects were performed using SNPEff [9]. The number of tags across individual taxons ranged from 13,777 to 5,104,802 with an  average of 1,229,285. TASSEL detected a total of 70,215 SNP sites with 46% of potential genotype calls present. SNP filtering was then performed. First, genotypes with a depth of 1 were set to missing which eliminated 2.33% of the individual SNP calls. Second, individuals were filtered out if they contained missing information for > 75% of the SNP positions. This resulted in the loss of 104 individuals (16% of the population). Third, sites with a minor allele frequency of < 0.025 and with genotype calls in fewer than 50% of all individuals were eliminated. Fourth, sites in complete LD with other sites were pruned. After applying these filters, a total of 19,342 SNPs remained across 483 individuals with 71% of the potential genotype calls present.

Genetic differentiation and genomic heritability estimates
For evaluating genomic selection methods and heritability imputation of missing genotypes was necessary and was conducted using the LD-kNNi algorithm where the l sites most in LD with the site to be imputed are used to determine the nearest neighbors and the weightings to be used when imputing [38]. In the imputed dataset the parameters k = 13, l = 8 were applied. Accuracy was determined by masking and imputing 10,000 known genotypes and was estimated at 88.5%.
Population differentiation was measured on the imputed dataset with the F ST statistic using the method of Weir and Cockerham [62]. Individual clustering by principal components analysis was performed with the genotypic data for the purposes of visualizing population structure [33].
For the purposes of predicting genomic breeding values (GEBV) and estimating genomic heritability within and across populations, additive genetic variance (σ G 2 ) and residual variance (σ e 2 ) were derived from applying ridge-regression mixed modeling kinship-BLUP used a restricted maximum likelihood (REML) approach implemented in the R package rrBLUP version 4.5 [14,46]. The estimate of the realized additive relationship matrix was calculated as: where W is the centered (n x m) genotype matrix and p is the allele frequency [15]. The basic model was used where X is a full-rank design matrix for the fixed effects (β), Z is the design matrix for the random effects u, K is the additive relationship matrix A, and e is for the residuals (e~N (0, Iσ e 2 )) with I being the identity matrix.
The matrix K captures population structure due to admixture, genetic drift, and selection history as well as kinship. Here, marker-based estimates of heritability across and within subpopulations (h g 2 , h gA 2 , h gW 2 ) were derived based on the BLUP phenotypes and on orthogonal decomposition of the genetic variance components by reparameterizing the basic model above. This was accomplished in a manner similar to that of Guo et al. [21] based on initial work of [12,27]. After eigenvalue decomposition of K as where U is an n x (n-1) matrix of the eigenvectors of K with U i the column I (i = 1,2, …, n-1) representing the individual principal component loadings, and D is an (n-1) x (n-1) diagonal matrix with each diagonal element representing eigenvalues of K, arranged in decreasing value λ 1 , λ 2 , . λ n -1 . The basic model is then rewritten as and α becomes an (n-1) × 1 vector of random genetic effects that have a normal distribution N (0, Dσ G 2 ) with the principal components used as random variables that has the same distribution as the basic model yet allows separation of subpopulation structure explained by the dominant principal components (d) into acrosspopulation genetic variance, and within-population genetic variance, The total genetic variance was calculated as Corresponding marker-based trait heritability were obtained as [27]:
Prediction accuracy r (correlation of the VS GEBV with their BLUP values) was assessed for all methods using modified 5-fold cross-validation (CV) where the VS was constructed from specific populations or combinations of populations and divided into 5 sets that each comprised a single fold. The calibration set (CS) then consisted of the remaining individuals from all populations. Initial broad screening of all traits using 5-fold CV across the entire dataset ("All") enabled us to focus on traits with higher prediction accuracy. In general, analysis was limited to those 22 traits with prediction accuracies greater than 0.5 with the kin-BLUP method.
Each analysis, except for BayesB, was conducted using 20 replications of a 5-fold CV approach to estimate prediction accuracy within a single population or combination of populations. The subjects were divided at random into five sets. For each set, the phenotypic BLUP values (y) were sequentially used as a validation set (VS) by masking and predicting the valuesŷ based on the model parameters estimated with the other four sets and with inclusion of other genotypes from outside the subject group that served as the VS. The CS always included all individuals not included in the VS.
Accuracy was determined from the average phenotypic correlation rðŷ 2 ; y 2 Þ between the true phenotype (y) and GEBV ðŷÞ across folds and replications. The mean squared error (MSE) was estimated from 1 n P ðy−ŷÞ 2 where n is the number of individual VS genotypes used for each fold. Reported MSE values were averaged across folds and replicates.

Partial least squares method
The partial least squares method (PLS) introduced by Wold [63] decomposes the genotypic data W in orthogonal scores T called latent variables and loadings P that are obtained by projecting the predicted variables and the observable variables to a new space and regressing y not on W but on the first a columns of the scores T. The latent variables are obtained individually in an iterative process which are then used as new variables of a linear regression. This allows to account for collinearity among the predictor variables and instances with many more predictor variables than observations. The PLS method was implemented with the R packages 'pls' [36]. In this specific instance the first 15 latent variables were used for prediction.

Sparse partial least squares method
The sparse partial least squares method (SPLS) [29], like PLS, allows selection of a sparse set of explanatory variables and dimensionality reduction while maintaining good predictive ability. In the SPLS case, this is done through imposing constraints onto the explanatory variables that are included in the loadings P. The constraints are imposed through a threshold parameter λ 1. Values for this parameter as well as the optimal number of latent variables to use for each trait were obtained through a cross-validation approach. SPLS prediction was implemented using the 'SPLS' package [7,8].

BayesB method
This method allows a portion (π) of the markers to have large effects while most markers have no effects. Marker variance is unique for each marker. Marker effects were estimated with Monte Carlo Markov Chan simulations. The BayesB model was run with the 'BGLR' package in R that employs a Gibbs sampling algorithm [11,35]. A value of 0.05 for π was used with 5000 iterations and a 1500 iteration burn-in period and a sampling frequency of 5.

Read distribution and SNP diversity
The sequence tags aligned to a total of 148,557 unique genome loci or approximately 104 Mbp − 1 , of which 19,769 (13%) overlapped or intersected with annotated gene regions. This is comparable to the 15% of the total switchgrass genome annotated as gene region and demonstrates little ascertainment bias using GBS. Within these sequences, 19,342 SNPs were identified after filtering. These were primarily biallelic and are summarized in Table 3. Of the 19,342, 52.1% were transitions and 44% were transversions (TsTv = 1.183), 0.8% were single nucleotide indels, 3.6% were triallelic, and 0.1% were tetra-allelic. On average, there were 14.84 SNP Mbp − 1 detected with a strong significant relationship between the total number of SNPs and the annotated length of the chromosome (P < 0.001, Adjusted R 2 = 0.7791), and between SNP density and distance to the end of the chromosomes (P < 0.001, Adjusted R 2 = 0.7147). A total of 5118 SNPs were annotated in coding sequences with 65% of the non-reference alleles caused by either missense or nonsense mutations, 2216 were in introns in gene regions, 786 were annotated in the 3' UTR region, and 670 were annotated in the 5' UTR region.

Differentiation of the populations
Population means and repeatability values for whole plant measurements and NIR traits are shown in Table  2. MANOVA tests demonstrated significant influences of population of origin on both morphological and NIR-compositional data. Population yield averages ranged from 0.6 kg dry matter YLD for Summer to 1.47 kg for Kanlow. Timing of anthesis and spring emergence were different for each population. Anthesis ranged from July 20 (202 DOY) for Summer to September 2 (246 DOY) for Kanlow while spring regreening ranged from May 10 (130 DOY) for Summer to May 17 (137 DOY) for Kanlow. Individually, the Pearson product-moment correlation coefficient r between YLD and GRN was 0.22 while r between YLD and ANT was 0.43 (Additional file 1: Figure S1). There were also significant positive correlations within two groups of NIR-estimated cell wall properties (Additional file 1: Figure S2). These two groups were: (1) GLC, KL, ADL, NDF, DM, FUC; and (2) N, IVDMD, MAN, UA, RHA, ARA, GAL. There were significant negative correlations between these two groups of properties. Repeatability which, in this case, represents an upper limit of broad-sense heritability ranged from 0.94 for ANT DOY to 0.34 for etherified ferulates (FETH).
Population fixation statistics (F ST ) were evaluated over all SNPs weighted by the population size and are reported in Table 4. Most of the genetic variation that can be explained by population structure is due to differences between the K, N1, and N1EM populations (as a group) and the S populations (F ST~0 .39-0.40). The K, N1, and N1EM populations were all closely related (F ST 0.01-0.03) to each other. The hybrid-based KxS population share more similarities with both S and K-based populations (F ST~0 .13-0.16) than these were to one another,. This indicated that there was significant genetic differentiation between the lowland-derived (K, N1, N1EM) and the upland (S) populations.
Comparing the K, N1, and N1EM populations as a group with the S population, no fixed differences (d f ) were found whereby the lowland populations would be homozygous for one allele and the upland population homozygous for another. However, 3715 sites were fixed in either group and segregating in the other.
Before performing the genomic predictions, the missing data was imputed using a k-nearest neighbor imputation method (LD-kNNi) for unordered markers. This approach uses pair-wise LD information from the markers that are the most tightly linked (i.e. in LD) in Missense, silent, and nonsense were relative to reference AP13 genotype the genotypes to be imputed. Imputation resulted in 3,139,549 genotypes being added or 28% of the dataset. Prediction accuracy of the missing data was estimated to be 88.5% based on masking and imputing 10,000 known genotypes.
Inference of population structure in the five populations based on PCA of the molecular data showed that the first three principal components accounted for 4.6-22.4% of the variation to jointly explain 40.4% (Fig. 1a). The populations can be completely resolved by the first PC (Fig. 1b, c) into three clusters represented by the lowland accessions (K, N1, and N1EM), the upland accession (S), and the hybrid accession (KxS) that is positioned between the two other groups. This result agrees with the group means of many of the phenotypes that indicate significant differences between these three populations (Table 2 and Additional file 1: Figure S3) as well as with the populations fixation statistics. Previous studies have also reported genetic differentiation between upland and lowland populations [16,37,40].
The first three PC were used to assess the impact of population structure on genomic heritability (h g 2 ) for the entire data set by orthogonal decomposition of the genetic variance components. Separate genomic heritability estimates (h g 2 , h gA 2 , and h gW 2 ) for selected traits are presented in Table 5 and the variance estimates are provided in Additional file 1: Table S5. Estimates for h g 2 ranged from 0.57 for ANT and SC to 0.15 for YLD. The highest estimate for h gA 2 was 0.43 for ANT while the lowest was 0.14 for ASH and YLD. More important for the accuracy of genomic selection applications, h gW 2 estimates ranged from 0.01 for YLD to 0.31 for NSC. The h gW 2 for ASH represented 62% of the total while the h gW 2 for NSCE represented 60% of the total. Collectively, the phenological traits (ANT and GRN) and YLD were most influenced by population structure while the NIR-compositional traits were influenced to a variable degree. The h gA 2 for NIR compositional traits ranged from 38 to 74% of total heritability.

Genome-wide patterns of linkage disequilibrium
The accuracy of genomic selection is dependent on the degree of marker disequilibrium with underlying QTL for traits of interest. Marker data was used to model disequilibrium with potential QTL loci and calculated pairwise r 2 for all markers positioned on the same chromosome. These were then sorted by distance and the results for the 0-1 kb distance interval are plotted in  Filtering the data, based on criteria including depth of sequence coverage required per genotype, data-missingness, or minor-allele frequency, had nominal effects on r 2 values and resulted in loss of information as SNP sites were filtered out. Non-linear curve fitting using an exponential distribution was used to estimate the distance threshold for an r 2 = 0.2. Changing the sequence depth required to call a genotype from 2 to 8 resulted in an approximate 6% increase in the threshold distance. Changing the amount of missing data allowed from 25 to 50% resulted in a 19% increase in threshold distance, while filtering sites based on minimum minor allele frequency from 0.05 to 0.20 resulted in a 380% increase in threshold distance. However, these changes only resulted in a small overall increase in threshold to 76 bp. The threshold for the S population was 69 bp while that for the combined lowland accessions (Low) was 58 bp. The KxS population had a lower threshold than either the S or K populations.
Because r 2 is known to vary across the genome we attempted to look at trends across single chromosomes, but the data was too sparse to detect these variations reliably. Overall the threshold value ranged from 33 bp (Chr7N) to 82 bp (Chr3N) across all 18 chromosomes. Threshold values were not significantly different between the N and K subgenomes.

Genomic prediction
Prediction accuracy results are summarized in Table 6 and Additional file 1: Tables S1-S4. In all cases, due to population structure, GEBVs were more accurate when the VS was sampled across the entire dataset (unconstrained). The highest accuracy using this CV method was obtained for Flowering DOY (ANT) using kin-BLUP (r = 0.88) while the following traits (FEST, HEX, PCA, PPEN, GRN, IVDMD, FAT, SUC, YLD, SC, GLCS, ETOH, HEXE, UA, NSC, FRU, ASH, AX, HEXEP, PSOL) were above 0.5 under the same conditions. When using a CV method where the VS was constrained to consist exclusively within one of either K, N1, N1EM, S, or KxS populations, ANT accuracy ranged from 0.46 in the N1 population to 0.06 in the K population and across different traits r ranged from 0.57 for Fructose (FRU) (MSE = 0.18) in the KxS population to − 0.23 (MSE = 0.05) for GRN in the KxS population. On average, across all traits and populations, accuracy was reduced 69.6% from the unconstrained CV strategy using kin-BLUP and similarly for the other prediction methods. GEBVs were slightly more accurate when the "Low" population was used as a VS rather than the individual K, N1, or N1EM populations and accuracies were reduced only 66.9% from the unconstrained CV strategy.
The influence of increasing SNP density as well as the training population size on prediction accuracy was analyzed as before using kin-BLUP with 5-fold CV across all individuals for three test traits: IVDMD, YLD, and ANT. The results are shown in Fig. 3a and b. For these traits    increasing the number of SNPs beyond 50 that were included in the CS improved accuracy 50.0-62.2% with little improvement seen beyond 3000 SNP. The effect of increasing the number of individuals comprising the CS from 40 to 386 resulted in a 7.6-10.1% improvement in prediction accuracy and a decrease in its standard deviation across replicates and folds. When the CV approach was unconstrained by population different regression techniques produced very Table 6 Genomic prediction accuracy (r) for indicated traits using kin-BLUP, partial least squares (PLS), sparse partial least squares (SPLS), and BayesB (BB) regression methods and different cross-validation strategies (Continued)  . When the CV strategy was constrained by individual population or within the lowland ("Low") populations collectively, the results among regression techniques were more variable. The greatest variation among methods occurred for the trait·population combinations where maximum prediction accuracy was below 0.3. However, the coefficient of variation was greater than 20% between methods for some trait·population combinations with accuracies above 0.3. These included the NSCE·S, PPEN·KxS, NSC·S, PCA·Low, GRN·S, and SC·S combinations. In these cases, SPLS or PLS were the most reliable prediction methods, while BB was the least reliable. Analysis of variance was used to quantify the influences of trait, population, and prediction method on both accuracy and bias of GEBV measurements (Additional file 1: Table S6). GEBV prediction accuracy was heavily influenced by both CV strategy and trait and their second-order interactions with prediction method. The influence of prediction method alone was only suggestive of a significant effect. Bias was also impacted to a large degree by population and trait and their interaction effects while the second-order interaction of trait x prediction method was only suggestive of some influence.

GBS
Genotyping by sequencing approaches have been used with varying numbers of loci and different genome coverages for GS purposes in many species. Previously in switchgrass, both reference based and non-reference based SNP calling methods have been applied to GBS data and these methods have provided evidence that marker discovery and GS breeding approaches may be both cost-effective and accurate [5,31,44]. In this study, the ability to call SNP accurately was facilitated with the TASSEL-GBSv2 pipeline. Numbers of tags and SNPs obtained using this method were comparable to other species such as oil palm (Elaeis guineensis; 19,432 SNPs) and wheat (Triticum aestivum; 34,749) [10,43]. The tags and SNPs aligned unevenly across the reference genome with more located toward the ends of each chromosome and with no significant bias toward either sub-genome. Using PstI with a recognition sequence of 'CTGCAT' to assemble the libraries may have resulted in fewer markers aligned to repetitive elements of low GC content and in more markers associated with the coding or regulatory regions of the reference sequence located in subtelomeric regions. This uneven distribution is difficult to separate from the current state of the available switchgrass reference genome which is fragmentary, but consistent with known gene distribution in other grass species such as sorghum (Sorghum bicolor; [42]).

Population structure
Many of the trait phenotypes and the molecular variation displayed moderate to severe population structure. This was expected given the breeding history and origination of these lines. The lowland populations K, N1, and N1EM were less differentiated among themselves. Large differences in the estimation of genomic heritability occurred depending on population and trait. Except for a few traits (PSOL, NSC, ASH, FAT), most of the heritability existed across populations rather than within populations. This contrasts with some findings in maize (Zea mays) and rice (Oryza sativa) populations where most heritability could be explained within populations [21].
Accuracy and reliability of genomic predictions, have been shown to decline as the distance between validation and training set increases [26]. In order to avoid this situation, random sampling combined with a large CS enabled capturing most of the existing diversity in the CS across populations. Others have dealt with this issue by stratified sampling across populations to a b Fig. 3 Differing numbers of SNPs or numbers of individuals in the kin-BLUP model. SNPs and individuals were randomly sampled at the indicated levels and then used to construct the realized relationship matrix. Error bars: ± standard deviation of predictive ability (accuracy). a Different numbers of SNPs using all available individuals from all populations. b Random subsampling of the numbers of individuals using all available SNPs optimize the calibration set or have used criterion such as reliability [57], prediction error variance, or coefficients of determination [1,26,50]. These choices are particularly relevant within a breeding program and depend on the interaction of trait architecture and degree of structure present.

Genomic prediction
The repeatability of measured traits was between 0.34 and 0.94. A strong positive relationship (R 2 = 0.7; data not shown) exists between repeatability and GEBV accuracy and more focus was given to the 21 traits that had both reasonable repeatability and accuracy above 0.5 for further analysis using different prediction approaches. These traits included total hexoses and ethanol potential, which are both important parameters for bioenergy, as well as IVDMD which is predictive of daily weight gains in cattle. Some traits that resulted in poor GEBV accuracy included CAL, GAL, N, STA, MAN, FUC, GLC and DM. One or more of these traits may have had better accuracy using a restricted VS. However, these traits were not examined due to the poor overall accuracy. Although yield was included in this study, yield data were collected in a spaced-planting trial whic is not an accurate indicator of yield in densely-planted (sward) plots with higher plant competition and mortality rates [5,44]. This fact complicates genomic selection strategies that include yield, but even including progeny testing in sward plots, GS breeding strategies can still shorten the generation interval. Prediction accuracy for switchgrass yield in swards has been reported to be between 0.145 and 0.237 depending on the genotyping platform [5]. Similar to previous examples of genomic prediction based on GBS [10], reducing the number of SNP markers used in analysis (to approximately 3000 in this case) did not adversely affect predictive ability when the VS was unconstrained among all populations. The accuracy of GEBVs using kin-BLUP increased up to the inclusion of 3000 SNP markers for YLD, ANT, and IVDMD. This could be due to the ability of relatively few markers to adequately capture the relationships between populations, as well as the relatively close relationships of individuals within each population. Higher marker coverage would be necessary to adequately capture more genetic variance associated with QTL in common between populations. Indeed, predictive ability within individual populations was poor in many cases and average inter-marker interval was much greater than the distance threshold for LD (r 2 ≤ 0.2). LD was found in this study to decline rapidly over short distances < 1000 bp. This rapid LD decay could be explained by the outcrossing nature of switchgrass [34], large effective population size, high recombination rates or other mechanisms [22]. These rates were similar to those reported for US inbreds and landraces in Zea mays [55] and in Picea abies [25]. Increasing the number of individuals included in the kin-BLUP analysis from 50 to 483 led to moderate, linear increases in average prediction accuracy with a lower standard deviation between replicates using 5-fold cross validation.
Differences between prediction methods were of interest as there are very few empirical examples of direct comparisons among these methods using actual data rather than simulated data. The PLS method employs dimensionality reduction while the sparse PLS combines both dimensionality reduction and variable selection. BayesB also uses variable selection as it allows some markers to have large effects and most to have a genetic variance of 0. Previously, results with switchgrass showed no discernible performance differences with three different prediction methods [31]. This study has shown also that the methods were overall very similar in prediction abilities across different traits and VS. Variation between methods was observed relative to trait·population combinations with lower accuracies. These combinations had higher standard deviations between replicates and across folds, so in large part this variation is likely to represent the random error component. However, several trait·population combinations stood out. These included the estimated ethanol from non-structural carbohydrates (NSCE), soluble glucose (GLCS), sucrose (SUC), and non-structural carbohydrates (NSC) particularly in the Summer population where the PLS method performed substantially better than the other methods. It is not surprising in this case that these traits should act similarly as these phenotypes were positively correlated and NSCE and NSC are calculated directly from the NIR estimates of STA, GLCS, FRU, and SUC [60]. Another combination that is interesting are the estimates of p-coumaric esters (PCA) and ester-linked ferulates (FETH). These traits are positively correlated and have the lowest values in the Summer population. The accuracies of the GEBV were relatively high when the VS consisted of any population, except Kanlow and in the specific case of PCA in Kanlow N1. This is only exceptional because, for these two traits, there appears to be no strong reason why there should be such a range of prediction accuracies across the different populations, within the same trait among the lowland populations, given they were so closely related based on PCA and F statistics.
The prediction accuracies in this set of data are comparable to those found among morphological and compositional traits in a Northern upland diversity panel [31]. In that study, some of the same traits were analyzed, the CS and VS were derived from all the lines, and the first two principal components of a PCA were used as explanatory variables to account for population structure. In the present study, the first three principle components were considered only for calculating heritability across populations while the populations were treated independently or in combination during the cross-validation step to evaluate model performance without attempting to use the PC as explanatory variables.

Conclusions
This study emphasizes the possible benefits of using genomic selection in switchgrass. There are still many improvements that can be made in both phenotyping and genotyping methods (such as efficient indirect methods to predict yield in highly competitive environments) to improve genetic gains more rapidly in switchgrass [5]. However, it is apparent that both morphological and compositional traits may be efficiently selected using indirect methods based on genotypic data alone and whole genome regression techniques. The key in the future will be to assess the cost effectiveness of these techniques in switchgrass, given the uncertainty and complexity of bioenergy production processes.

Additional file
Additional file 1: Figure S1. Correllelogram depicting positive (blue) and negative (red) correlations among whole plant traits. Color scale on right indicates Pearson correlation coefficient r. Figure S2. Correllelogram depicting positive (blue) and negative (red) correlations among wall composition traits determined by NIR. Color scale on right indicates Pearson correlation coefficient r. Figure S3. Boxplots of (a) ANT, (b) IVDMD, and (c) YLD for each population. Bottom and top of each box represent the first and third quartiles. Horizontal line represents the median, whiskers extend to the most extreme data point that is no more than 1.5 times the interquartile range from the box. Table S1. kin-BLUP regression statistics from 20 replicates of 5-fold CV. Table S2. Partial Least Squares regression statistics from 20 replicates of 5-fold CV. Table S3. Sparse Partial Least Squares Regression statistics from 20 replicates of 5-fold CV. Table S4. BayesB Regression statistics from 5-fold CV. Using 5000 iterations and a 1500 iteration burn-in period (see Methods Section). Table S5. Variance components for selected traits after partitioning based on dominant principal components 1-3.