Transcription factor encoding gene OsC1 regulates leaf sheath color through anthocyanidin metabolism in Oryza rufipogon and Oryza sativa

Carbohydrates, proteins, lipids, minerals and vitamins are nutrients commonly found in rice grains, but anthocyanidins, which benefit plant growth and animal health, occur mainly in common wild rice and hardly at all in cultivated rice. To screen rice germplasm with high anthocyanidin content and identify the underlying variations, we used metabolomics and detected significantly different accumulation of anthocyanidins between common wild rice (Oryza rufipogon, with purple leaf sheath) and cultivated rice (Oryza sativa, with green leaf sheath). In this study, we identified and characterized a well-known MYB transcription factor, OsC1, through phenotypic (leaf sheath color) and metabolic (metabolite profiling) genome-wide association studies (pGWAS and mGWAS) in 160 common wild rice (O. rufipogon) accessions and 151 cultivated rice (O. sativa) varieties. Transgenic experiments demonstrated that OsC1 regulates the biosynthesis and accumulation of cyanidin-3-Galc, cyanidin 3-O-rutinoside and cyanidin O-syringic acid, as well as purple pigmentation in the leaf sheath. A total of 25 sequence variations of OsC1 constructed 16 functional haplotypes (higher accumulation of the three anthocyanidin types with purple leaf sheath) and 9 non-functional haplotypes (lower accumulation of anthocyanidins with green leaf sheath). Three haplotypes of OsC1 were newly identified in our germplasm and have potential value in functional genomics and molecular breeding of rice. Gene-to-metabolite analysis by mGWAS and pGWAS provides a useful and efficient tool for functional gene identification and omics-based crop genetic improvement.

Supplementary Information The online version contains supplementary material available at 10.1186/s12870-024-04823-0.

Introduction
Rice, one of the most important cereal crops [1] in Asia and Southeast Asia, including China, faces growing demand for improved quality without loss of yield. In China, Guangdong province leads in rice genetic breeding owing to its rich rice germplasm and a climate well suited to rice growth and development [2]. Abundant rice germplasm provides the most important parental and gene resources for rice genetic breeding, whose three enduring themes are yield, resistance and quality [3]. Semi-dwarf breeding and hybrid rice breeding, known as the first and second green revolutions respectively, both benefited from exploring and utilizing excellent rice germplasm, such as the semi-dwarf rice variety 'Aizaizhan' and abortive common wild rice [4]. The demand for diversified cereals, such as colored rice with high anthocyanidin accumulation, is growing for the sake of better nutrition and health, whereas common rice grain mainly contains nutrients such as water, carbohydrates, proteins, lipids, minerals and vitamins, with little anthocyanidin.
Anthocyanidins, a class of water-soluble flavonoids, are one of the largest groups of secondary metabolites in plants. Anthocyanidins not only give various organs (leaves, leaf sheath, hull, awn, and so on) distinctive colors (purple, brown, or red), but also protect people from some chronic diseases, such as cancer, cardiovascular disease (CVD), non-alcoholic fatty liver disease (NAFLD), diabetes and obesity [5][6][7][8][9]. In addition, anthocyanidins play an important role in scavenging the reactive oxygen species that accumulate in plants under various biotic and abiotic stresses, such as ultraviolet (UV) radiation and attack by insects and pathogenic microorganisms [10][11][12][13][14]. Because of these benefits, more and more biologists and breeders are committed to exploring the molecular mechanism of the biosynthesis pathway and to breeding new crop varieties rich in anthocyanidins.
Anthocyanidin biosynthesis, which uses phenylalanine as a substrate, is catalyzed by a series of enzymes, such as CHS (chalcone synthase), CHI (chalcone isomerase), F3H (flavanone 3-hydroxylase), F3'H (flavonoid 3' hydroxylase), DFR (dihydroflavonol 4-reductase), ANS (anthocyanidin synthase) and UFGT (UDP-flavonoid glucosyl transferase), and is regulated by a conserved MBW (MYB-bHLH-WD40) complex [15,16]. In Arabidopsis thaliana, the MBW complex that activates anthocyanidin biosynthesis in vegetative tissues has been shown to consist of MYBs of SG5 and SG6, a basic helix-loop-helix subgroup, and the WD40 repeat protein TTG1 [16], whereas in maize it comprises C1/Pl1 (R2R3-MYBs), R1/B1 (bHLHs), and PAC1 (WD40) [17]. Although a growing number of traits, including grain size, panicle, callus induction, mesocotyl length, chlorophyll content, stigma exsertion, cold tolerance and drought tolerance, have been examined by genome-wide association study (GWAS) [18], aided by the rapid development of genome resequencing, GWAS-based research on the regulation of anthocyanidin biosynthesis in rice [19] lags behind that in A. thaliana and maize. In rice, five putative regulators of anthocyanidin biosynthesis were identified and characterized by comparative mapping to the homologous nucleotide sequences of known orthologues in maize, including an R2R3-MYB gene, OsC1, and four bHLH genes, Ra1/OsB1, Rb, Ra2 and OsB2 [20][21][22][23][24]. The R2R3-MYB gene OsC1 was demonstrated to be a determinant factor and a domestication-related gene of anthocyanidin biosynthesis in the leaf sheath of cultivated rice [25,26]. A 'C-S-A' gene system (OsC1-OsB2-OsDFR) was demonstrated to regulate hull pigmentation and to reveal the evolution of the anthocyanidin biosynthesis pathway in rice [27]. In short, although a few MYB and bHLH regulators have been identified and characterized in cultivated rice, together with analyses of sequence variation and evolution between cultivated and wild rice, these regulators remain to be identified and characterized in wild rice.
Although cultivated rice (O. sativa) hardly contains anthocyanidins and has green vegetative organs, common wild rice (O. rufipogon), the ancestor of cultivated rice, shows significantly higher anthocyanidin accumulation and purple vegetative organs (leaves, leaf blade, and leaf sheath). Screening rice germplasm with higher anthocyanidin accumulation and studying the regulation of anthocyanidins will benefit the breeding of cultivated rice varieties rich in anthocyanidins.
To screen rice germplasm with high anthocyanidin accumulation and identify the variations of the related regulator(s) in rice germplasm from Guangdong province, we used leaf sheath color and anthocyanidin accumulation to perform phenotypic and metabolic genome-wide association studies (pGWAS and mGWAS), respectively. We found that 146 of 160 (91.25%) wild rice accessions and 12 of 151 (7.95%) cultivated rice varieties showed purple leaf sheath with significantly higher accumulation of anthocyanidins; these could serve as parent plants in hybrid rice breeding for anthocyanidin accumulation. Additionally, a well-known MYB transcription factor encoding gene, OsC1, was functionally characterized in our rice germplasm, with three potentially new variations (two in common wild rice and one in cultivated rice) that resulted in green leaf sheath and low accumulation of anthocyanidins. Exploring the regulation of the anthocyanidin biosynthesis pathway in rice leaves would add insight into the anthocyanidin biosynthesis pathway in rice grains.

Plant materials and growth conditions
A collection of 311 rice accessions, including 160 wild and 151 cultivated varieties (Supplementary Table S1), was used in this study. Plants were grown in the field during the normal rice growing seasons with normal agricultural practices in Hainan province, China [28]. Five leaves were collected from each of five randomly chosen plants at the five-leaf stage as one sample, and two biological replicate samples of each accession were used for the metabolic and phenotypic genome-wide association studies.

Metabolite profiling
A liquid chromatography-electrospray ionization-tandem mass spectrometry (LC-ESI-MS/MS) system was used for the relative quantification of widely targeted metabolites in freeze-dried rice leaf samples. The freeze-dried leaf samples were crushed using a mixer mill (MM 400, Retsch) with a zirconia bead for 1.5 min at 30 Hz. Then 100 mg of dried powder was weighed and extracted overnight at 4 °C with 1.0 mL of 70% aqueous methanol containing 0.1 mg/L lidocaine (internal standard) for lipid-soluble or water-soluble metabolites [28][29][30]. Quantification of metabolites was carried out in scheduled multiple reaction monitoring (MRM) mode. The relative signal intensities of the metabolites were standardized by first dividing them by the intensity of the internal standard and then log2-transforming them to generate the final data matrix.
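The standardization described above (division by the internal standard signal, then log2 transformation) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name, the dictionary layout and the example intensities are hypothetical, and rejecting non-positive signals is an assumption, since the text does not say how zero intensities were handled.

```python
import math


def standardize_intensities(raw_signals, internal_standard_signal):
    """Return log2(signal / internal standard) for each metabolite."""
    if internal_standard_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    standardized = {}
    for metabolite, signal in raw_signals.items():
        if signal <= 0:
            # Assumption: non-positive signals are rejected, not imputed.
            raise ValueError(f"non-positive signal for {metabolite}")
        standardized[metabolite] = math.log2(signal / internal_standard_signal)
    return standardized


# Hypothetical MRM peak intensities for one leaf sample.
sample = {"cyanidin 3-O-rutinoside": 8000.0, "cyanidin O-syringic acid": 500.0}
result = standardize_intensities(sample, internal_standard_signal=1000.0)
```

On the log2 scale, a difference of 1 between two samples corresponds to a two-fold difference in relative intensity.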

Genome-wide association analysis
Only SNPs with minor allele frequency (MAF) ≥ 0.05 and with the minor allele present in ≥ 6 varieties in a (sub)population were used for GWAS. Population structure was modeled as a random effect in a linear mixed model (LMM) using the kinship (K) matrix. We performed GWAS using the LMM provided by the FaST-LMM program [31]. The genome-wide significance threshold (P_LMM) was set to 2.61e-07 (0.05/191,487) after correction for the number of effective independent SNPs [32]; the 191,487 effective independent SNPs used for the threshold calculation were obtained by using PLINK (https://doi.org/10.1086/519795, https://doi.org/10.1038/nprot.2010.116) to remove SNPs in linkage disequilibrium.
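The SNP filter and the significance threshold above reduce to simple arithmetic, sketched below. The genotype encoding (0, 1 or 2 copies of the alternative allele, None for missing calls), the function names and the toy genotype vectors are assumptions made for illustration. Reading "the number of varieties with a minor allele" as the number of varieties carrying at least one copy is also an assumption.

```python
def minor_allele_stats(genotypes):
    """Return (MAF, number of varieties carrying the minor allele)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0, 0
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    if alt_freq <= 0.5:
        # The alternative allele is the minor allele.
        carriers = sum(1 for g in called if g > 0)
    else:
        # The reference allele is the minor allele.
        carriers = sum(1 for g in called if g < 2)
    return maf, carriers


def passes_filter(genotypes, min_maf=0.05, min_carriers=6):
    """Apply the MAF >= 0.05 and >= 6 carrier criteria from the text."""
    maf, carriers = minor_allele_stats(genotypes)
    return maf >= min_maf and carriers >= min_carriers


def bonferroni_threshold(alpha, n_independent_tests):
    """Genome-wide threshold: alpha divided by the number of tests."""
    return alpha / n_independent_tests


threshold = bonferroni_threshold(0.05, 191_487)  # about 2.61e-07
keep = passes_filter([2] * 6 + [0] * 94)         # MAF 0.06, 6 carriers
drop = passes_filter([2] * 5 + [0] * 95)         # only 5 carriers
```

Dividing by the number of effective independent SNPs rather than by all tested SNPs gives a less conservative threshold, because SNPs in strong linkage disequilibrium do not count as separate tests.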

RNA extraction and sequencing
Based on leaf sheath color and the relative intensity of the three anthocyanidins, leaves of the 10 wild rice accessions with the highest anthocyanidin accumulation and purple leaf sheaths, and of the 10 cultivated rice accessions with the lowest anthocyanidin accumulation and green leaf sheaths, were collected to extract total RNA and construct mRNA libraries for sequencing. Total RNA was isolated using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocol. The cDNA libraries were amplified and sequenced on a BGISEQ-500 platform (BGI, Shenzhen, China). Raw reads containing adaptor sequences, low-quality sequences, and unknown nucleotides were filtered out to obtain clean reads using standard quality control (QC) techniques. Normalized expression levels were calculated as fragments per kilobase of transcript per million reads mapped (FPKM) using RNA-Seq by Expectation Maximization as previously described [33].
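For readers unfamiliar with the unit, the FPKM definition can be written out directly. This is only a sketch of the textbook formula: RNA-Seq by Expectation Maximization estimates expected fragment counts and uses effective transcript lengths, which this example does not attempt to reproduce. The function name and the input values are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million).
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
value = fpkm(500, 2_000, 10_000_000)
```

Because the count is scaled by both transcript length and library size, FPKM values can be compared between genes of different lengths and between samples of different sequencing depth.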

Statistical analysis
The metabolite data of the wild rice and cultivated rice accessions in this study are the means of three technical replicates from the LC-MS/MS analysis of one biological replicate. For each individual metabolite, the content was given as the average of the normalized metabolite levels in two replicates. Metabolite data were log2-transformed to improve normality and then normalized. The contents of the three anthocyanidins in the 311 rice accessions were used for hierarchical clustering analysis and visualization with the R package pheatmap version 1.0.12 (https://CRAN.R-project.org/package=pheatmap).

Overexpression and knockout of OsC1
The overexpression construct of OsC1 was generated by directionally inserting the full-length complementary DNA (cDNA) from wild rice accession DX386 into the vector pCAMBIA1300 under the control of the maize ubiquitin promoter. An sgRNA (5'-CTCCGGCCTAACATCAAGCG-3') was designed and ligated into the pYLCRISPR/Cas9Pubi-H vector to generate OsC1 knockout lines. The overexpression and knockout plasmids of OsC1 were introduced into Agrobacterium tumefaciens strain EHA105 to infect cultivated rice accession DX8 and wild rice accession DX386, respectively. A total of 22 and 16 transgenic positive plants (T0) were generated and named OE10350-1 to OE10350-22 and Δ10350-1 to Δ10350-16, respectively. After co-segregation tests, T1 progeny from three independent transgenic positive T0 plants for overexpression (OE10350-1 to OE10350-3) and knockout (Δ10350-1 to Δ10350-3) of OsC1 were used for further analysis.

Phenotype of transgenic lines
Three OsC1 overexpression lines with the control plant DX8, and three OsC1 knockout lines with the control plant DX386, were cultivated under normal conditions with the same treatments, and the leaf sheath color of the seedlings of all transgenic lines and controls was observed and photographed.

Quantitative real time polymerase chain reaction (qRT-PCR)
Total RNA was extracted from the leaf sheath of OsC1 overexpression plants and the control DX8 accession using an RNA isolation kit (Magen). cDNA was generated in 25 µL reaction mixtures containing 2 µg DNase I-treated RNA, 200 U M-MLV reverse transcriptase (Takara), 40 U recombinant RNase inhibitor (Takara) and 0.1 µM oligo(dT)18 primer. RT-PCR was performed in total volumes of 10 µL containing 5 µL SYBR premix EX Taq (Takara), 0.2 µL Rox Reference Dye II (Takara), 0.4 mM gene-specific primers and 0.5 µL cDNA on an ABI 7500 real-time PCR system (Applied Biosystems). The ubiquitin gene Os03g234200 was used as an internal reference.

DNA extraction and PCR identification
Genomic DNA was extracted from the leaf sheath of OsC1 mutant plants and the control DX386 accession using a DNA extraction kit (TIANGEN). PCR was performed in a total volume of 25 µL containing 12.5 µL Green Taq Mix (Vazyme), 1.0 µL DNA extract, and 1.0 µL OsC1-specific forward and reverse primers flanking the sgRNA target site. PCR fragments were cloned into the pMD18-T vector and sequenced.

Genome resequencing and haplotype analysis
Rice leaf samples of 160 wild rice and 151 cultivated rice accessions were collected to construct sequencing libraries according to the manufacturer's instructions, and qualified libraries were sequenced on the Illumina HiSeq platform. The quality of the raw sequencing data was assessed using FastQC (v0.11.9) [34]. Clean data were mapped onto the reference genome (MSU7) using BWA (0.7.17-r1188) with default parameters [35]. MarkDuplicates in Picard (2.12.1) was used to remove PCR duplicates and sort the BAM files. All single nucleotide polymorphisms (SNPs), insertions and deletions (InDels) were called using HaplotypeCaller of the Genome Analysis Toolkit (GATK, version 4.2.2.0) pipeline [36] and annotated using SnpEff (4.3 s) with the GFF3 file of the MSU7 reference genome [37]. Beagle (v5.2) was used to impute missing genetic variations in the GATK output [38]. Although accurate genomic phasing cannot be resolved by short-read sequencing, all genomic variations were still used for haplotyping OsC1 by joining SNPs and InDels, with heterozygous sites taken into account, to help illustrate the whole genetic diversity of common wild rice. Genomic variations of selected genes were extracted by position using BCFTools [39]. The haplotype network of OsC1 was constructed following our previously described method [40] with the Popart software [41].

Analysis of leaf sheath color and anthocyanidin accumulation in O. rufipogon and O. sativa
A significant difference in leaf sheath color between wild rice and cultivated rice is shown in Fig. 1A. In total, 146 of 160 (91.25%) O. rufipogon accessions showed purple leaf sheath, while 139 of 151 (92.05%) O. sativa accessions showed green leaf sheath (Supplementary Table S1). To investigate whether the accumulation patterns of anthocyanidins or other metabolites were responsible for purple leaf sheath in O. rufipogon and O. sativa, a widely targeted metabolomics method [28] based on liquid chromatography-electrospray ionization-tandem mass spectrometry (LC-ESI-MS/MS) was applied to comprehensively profile anthocyanidin levels in leaves at the five-leaf stage (termed 'leaf' hereafter) from the above rice accessions (Supplementary Table S1). Cyanidin-3-Galc, cyanidin 3-O-rutinoside and cyanidin O-syringic acid, established as colorant metabolites, accumulated to significantly higher levels in wild rice, by 11.84-fold (P = 3.14E-52), 11.11-fold (P = 7.97E-20) and 4.60-fold (P = 2.47E-08), respectively, compared with cultivated rice (Fig. 1B and Table 1). Hierarchical clustering analysis (HCA) visualized the normalized accumulation pattern and showed the differences in relative content of these three metabolites between the two Oryza species (Supplementary Fig. S1). Correlation analyses showed positive correlations (Pearson correlation, R = 0.82, 0.55, and 0.41, respectively; Student's t-test P-value < 0.0001) between each of the three anthocyanidin metabolites and purple leaf sheath (Table 1). Compared with O. sativa, which showed green leaf sheath with little accumulation of anthocyanidins, O. rufipogon showed purple leaf sheath with significantly higher accumulation of anthocyanidins.
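The two summary statistics used in this paragraph, a between-group fold change and a Pearson correlation between metabolite intensity and leaf sheath color, can be sketched as below. Coding purple as 1 and green as 0 is an assumption about how the color phenotype entered the correlation, the fold change is taken as a plain ratio of group means (the text does not state the exact calculation), and all values are invented toy data rather than measurements from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def fold_change(group_a, group_b):
    """Ratio of the mean of group_a to the mean of group_b."""
    return (sum(group_a) / len(group_a)) / (sum(group_b) / len(group_b))


# Toy data: leaf sheath color (1 = purple, 0 = green) and the relative
# intensity of one anthocyanidin in six hypothetical accessions.
is_purple = [1, 1, 1, 0, 0, 0]
intensity = [12.0, 10.0, 11.0, 1.0, 1.5, 0.5]

r = pearson_r(is_purple, intensity)
fc = fold_change(intensity[:3], intensity[3:])
```

With a binary phenotype, the Pearson coefficient is the point-biserial correlation, so a value close to 1 means that the purple accessions are consistently the high-intensity ones.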
To further screen for the candidate gene, 10 wild rice accessions with purple leaf sheath and relatively high intensity of the three anthocyanidins, as well as 10 cultivated rice varieties with green leaf sheath and low accumulation of the three anthocyanidins, were used for RNA-seq and transcriptome analysis. According to the rice genome annotation, after excluding transposons and genes with no expression in any selected sample, the remaining 25 genes (Supplementary Table S2) were located in the region shown in Fig. 2B. As shown in Fig. 2C, 11 of the 25 candidate genes had higher FPKM values in O. rufipogon than in O. sativa. Among these 11 candidate genes, a well-known gene, LOC_Os06g10350, which is annotated as a MYB transcription factor and named OsC1, has been reported to be responsible for the accumulation of anthocyanidins and the color of vegetative tissues in cultivated rice [26].
In addition, three anthocyanidin biosynthesis-related genes, OsF3H, OsDFR, and OsANS, showed the same expression tendency as OsC1, with significantly (P-value = 0.01209, 0.000545, 0.000917 and 0.000751, respectively) higher FPKM values in wild rice accessions than in cultivated rice varieties (Fig. 2D). This result suggests that anthocyanidin biosynthesis in O. rufipogon may be regulated through these three genes downstream of OsC1.

Functional characterization of OsC1 in anthocyanidins biosynthesis in O. rufipogon and O. sativa
To investigate the native function of OsC1 in O. rufipogon and O. sativa, we generated three transgenic lines each in the DX386 (common wild rice) and DX8 (cultivated rice) backgrounds. Three OsC1 knockout lines (ΔOsC1-1, ΔOsC1-2, and ΔOsC1-3) and three overexpression lines (OEOsC1-1, OEOsC1-2, and OEOsC1-3) were verified by genome sequencing and qRT-PCR, respectively. As shown in Fig. 3A and B, compared with the wild-type O. rufipogon accession DX386, which has purple leaf sheath and a functional OsC1 coding region, the three OsC1 knockout lines showed green leaf sheath and a homozygous single 'A' base-pair insertion at position 69 of the second exon of OsC1. Conversely, compared with the control O. sativa accession DX8, which has green leaf sheath, overexpression of OsC1 resulted in purple leaf sheath in the three overexpression lines (Fig. 3C) with significantly higher expression levels (Fig. 3D; 7284-fold, P-value = 0.022; 2226-fold, P-value = 0.034; and 17,251-fold, P-value = 0.032, respectively).

Haplotype analysis of OsC1 with anthocyanidins intensity and color variations in natural wild rice and cultivated rice germplasm
Since purple leaf sheath and relatively high accumulation of the three anthocyanidins were regulated by OsC1, we tested whether this model of color production and metabolite accumulation holds generally across natural wild rice and cultivated rice germplasm by analyzing OsC1 haplotypes. A total of 25 genomic sequence variations of OsC1 were comprehensively analyzed in all 311 rice accessions, together with the leaf sheath color phenotype and the average relative intensity of the three anthocyanidins (Supplementary Table S3). Nine haplotypes (Hap1-6, 18, 19, 21) each contained at least 2 rice accessions and together contained 295 of the 311 rice accessions (Fig. 4A).
As shown in Supplementary Table S3 and Fig. 4A, in-depth analysis of OsC1 revealed eighteen functional haplotypes (Hap1-18) with relatively high intensity of the three anthocyanidins and seven non-functional haplotypes (Hap19-25) with nearly no accumulation of the three anthocyanidins. Hap1 and Hap19 were the major functional and non-functional haplotypes, containing 107 (60.80% of anthocyanidin-abundant rice) and 128 (94.81% of anthocyanidin-absent rice) rice accessions, respectively, and differed by only one variation (a 10 bp deletion, 'ACTGGAACAG') at positions 881 nt to 890 nt of the coding sequence of OsC1. All rice accessions in Hap19, including 6 wild rice and 122 cultivated rice varieties, consistently showed green leaf sheath without accumulation of the three anthocyanidins. In Hap1, 95.33% of rice accessions, including 101 wild rice and 1 cultivated rice varieties, consistently showed purple leaf sheath with relatively high accumulation of the three anthocyanidins, although 3 wild rice and 2 cultivated rice varieties showed an unmatched phenotype (green leaf sheath with high accumulation of the three anthocyanidins). This result demonstrated that this variation (the 10 bp deletion) was the major determinant of color pigmentation and anthocyanidin accumulation. Hap2 differed from Hap1 only by a heterozygous genotype 'T/TACTGGAACAG' at variation location 881 nt; 18 of the 21 rice accessions in Hap2 also showed purple leaf sheath with relatively high intensity of the three anthocyanidins, whereas the other 3 (1 wild rice and 2 cultivated rice varieties) showed green leaf sheath. Hap3-17 comprised 27 wild rice accessions that differed from Hap1 by various heterozygous genotype variations at different locations in the DNA sequence of OsC1. In addition, we found that 6 rice materials (Hap21-25) without the 10 bp deletion showed green leaf sheath with little accumulation of the three anthocyanidins. A mutation from 'T' to 'A' (Hap21, missense variant, Fig. 4B), a 'T' insertion (Hap22, frameshift variant, Fig. 4C), and a mutation from 'T' to 'C' (Hap24, missense variant, Fig. 4D) may be new haplotypes of OsC1 for regulating purple pigmentation and anthocyanidin accumulation in rice leaf sheath.
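Two pieces of arithmetic in the haplotype analysis can be checked with a short sketch: the proportion of accessions whose phenotype matches the haplotype, and whether an insertion or deletion inside the coding sequence shifts the reading frame (its length is not a multiple of 3). The function names are hypothetical. The counts and sequences are taken from the text above, and the frame rule is general genetics rather than an analysis performed by the authors.

```python
def percent(part, whole):
    """Percentage of part in whole, rounded to two decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 2)


def shifts_reading_frame(indel_sequence):
    """True if an indel of this length within a coding sequence is a frameshift."""
    return len(indel_sequence) % 3 != 0


# Hap1: 101 wild + 1 cultivated accessions with purple leaf sheath,
# out of 107 accessions carrying this haplotype.
hap1_consistent = percent(101 + 1, 107)

# The 10 bp deletion that separates Hap19 from Hap1.
deletion_is_frameshift = shifts_reading_frame("ACTGGAACAG")

# The single 'T' insertion of Hap22, annotated as a frameshift variant.
insertion_is_frameshift = shifts_reading_frame("T")
```

A 10 bp deletion removes three complete codons plus one extra base, so every codon downstream of position 881 is read in a different frame, which is consistent with Hap19 behaving as a non-functional haplotype.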

Discussion
Most modern cultivated rice (O. sativa) varieties have green vegetative tissues (leaf, leaf sheath and leaf margin) with little accumulation of anthocyanidins. In contrast, most wild rice (O. rufipogon) plants, the ancestor of cultivated rice, are rich in anthocyanidins and show various colors in different tissues [42]. In our research, 146 of 160 (91.25%) wild rice plants showed purple leaf sheath with significantly higher accumulation of cyanidin-3-Galc, cyanidin 3-O-rutinoside and cyanidin O-syringic acid than the cultivated rice accessions, most of which (125 of 151, 82.78%) showed green leaf sheath with significantly lower accumulation of the three anthocyanidins. The different leaf sheath colors and the significantly different accumulation of the three anthocyanidins between O. rufipogon and O. sativa indicate that the purple leaf sheath trait was lost through artificial selection during breeding, along with reduced accumulation of anthocyanidins, which agrees with the results and conclusions of previous studies [19,25,27].
Although the development of second-generation sequencing and the application of genome-wide association studies have rapidly promoted the functional characterization of genes associated with complex traits in rice [43][44][45], linkage disequilibrium in the genome and unbalanced population structure often result in only preliminary mapping [25] and in false associations between the target phenotype and putative genes [46,47]. Accurate phenotyping is an important factor that determines the efficiency of GWAS. In this study, a well-known gene regulating leaf sheath color, OsC1, was accurately co-located and mutually confirmed by combining mGWAS and pGWAS. Identification and quantification of metabolites through widely targeted metabolite profiling, as a repeatable and verifiable indicator, enhanced the accuracy of mGWAS. Additionally, the application of multi-omics approaches, such as mGWAS and pGWAS, could improve the efficiency of gene mapping through co-location of metabolites and phenotypes, and lay the foundation for analyzing the genetic relationship between metabolites and phenotypes. Pigmentation, attributed to the accumulation of anthocyanidins, occurs both in rice leaf sheath and in grains. Although the key genes and regulation pathway in rice grains differ from those in rice leaf sheath, the efficient and accurate approach applied in rice leaf sheath could serve as a reference for rice grains. For example, based on whole-genome resequencing, the color phenotype of rice grains could be identified and classified for pGWAS, anthocyanidin content could be measured for mGWAS, and transgenic studies could then focus on the loci co-located by pGWAS and mGWAS.
In this study, we used mGWAS and pGWAS to rapidly and accurately identify OsC1 as a regulator of the biosynthesis of three anthocyanidins in a natural population (160 wild rice accessions and 151 cultivated rice varieties), whereas Zheng et al. [19] used pGWAS of anthocyanin content in a worldwide collection of 533 cultivated rice accessions. OsC1 was initially identified by homology mapping with maize [26][48][49][50][51]. Haplotype analysis showed that the major variation, the presence or absence of a 10 bp deletion at positions 881 nt to 890 nt, which was also found by Zheng et al. [19] and Sun et al. [27], could explain the difference in leaf sheath color and anthocyanidin intensity in 123 (81.46% of 151) cultivated rice varieties and 146 (91.25% of 160) wild rice accessions in this study. It has been reported that three kinds of indels occur in the non-functional haplotypes: the 10 bp deletion occurred in almost all indica varieties, whereas the -TC and -GAG deletions mainly occurred in temperate japonica accessions [27]. We found the same major variation as reported, because our cultivated rice varieties are indica varieties from South China. Two other common variations were found both by Zheng et al. (Hap4 and Hap5 in their article) and in our study (Hap7 and Hap18). In addition, we also found three types of variations (Fig. 4B), namely a 'T' to 'A' mutation (Hap21), a single 'T' insertion (Hap22), and a 'T' to 'C' mutation (Hap24), as new haplotypes of OsC1 for regulating purple pigmentation and anthocyanidin accumulation in rice leaf sheath. RiceNavi, a system designed for rice molecular breeding, provides a highly efficient platform for applying genomic knowledge in rice breeding [52]. Artificial selection for the newly identified haplotypes of OsC1 in breeding could be assisted by RiceNavi, which would facilitate the selection of rice varieties lacking anthocyanidins. Anthocyanidin biosynthesis and accumulation in rice may also be regulated by genes and pathways other than OsC1, because 2.5% of wild rice and 9.27% of cultivated rice accessions still showed a leaf sheath color or relative anthocyanidin intensity that contradicted the prediction from the major variation of OsC1.

Conclusion
Metabolome analysis revealed that significantly higher accumulation of anthocyanidins, which are widely known to regulate color in many plants, was responsible for the difference in leaf sheath color between green in O. sativa and purple in O. rufipogon. The combination of phenotypic and metabolic genome-wide association studies rapidly and accurately co-located a well-known MYB transcription factor encoding gene, OsC1, which has been reported to be responsible for coloration in various rice tissues. Functional characterization of OsC1 in our study not only revealed that OsC1 regulates leaf sheath color in both O. rufipogon and O. sativa, but also verified the high accuracy and efficiency of multi-omics approaches for identifying candidate genes related to traits. The present study provides additional rice germplasm with high anthocyanidin content and new potential variations of OsC1, which could benefit rice breeding and research on the molecular mechanism of anthocyanidin accumulation.

Fig. 2 Mapping of OsC1 using GWAS and expression analysis of anthocyanidin biosynthesis-related genes. (A) Manhattan plots for GWAS of 3 anthocyanidin traits and leaf sheath color across the 12 rice chromosomes. The strength of association is indicated as the negative logarithm of the P value for the linear mixed model. All metabolite-/phenotype-SNP associations with P value below 2.61E-07 (horizontal dashed line) are plotted against the genome location in intervals of 1 Mb. (B) Regional Manhattan plot for 3 anthocyanidin traits and the leaf sheath color trait in the 5.15 Mb to 5.50 Mb region on chromosome 6. (C) Heatmap of 10 wild and cultivated rice accessions by normalized log2 of relative content of the three anthocyanidins and FPKMs of 25 candidate genes in the region located commonly by mGWAS and pGWAS. Candidate gene OsC1 (LOC_Os06g10350) is marked in red font. (D) Expression analysis of OsC1 and anthocyanidin biosynthesis-related genes (OsF3H, OsDFR and OsANS) in Oryza rufipogon and Oryza sativa. '***' and '*' indicate p-value < 0.001 and 0.05 for t-test, respectively

Table 1
Comparison of relative intensity of three anthocyanidins in Oryza rufipogon and Oryza