A rich TILLING resource for studying gene function in Brassica rapa
© Stephenson et al. 2010
Received: 6 November 2009
Accepted: 9 April 2010
Published: 9 April 2010
The Brassicaceae family includes the model plant Arabidopsis thaliana as well as a number of agronomically important species such as oilseed crops (in particular Brassica napus, B. juncea and B. rapa) and vegetables (e.g. B. rapa and B. oleracea).
Separated by only 10-20 million years, Brassica species and Arabidopsis thaliana are closely related, and it is expected that knowledge obtained relating to Arabidopsis growth and development can be translated into Brassicas for crop improvement. Moreover, certain aspects of plant development are sufficiently different between Brassica and Arabidopsis to warrant studies being carried out directly in the crop species. However, mutating individual genes in the amphidiploid Brassicas such as B. napus and B. juncea may not give rise to the expected phenotypes, as the genomes of these species can contain up to six orthologues per single-copy Arabidopsis gene. In order to elucidate and possibly exploit the function of redundant genes for oilseed rape crop improvement, it may therefore be more efficient to study the effects in one of the diploid Brassica species such as B. rapa. In addition, the ongoing sequencing of the B. rapa genome makes this species a highly attractive model for Brassica research and genetic resource development.
Seeds from the diploid Brassica A genome species, B. rapa were treated with ethyl methane sulfonate (EMS) to produce a TILLING (Targeting Induced Local Lesions In Genomes) population for reverse genetics studies. We used the B. rapa genotype, R-o-18, which has a similar developmental ontogeny to an oilseed rape crop. Hence this resource is expected to be well suited for studying traits with relevance to yield and quality of oilseed rape. DNA was isolated from a total of 9,216 M2 plants and pooled to form the basis of the TILLING platform. Analysis of six genes revealed a high level of mutations with a density of about one per 60 kb. This analysis also demonstrated that screening a 1 kb amplicon in just one third of the population (3072 M2 plants) will provide an average of 68 mutations and a 97% probability of obtaining a stop-codon mutation resulting in a truncated protein. We furthermore calculated that each plant contains on average ~10,000 mutations and due to the large number of plants, it is predicted that mutations in approximately half of the GC base pairs in the genome exist within this population.
We have developed the first EMS TILLING resource in the diploid Brassica species, B. rapa. The mutation density in this population is ~1 per 60 kb, which makes it the most densely mutated diploid organism for which a TILLING population has been published. This resource is publicly available through the RevGenUK reverse genetics platform http://revgenuk.jic.ac.uk.
The advent of high-throughput sequencing technologies, vast genomic databases and increasingly powerful genetic tools has had a huge impact on the development of our understanding of the biochemical and developmental networks regulating the multitude of genetic and physiological processes in plants. Insight from studies in the model species, Arabidopsis thaliana, is increasingly facilitating our ability to elucidate and beneficially exploit key regulatory processes in relevant crop species. The last decade has seen the development of a number of large-scale 'Reverse Genetics' tools to study the effects of mutations in genes for which the sequence is known. These tools include T-DNA insertion, TILLING (Targeting Induced Local Lesions In Genomes) and RNAi technologies [4–6].
TILLING is a reverse genetics tool, which was originally developed for Arabidopsis and has subsequently been successfully employed in other plant species as well as animal species (e.g. [8–17]). For plants, large mutant populations are generated by the treatment of seed or pollen with a chemical mutagen - most commonly ethyl methane sulfonate (EMS) - that can induce point mutations at a very high density, sufficient to establish a series of allelic mutations in all genes. Amplified sequences are then screened using established high throughput SNP discovery methods.
Brassica napus (oilseed rape) is an amphidiploid species containing two diploid genomes originating from a cross between the diploid Brassica species, B. rapa and B. oleracea. Whereas TILLING populations have been described for B. napus and B. oleracea [13, 15], such a resource has not yet been reported for B. rapa.
EMS is a mutagenic, teratogenic and possibly carcinogenic organic compound and it is the mutagen of choice for the development of plant TILLING populations [7–15]. It produces random mutations in genetic material by nucleotide substitution; primarily by alkylation on the O6 position of guanine leading to GC→AT transition changes.
Here we describe the development of a TILLING population in B. rapa genotype R-o-18. The effect of EMS on plant growth and fertility in the M1 and M2 generations is described. Based on the screening of six genes located on different chromosomes, we calculated a mutation density of ~1 per 60 kb and a 97% probability of identifying a stop-codon mutation in a standard screen of 3,072 M2 plants. This resource therefore comprises an attractive tool for researchers interested in plant development, especially with regard to phenotypic traits related to improvement of oilseed rape and other crops.
EMS is a highly volatile and unstable compound and we have previously observed variation in the activity between batches. In order to avoid having to repeat the optimisation experiment with a different EMS batch after the titration, we used the same fresh batch of EMS to treat 5,000 seeds for each concentration in the 0.2-0.5% range (expected to be most relevant for producing the population based on previous experience) and 200 seeds at the concentrations outside this range.
After incubation and washes, the seeds were sown in soil and kept at 7°C for six days. Following a further six days in the glasshouse, germination frequency was established. Germination was hardly affected by treatments up to 0.3%. However, at 0.4% a marked decrease was observed and at 1% EMS none of the seeds germinated, indicating that the EMS treatment had been effective.
For Medicago truncatula mutagenesis, it was previously reported that an EMS concentration at the point where germination begins to become compromised is optimal for obtaining a large mutation load while maintaining vigorous and fertile plants. For the B. rapa population we therefore decided to use seedlings derived from the 0.3% and 0.4% EMS treatments for population development, and the resulting two populations will subsequently be referred to as the '0.3% population' and the '0.4% population'.
Although only seedlings from the 0.3% and 0.4% EMS-treated seeds were used to make up the M1 generation of the mutant population, a subset of the germinated seedlings from the remaining concentrations in the titration experiment was allowed to grow on. As shown in Figure 2b and 2c, the higher concentrations of EMS also inhibited seedling establishment as well as plant growth.
When treating the seeds with the mutagen, a subset of cells in the shoot apical meristem of the embryo will carry the mutations on to the next generation. This provides an opportunity for multiple cell lineages to be subject to a different spectrum of mutations, and it is therefore possible for gametes arising from different floral primordia to carry a distinct subset of mutant alleles.
Since B. rapa plants are larger, generate less seed and have a longer life cycle relative to, for example, Arabidopsis thaliana ecotypes Col-0 and L-er (5-6 months versus 6 weeks from seed to seed), it is desirable to minimise the number of plants necessary to build a useful resource. Under optimal conditions, we estimated that a sufficient number of the mutations would be recovered by using material from two M2 plants from each of the ~5,000 M1 plants assuming a similar number of progenitor cells as in Arabidopsis. Ten seeds from each M1 individual were planted to increase the probability that at least two would germinate.
During growth of the M2 population a number of phenotypes were observed; the percentage of M2 families with albinos was 4.5% for both the 0.3% and 0.4% populations and we observed a plethora of morphological defects at developmental stages, ranging from seedlings to the fruit stage. Galleries of selected phenotypes are shown in Additional files 1, 2, 3 and 4.
For most M2 families, 5-10 seeds germinated and in these cases we always took leaf tissue from the two most healthy-looking individuals and discarded the rest. In this way, we obtained vigorous and mostly fertile plants, while expecting to maintain the high mutation level in a heterozygous state. The plants were subsequently bagged to prevent pollination between plants and M3 seeds were harvested.
Upon harvesting, we recorded the fertility and found that 9.6% of the M2 plants from the 0.3% population and 27.9% of the M2 plants from the 0.4% population failed to set seeds suggesting a higher mutation load in the 0.4% population.
DNA was isolated from the tissue, the concentration of DNA accurately determined and stocks normalised to ensure that DNA pools were balanced such that all individual lines were equally represented within the pools.
We used a standard one-dimensional pooling strategy where each M2 line is represented only once in a single pool, with each pool comprising DNA from eight M2 lines. Such a design is ideally suited to high throughput mutation detection. Specifically, eight-pools of 6,912 M2 plants originating from the 0.3% population were distributed in nine 96-well plates (DNA from 768 M2 plants per plate), whereas DNA pools from 2,304 M2 plants from the 0.4% population were distributed in three plates.
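The plate arithmetic for this pooling design can be checked with a short sketch. It assumes, as stated above, eight M2 lines per pool and 96 pools per plate. The function names and the sequential assignment of lines to wells are illustrative assumptions, not a description of the actual platform layout.

```python
def pool_layout(n_plants, pool_size=8, wells_per_plate=96):
    """Return (number of pools, number of plates) for one-dimensional pooling."""
    if n_plants % pool_size:
        raise ValueError("plants do not divide evenly into pools")
    n_pools = n_plants // pool_size
    n_plates = -(-n_pools // wells_per_plate)  # ceiling division
    return n_pools, n_plates


def locate(line_index, pool_size=8, wells_per_plate=96):
    """Map a 0-based M2 line index to (plate, well, slot in pool), all 0-based.

    Hypothetical sequential layout: lines fill a pool, pools fill a plate.
    """
    pool, slot = divmod(line_index, pool_size)
    plate, well = divmod(pool, wells_per_plate)
    return plate, well, slot


print(pool_layout(6912))  # 0.3% population
print(pool_layout(2304))  # 0.4% population
print(96 * 8)             # M2 plants represented on one full plate
```

Because each line appears in exactly one pool, a positive pool narrows the search to eight candidate M2 lines, which are then sequenced individually.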
Mutations in genes of interest are detected by Cel1 digestion at a mismatched base pair [22, 7]. For the B. rapa population described here, identification of digested fragments was carried out on an ABI3730 sequencer using fragment lengths of ~1 kb and a previously established protocol. Individual M2 lines from pools identified as containing a mutant allele were subsequently sequenced in order to confirm the presence of the mutation, reveal its identity and identify the M2 line carrying the mutation. Since the Cel1-digested product is verified with labelled primers from both ends, the level of false positives is essentially zero.
When initiating a TILLING screen in a particular gene of interest, it is useful to analyse the coding region with reference to the genetic code. Only a subset of all combinations of amino acid changes are achievable when using EMS as a mutagen. Firstly, eight out of the 64 codons (12.5%) are unaffected by EMS-induced mutations because they do not contain guanine or cytosine. Secondly, out of the 96 positions that can be mutated (G→A or C→T) within the genetic code, 33 would not lead to an amino acid sequence change (silent mutations). Of the remaining 63 mutable positions of the genetic code, 58 would give rise to 26 amino acid substitutions (mis-sense mutations) and 5 would result in stop codons (nonsense mutations). Nine of the possible amino acid changes (corresponding to mutations at 21 out of 58 sites) result in chemically similar amino acids being incorporated which in many cases would be less likely to alter the function of the encoded protein significantly.
Mutations leading to premature stop codons are often desirable as they are expected to provide a dramatic reduction in gene function, especially when proximal to the 5' end of the open reading frame. However, out of the 96 mutable positions, there are only five ways in which a stop codon can be obtained. These comprise the two glutamine codons (CAA and CAG), one of the six arginine codons (the C of CGA) and the tryptophan codon (TGG) for which G→A mutations at either position will generate a stop codon. The genetic code therefore has considerable robustness built-in, which minimises the potential biological effect of point mutations. It may therefore be beneficial to target the analysis of the gene of interest to a region in the sequence where it is possible to realise fully the potential of mutations that may reduce or abolish the activity of the encoded protein. To assist us in this analysis, we use the software package CODDLE , which is a programme designed to identify areas within the gene with highest probability of affecting gene function when mutated by EMS.
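The codon counts quoted above can be reproduced by enumerating the standard genetic code. The sketch below is an illustration only and is not part of CODDLE. It assumes that EMS causes only G→A and C→T changes on the coding strand, and it counts stop-to-stop changes (TAG→TAA, TGA→TAA) as silent.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
EMS = {"G": "A", "C": "T"}  # assumed EMS transition spectrum


def classify_ems_changes():
    """Classify every single EMS-type change across the 64 codons."""
    counts = {"silent": 0, "missense": 0, "nonsense": 0}
    substitutions = set()  # distinct (from, to) amino acid pairs
    for codon, aa in CODE.items():
        for pos, base in enumerate(codon):
            if base not in EMS:
                continue
            mutant = codon[:pos] + EMS[base] + codon[pos + 1:]
            new_aa = CODE[mutant]
            if new_aa == aa:
                counts["silent"] += 1
            elif new_aa == "*":
                counts["nonsense"] += 1
            else:
                counts["missense"] += 1
                substitutions.add((aa, new_aa))
    return counts, substitutions


# Codons with no G or C cannot be hit by EMS under this assumption.
unaffected = [c for c in CODE if not set(c) & set(EMS)]
counts, substitutions = classify_ems_changes()
print(len(unaffected), sum(counts.values()), counts, len(substitutions))
```

Under these assumptions the enumeration gives 8 unaffected codons and 96 mutable positions, split into 33 silent, 58 mis-sense (26 distinct substitutions) and 5 nonsense changes, in agreement with the figures in the text.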
Selecting a suitable amplicon for mutation detection is a prerequisite for establishing efficient and successful screens, and several considerations need to be taken into account: 1) In the large majority of cases, it will be advantageous to include as much coding sequence as possible and avoid intron or intergenic sequence. 2) Repetitive sequence may cause 'Taq slippage' which could delete or insert extra repeats. This will lead to artifactual mismatches between wild type and 'slippage' strands, which may become substrates for the Cel1 nuclease. 3) As mentioned above, only a subset of codon-changes is likely to have dramatic effects on gene activity. It is therefore advisable to identify the region with most potential for generating stop codons and significant amino acid changes. 4) Finally, it is important to test for paralogue-specificity as up to three copies of each single-copy Arabidopsis gene may be present in the diploid Brassica genome [23, 24]. When designing primers for the region of interest, it is therefore essential to verify that this primer set only amplifies the expected sequence before initiating the TILLING screen.
We identified six genes that are located on different B. rapa chromosomes. These were expected to be orthologues of the Arabidopsis REPLUMLESS (RPL; At5g02030), INDEHISCENT (IND; At4g00120) and METHYLTRANSFERASE1 (MET1; At5g49160) genes and were named BraA.RPL.a, BraA.RPL.b, BraA.RPL.c, BraA.IND.a, BraA.MET1.a and BraA.MET1.b, respectively, according to the accepted gene nomenclature system for the Brassica genus. These B. rapa gene sequences were isolated using the Arabidopsis thaliana Integrated Database http://atidb.org as described in the Methods section.
Table: Results from eight TILLING assays in the B. rapa mutant population. Columns: Screened M2 plants; Est. mutations per M2 plant*; M2 plants in population; Expected mutations kb^-1 ...
This density is the highest reported for a TILLING population in any plant or animal diploid species. Only in populations of tetraploid and hexaploid wheat [10, 11] and the amphidiploid B. napus have higher mutation densities been obtained.
Distribution of mutation classes.
Using 500 Mbp as an approximate genome size for B. rapa, it was deduced by extrapolation that each plant contains close to 10,000 mutations (Table 1). One might expect this level of mutations to be lethal. However, as only 11% of the genome is coding sequence, we expect no more than 1,100 point mutations within exons, of which about 700 will have the potential to cause amino acid changes and approximately 50 could introduce new stop codons. With the high heterozygote:homozygote ratio (Table 2) and a high level of redundancy, it may therefore not be totally unexpected that it is possible to generate a large number of M2 plants with such a high mutation density.
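The order of magnitude of these per-plant estimates can be reproduced with a back-of-the-envelope sketch. It assumes that exonic mutations are spread evenly over the 96 EMS-mutable codon positions described earlier (58 mis-sense and 5 nonsense), ignoring codon usage and local GC content. The function name and this uniformity assumption are ours; the authors' own calculation may differ.

```python
GC_SITES = 96        # EMS-mutable positions in the genetic code (from the text)
MISSENSE_SITES = 58  # positions giving an amino acid substitution
NONSENSE_SITES = 5   # positions giving a stop codon


def exon_mutation_budget(mutations_per_plant=10_000, coding_fraction=0.11):
    """Rough split of one plant's mutation load into exonic classes."""
    exonic = mutations_per_plant * coding_fraction
    return {
        "exonic": exonic,
        "missense": exonic * MISSENSE_SITES / GC_SITES,
        "nonsense": exonic * NONSENSE_SITES / GC_SITES,
    }


budget = exon_mutation_budget()
for name, value in budget.items():
    print(f"{name}: {value:.0f}")
```

With the quoted inputs this gives about 1,100 exonic, 665 mis-sense and 57 nonsense mutations per plant, of the same order as the rounded figures of 700 and 50 given above.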
Mutations in the BraA.RPL and the BraA.IND.a genes were asymmetrically distributed along the amplicons (Figure 4). This likely reflects the location of the 5' primers in the BraA.RPL genes approximately 300 bp upstream of the start codon in an AT-rich non-coding region, necessary to ensure locus specificity. At the 3' end, however, the primer was positioned within the first exon in an area of more mutable sites (higher GC content).
Bearing these numbers in mind, it is therefore surprising that no nonsense mutations were detected within the first 3,072 M2 plants for two out of the six amplicons tested here (BraA.RPL.b and BraA.RPL.c in Table 1). Screening an additional 1,536 M2 plants also did not result in any stop codon 'hits', whereas two nonsense mutations were detected for BraA.RPL.b after screening the 2,304 plants from the 0.4% population (Table 2).
It is unlikely that this discrepancy is due to lethality of nonsense mutations in these genes. Firstly, we have identified three closely related paralogues, suggesting that these genes may function redundantly. Secondly, mis-sense mutations in each individual paralogue do not have any detectable effect on plant development (data not shown).
These observations suggest that the lack of nonsense mutation detection in these amplicons is due to limitations of the detection method, which may be related to features in the DNA sequence.
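For context, the 97% probability of recovering a stop codon quoted earlier can be approximated with a simple independence model. The sketch assumes that each of the ~68 mutations expected in a 1 kb amplicon across 3,072 M2 plants has a 5 in 96 chance of being nonsense, the fraction derived from the genetic code. This is our simplification for illustration and not necessarily how the published figure was calculated.

```python
def p_at_least_one_nonsense(n_mutations, p_nonsense=5 / 96):
    """Probability that at least one of n independent mutations is nonsense."""
    return 1.0 - (1.0 - p_nonsense) ** n_mutations


p_hit = p_at_least_one_nonsense(68)
p_miss = 1.0 - p_hit
print(f"P(at least one stop codon) = {p_hit:.3f}")
print(f"P(no stop codon)           = {p_miss:.3f}")
```

Under this model, an amplicon with no nonsense hit is expected only a few percent of the time, which is why two such amplicons out of six stand out.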
Although B. rapa is diploid, it is still a paleopolyploid having undergone an ancient triplication event [23, 24, 32, 33]. Therefore one can expect to find up to three paralogous genes of each single-copy Arabidopsis gene in the B. rapa genome. Subfunctionalisation may have evolved in some cases. However, a high level of functional redundancy among the paralogues probably exists, and it may therefore be necessary to combine mutants to observe the desired effect. It is likely that this redundancy between paralogues allows B. rapa to harbour such a high mutation density compared to other diploid species. Another factor may be the high level of heterozygosity obtained especially in the 0.3% population, which to our knowledge is higher than for any other TILLING populations reported. The strategy of selecting healthy-looking M2 plants for the population will have contributed to this. One might predict that a significant number of M3 plants would be severely impaired in development due to a high level of homozygosity. In these cases it is recommended to carry out a backcross to wild type in the M2 generation to remove part of the background.
In accordance with the redundancy argument, we did not observe any developmental defects resulting from mutations in individual BraA.RPL paralogues. In contrast, mis-sense and nonsense mutations in the BraA.IND.a gene, which is a single-copy gene in B. rapa, result in indehiscent phenotypes as in Arabidopsis (T. Girin, P. Stephenson, C. M. P. Goldsack, S. Perez, N. Pires, P. A. Sparrow, T. A. Wood and L. Østergaard - manuscript accepted). B. rapa therefore appears to provide a highly suitable compromise between being able to accommodate a high mutation density, whilst still presenting visible phenotypes (see also Additional Files 1, 2, 3 and 4). This is in contrast to reports in e.g. hexaploid wheat where a high mutation density is achieved, but visible phenotypes are rare.
A classical backcrossing programme to remove the undesirable background mutation load is a prolonged procedure, which is expensive in both time and resources. This is especially true where a genus such as Brassica produces large plants with a relatively long generation time. Each backcross generation reduces the mutation load by 50%; so reducing the number of mutations from 10,000 to ten will take ten generations of backcrossing and genotyping (10,000 × 0.5^10 ≈ 10).
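The backcrossing estimate can be checked with a short loop. This is a minimal sketch assuming that each backcross removes exactly half of the background mutations on average and ignoring linkage to the mutation of interest. The function name is illustrative.

```python
def backcrosses_needed(initial_load, target_load):
    """Count halvings until the expected background load is at or below target."""
    load = float(initial_load)
    generations = 0
    while load > target_load:
        load *= 0.5
        generations += 1
    return generations, load


generations, remaining = backcrosses_needed(10_000, 10)
print(generations, remaining)
```

Starting from 10,000 mutations, ten halvings leave just under ten expected background mutations.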
As an alternative to embarking on such a programme, we instead assess the correlation between mutation and phenotype by comparing homozygous recessive mutants to heterozygotes and homozygous wild-type sibling plants in a segregating population for which the background mutations are the same. A 100% correlation between homozygous mutants and phenotype would lend strong support to the hypothesis that the mutation in the 'TILLed' gene is responsible for the phenotype. Moreover, we also aim to obtain allelic series of independent mutations in the same gene, where related phenotypic variants would strongly associate the phenotype with the gene, thereby avoiding the necessity for a lengthy backcrossing scheme. Detailed description on both of these approaches is provided in .
Here we describe the development of a TILLING population in the Brassica rapa genotype R-o-18 suitable for reverse genetics studies. The high mutation density in this diploid species makes it an attractive genetic system for studying plant development and especially for obtaining mutations contributing to phenotypic traits related to crop improvement of oilseed rape. With a complete B. rapa genome sequence expected in the very near future, this resource will have particular appeal since time-consuming gene isolation and design of paralogue-specific primers will become a relatively straightforward informatic exercise.
This population is publicly accessible and available via the RevGenUK reverse genetics platform http://revgenuk.jic.ac.uk.
Seeds of the Brassica rapa genotype, R-o-18 (age 6-12 months since harvesting) were used in this work. For the negative control and 0.1%, 0.6%, 0.8% and 1% concentrations 200 seeds were treated in 10 ml solution in 50 ml Falcon tubes. For the 0.2%, 0.25%, 0.3%, 0.4% and 0.5% concentrations 5,000 seeds were treated in 250 ml (divided into 25 × 50 ml Falcon tubes for each of these concentrations). Seeds were soaked in 0.02% Tween 20 for 30 minutes prior to the addition of the EMS and incubated overnight (16 hours). The 50 ml Falcon tubes, with 200 seeds in each, were turned end over end, causing the seeds to tumble through the solution and ensuring that all the seeds were exposed equally to the EMS without incurring too much physical damage. After treatment the seeds were washed ten times with 0.02% Tween 20 and then mixed with fine grade vermiculite to facilitate their even distribution onto 348 mm × 220 mm seed trays. Seeds were sown at an approximate density of 200 seeds/tray and kept at 7°C with no light for six days before being transferred into the glasshouse at 18°C with 16 hours light. After a further six days the percentage of germination was recorded. These M1 seedlings were transplanted first into 24-well modules and then into 1L pots in John Innes no. 2 compost. The developing plants were bagged individually before they flowered using perforated bread bags (Packaging Company Unit, UK) to prevent cross-pollination, and allowed to proceed to maturity.
Dried pods were threshed and the M2 seeds from each line were deposited in the John Innes Centre seed-store (1.5°C, 7-10% relative humidity) to ensure their long-term viability. Ten seeds from each line were sown into 1L pots and transferred directly to the glasshouse. After germination the number of albino plants was recorded along with a range of other notable seedling and early leaf variant phenotypes. Each line was then thinned out to two 'healthy' plants per M2 line, leaf samples were taken for DNA isolation, and the plants were bagged individually and allowed to grow to maturity. Plants were harvested, threshed and M3 seed was placed in the seed-store.
Leaf material was collected from each M2 plant selected for progression through to M3 seed and transferred directly into Qiagen racks on dry ice. DNA isolation was carried out using the DNeasy Plant 96 Qiagen Kit for 96 samples following the manufacturer's instructions (Qiagen, UK). DNA concentrations were determined using PicoGreen (Molecular Probes, Invitrogen Corporation, Carlsbad, California, USA) against a universal DNA concentration standard on a Tecan Genios plate reader. All samples were normalised to 0.5 ng/μl (diluted in deionised water).
A simple one-dimensional eight-fold pooling strategy was employed using a Xiril liquid handling robot (Xiril AG, Hombrechtikon, Switzerland). Final DNA concentration in an eight-pool was 0.5 ng/μl, and 5 μl of eight-pool DNA were used in a TILLING reaction. This leads to a total of 2.5 ng DNA (~0.3 ng from each M2 line) in the TILLING reaction.
Primers were designed using the CODDLE (codons optimized to discover deleterious lesions; http://www.proweb.org/coddle) programme, combined with the PRIMER3 tool to define the best amplicon for TILLING, aiming for a predicted primer Tm of 60-70°C. The primers used in this work are listed in Additional File 5. CODDLE identifies areas within the gene which have the highest probability of affecting gene function when mutated by EMS, and scores possible mis-sense and nonsense changes.
Mutant detection was carried out by Cel1 digestion followed by analysis on a capillary ABI3730 sequencer (Applied Biosystems, Foster City, California, USA) as described in .
The Arabidopsis thaliana Integrated Database (AtIDB - http://atidb.org) was used to BLAST the Arabidopsis genes REPLUMLESS, INDEHISCENT and METHYLTRANSFERASE1 against a database of B. rapa BAC end sequences that have been mapped onto the Arabidopsis genome. BAC clones likely to contain the orthologous gene were identified based on synteny and primers were designed to sequence the genes directly on purified BAC DNA. No BACs were identified harbouring the BraA.MET1 genes. Instead these sequences were obtained from R-o-18 using a sequence-homology approach based on a previous publication.
From the John Innes Genome Laboratory, we are grateful to Bethany McCullagh (now part of the RevGenUK project) for running the TILLING platform, Richard Goram for DNA extractions and assistance in the TILLING screens and Fran Robson and Jillian Perry from RevGenUK for TILLING assistance. We also wish to thank Shirley Aris, Harry Grey, Alicia Grix, Lucy Hicks, Harriet Saunders, members of the Østergaard lab, members of the Bancroft lab and the John Innes Centre Horticultural staff for their valuable and skilled help with planting, bagging, harvesting, threshing, aliquoting seeds and isolating tissue for DNA extraction. We are also grateful to Tilly Eldridge and Lucy Hicks for assisting in the analysis of M1 plant fertility and M2 seed viability. Finally we wish to thank Judith Irwin and Cristobal Uauy for useful discussions and comments on the manuscript and Martin Trick for providing B. rapa genome information prior to publication. This work was supported by an 'Innovations in Crop Science' initiative grant (BB/E006965/1) from the Biotechnology and Biological Sciences Research Council (BBSRC) to LØ and JI as part of the AdVaB project and by RevGenUK (BB/F010591/1) also from the BBSRC to LØ.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.