Coffee cysteine proteinases and related inhibitors with high expression during grain maturation and germination
- Maud Lepelley†1,
- Mohamed Ben Amor†1, 2,
- Nelly Martineau1,
- Gerald Cheminade1,
- Victoria Caillet1 and
- James McCarthy1
© Lepelley et al; licensee BioMed Central Ltd. 2012
Received: 13 November 2011
Accepted: 1 March 2012
Published: 1 March 2012
Cysteine proteinases perform multiple functions in seeds, including participation in remodelling polypeptides and recycling amino acids during maturation and germination. Currently, few details exist concerning these genes and proteins in coffee. Furthermore, there is limited information on the cysteine proteinase inhibitors which influence the activities of these proteinases.
Two cysteine proteinase (CP) and four cysteine proteinase inhibitor (CPI) gene sequences have been identified in coffee with significant expression during the maturation and germination of coffee grain. Detailed expression analysis of the cysteine proteinase genes CcCP1 and CcCP4 in Robusta using quantitative RT-PCR showed that these transcripts accumulate primarily during grain maturation and germination/post germination. The corresponding proteins were expressed in E. coli and purified, but only one, CcCP4, which has a KDDL/KDEL C-terminal sequence, was found to be active after a short acid treatment. QRT-PCR expression analysis of the four cysteine proteinase inhibitor genes in Robusta showed that CcCPI-1 is primarily expressed in developing and germinating grain and CcCPI-4 is very highly expressed during the late post germination period, as well as in mature, but not immature leaves. Transcripts corresponding to CcCPI-2 and CcCPI-3 were detected in most tissues examined at relatively similar, but generally low levels.
Several cysteine proteinase and cysteine proteinase inhibitor genes with strong, relatively specific expression during coffee grain maturation and germination are presented. The temporal expression of the CcCP1 gene suggests it is involved in modifying proteins during late grain maturation and germination. The expression pattern of CcCP4, and its close identity with KDEL-containing CP proteins, implies this proteinase may play a role in protein and/or cell remodelling during late grain germination, and that it is likely to play a strong role in the programmed cell death associated with post-germination of the coffee grain. Expression analysis of the cysteine proteinase inhibitor genes suggests that CcCPI-1 could primarily be involved in modulating grain CP activity, while CcCPI-4 may play roles in modulating grain CP activity and in protecting young coffee seedlings from insects and pathogens. CcCPI-2 and CcCPI-3, having lower and more widespread expression, could be more general "house-keeping" CPI genes.
Keywords: Cysteine proteinase; Cysteine proteinase inhibitor; Proteinase activity; Coffee
Cysteine proteinases (CP) represent a large group of proteins in plants, with over 140 annotated gene sequences identified to date in the Arabidopsis genome [1–3]. As expected for such a large family, the functions of these proteins are diverse, ranging from involvement in programmed cell death (PCD) [4, 5] to influencing tissue development [6, 7] and pathogen response signalling [8, 9]. During seed development, cysteine proteinases have been found to participate in PCD events associated with embryogenesis and seed coat formation, as well as playing a role in the processing of proteins, particularly the seed storage proteins found in protein storage vacuoles. Different cysteine proteinases are also thought to make a major contribution to the mobilization of the stored seed protein reserves as germination progresses [13, 14]. In germinating mung bean seeds, it has been shown that at least two cysteine proteinases are induced soon after germination has started, and these authors proposed that vacuolar receptors (VCRs) transport these newly made proteinases to the protein storage vesicles (PSVs), thereby enabling them to participate in the mobilization of the seed protein reserves.
In plants, protein hydrolysis via cysteine proteinases is thought to be modulated, at least in part, by a group of proteins called the cysteine proteinase inhibitors. These polypeptides, also called phytocystatins, inhibit C1A and C13 type plant cysteine proteinases by acting as pseudosubstrates [16, 17]. While it is believed that the key biological function of the plant cysteine proteinase inhibitors (CPI) is to modulate the function of target proteinases in vivo, to date, only a limited number of CPIs have been tested with plant cysteine proteinases. In one such study, the inhibitory effects of a series of recombinant barley CPIs were tested against multiple barley cathepsin L-like cysteine proteinases. These authors showed that most of the barley CPIs were active against all the CPs tested, although a few CPIs did show increased inhibition of one or two specific barley cysteine proteinases. CPIs have attracted particular attention due to their capability to inhibit cysteine proteinases found in the digestive tracts of herbivorous insects, an effect that can significantly reduce the destructive effects of these insects [18, 19]. For example, Urwin et al. showed that over-expression of sunflower or rice CPI polypeptides in potato increased its resistance to Globodera root nematodes, and it has been demonstrated that simultaneously over-expressing a CPI with a second protease inhibitor acting on another protease family (carboxypeptidases) protected tomato plants for a longer duration from two different tomato pathogens, due to a reduced build-up of insect tolerance. Plant CPIs have also been demonstrated to increase tolerance to fungal and bacterial pathogens in transgenic plants.
Coffee is one of the most important agricultural commodities traded worldwide; however, there continues to be a lack of fundamental knowledge on many aspects of this crop. To date, for example, there is little information on the proteinase and proteinase inhibitor genes of coffee. As shown above, the cysteine proteinases and their inhibitors play important roles in plant seeds. Thus, we decided to begin an investigation of the CP/CPI genes expressed in the semi-recalcitrant coffee grain. In addition, because amino acids and peptides are an important group of coffee flavour/aroma precursors [23, 24], such a study could also yield some clues regarding the potential role of CP/CPI gene products in coffee quality. In this work, we describe cDNA representing several coffee CP and CPI genes, and we present the expression of these genes in developing and germinating grain. To begin studying the functional properties of two highly expressed CP proteins, we have also expressed these proteins in E. coli and tested the recombinant polypeptides for protease activity. The results obtained are discussed in relation to the potential roles of the gene products in the development and germination of the coffee grain.
The Coffea canephora (BP409) "maturation" tissues (roots, branches, leaves and cherries at different stages of development) were harvested in 2007 from field grown trees (Equator), immediately put into liquid nitrogen, then held at -20°C before being sent frozen to Tours, France. Once at Tours, these samples were kept at -80°C until use.
Coffee cherries of Coffea canephora (BP409) used to obtain the "germination" tissues were harvested at mature stage from field grown trees in Equator in 2008, and sent to Tours at room temperature. On arrival, they were manually depulped, washed, and the light grain removed by floating. The remaining grain were dried and the tegument was manually removed. Subsequently, the grain were sterilized by a 1 h treatment in calcium hypochlorite (50 g/l), followed by three washes using sterile water. The grain were then incubated in vitro on Heller medium without added sugar or hormone (Agar 7 g/l), at room temperature (25°C). Five grain were then harvested at various times (DAI = Days After Imbibition), and frozen in liquid nitrogen. In the experiment presented, 14DAI (T4) corresponds to the first sign of radicle emergence.
Coffea arabica (T2308) leaves were harvested at different stages of development, in 2006, from trees grown under greenhouse conditions at Tours, France and kept at -80°C before use. Two independent sets of leaves were harvested. The development stages of the leaves are defined as follows: Very Young Leaves (VYL), Young Leaves (YL), Mature leaves (ML), Old Leaves (OL). The sizes of the leaves collected were: approximately 2-3 cm for VYL stage; 6-9 cm for YL stage and 12-15 cm for ML and OL stages.
The samples from the various tissues were reduced to a powder in a SPEX CertiPrep 6800 Freezer Mill with liquid nitrogen and the powders were then stored at -80°C until total RNA was extracted. In the case of the coffee cherries at different stages of development, these were first separated into pericarp and grain tissues and then each was very rapidly reduced to a powder and stored as described above. RNA was extracted and purified from the stored powders using the RNeasy Plant mini kit (QIAGEN), including a DNase treatment, according to the manufacturer's instructions. The quality of the final RNA samples obtained was checked by agarose gel electrophoresis and ethidium bromide staining.
The method used to make the cDNA was very similar to the protocol described in the Transcriptor Reverse Transcriptase kit (Roche) using around 1 μg total RNA sample and 870 ng of oligo dT(18) (Proligo), with reactions performed at 55°C for 30 min. The cDNA samples generated were then diluted one hundred fold in sterilized water and aliquots were stored at -20°C for later use in QPCR experiments.
DNA sequence analysis
Plasmid DNA was purified using Qiagen kits according to the instructions given by the manufacturer. Prepared plasmid DNA was then sequenced by the dideoxy termination method. Computer analyses were performed using the Laser Gene software package (DNASTAR, version 7.1.0).
Real time QRT-PCR experiments
Primers and TaqMan probes used for the quantitative PCR experiments.
Primers and Probes
5' ATGCGCACTGACAACA 3'
5' TCTGCTCTTCAGAGGTTGTA 3'
5' TGCTGCTGAAGGCG 3'
5' TTAGTACCTTTCAGTGCAAAT 3'
5' TAAAGCTAACGCGTAAATG 3'
5' CTTCTGCCTTTCCC 3'
5' TGCATGGATGATGTACTG 3'
Quantification was carried out by the method of relative quantification, using the constitutively expressed ribosomal protein rpl39 as the reference. In order to use the method of relative quantification, it was necessary to show that the amplification efficiencies for the different gene sequences were roughly equivalent to the amplification efficiency of the reference sequence (rpl39 cDNA sequence) using each specifically defined primer and probe set. To determine this relative equivalence, plasmid DNA containing the appropriate cDNA sequences was diluted 1/1000, 1/10,000, 1/100,000, and 1/1,000,000, and the efficiencies of amplification were calculated using the QPCR conditions described above. All the primer/probe sets showed acceptable efficiencies.
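The efficiency check on the plasmid dilution series can be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes the commonly used standard-curve relation E = 10^(-1/slope) - 1 (Ct fitted against log10 of the dilution) and the 2^-ΔCt form of relative quantity against rpl39, neither of which is spelled out in the text. The Ct values are invented for illustration only.

```python
import math

def amplification_efficiency(dilutions, ct_values):
    """Least-squares fit of Ct against log10(dilution).

    Returns (slope, efficiency), with efficiency = 10**(-1/slope) - 1
    (assumed standard-curve relation; 1.0 means perfect doubling per cycle).
    """
    xs = [math.log10(d) for d in dilutions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, 10 ** (-1.0 / slope) - 1.0

def relative_quantity(ct_target, ct_reference):
    """Relative quantity of a target versus the reference gene (assumed
    2**-dCt form, valid only when both efficiencies are close to 1)."""
    return 2.0 ** -(ct_target - ct_reference)

# The four plasmid dilutions named in the text.
dilutions = [1e-3, 1e-4, 1e-5, 1e-6]
# Hypothetical Ct readings, invented for this example.
example_ct = [15.0, 18.4, 21.8, 25.2]

slope, efficiency = amplification_efficiency(dilutions, example_ct)
print(f"slope = {slope:.2f}, efficiency = {efficiency:.1%}")
```

A target and reference pair whose efficiencies computed this way are both close to 100% would meet the "roughly equivalent" condition described above.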
Production of recombinant Coffea canephora CcCP1 and CcCP4 in E. coli
Expression vectors were generated using the "Champion™ pET SUMO Protein Expression System" (Invitrogen). The CcCP1 sequence minus its N-terminal 28 amino acids was amplified by PCR as follows: 50 μl reactions contained the plasmid pA4-43, 5 μL of TaKaRa® DNA Polymerase 10X LA PCR® Buffer, 600 μM of each CcCP1-specific primer (CP1-FP 5'ATGTTCCAACATGAAATTCAGTATC3' and CP1-RP 5'TCAAGAGGTCTGTGTCACCA3'), 200 μM each dNTP, and 0.5 U of TaKaRa DNA Polymerase (Takara Bio Inc). The PCR cycling conditions were as follows: 94°C for 2 min; then 35 cycles of 94°C for 1 min, 55°C for 1.5 min, and 72°C for 1.5 min, followed by a final step at 72°C for 7 min. The PCR product was then gel purified. The CcCP4 sequence minus its N-terminal 22 amino acids was produced as described for the CcCP1 insert, except that the initial DNA substrate was plasmid pcccs46w7n5 and the specific primers were CP4-FP 5'ATGGAGATCACAGAAAGAGATT3' and CP4-RP 5'CTAGAGGTCGTCCTTAGGT3'.
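As a quick sanity check on the reverse primers quoted above, the sketch below reverse-complements each one and translates it in the frame that ends at its 3' end. The frame choice is an assumption (that each reverse primer was designed to end exactly on the stop codon), and the helper names are illustrative only.

```python
# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate_3prime_aligned(seq):
    """Translate so that the last codon ends on the last base,
    skipping any leftover bases at the 5' end."""
    seq = seq.upper()
    start = len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(start, len(seq), 3))

# Reverse primer sequences as given in the text (5' to 3').
CP1_RP = "TCAAGAGGTCTGTGTCACCA"
CP4_RP = "CTAGAGGTCGTCCTTAGGT"

cp1_tail = translate_3prime_aligned(reverse_complement(CP1_RP))
cp4_tail = translate_3prime_aligned(reverse_complement(CP4_RP))
print("CcCP1 C-terminal residues encoded by CP1-RP:", cp1_tail)
print("CcCP4 C-terminal residues encoded by CP4-RP:", cp4_tail)
```

Under this assumed frame, the CP4 reverse primer encodes a C-terminus ending in KDDL followed by a stop codon, which matches the KDDL form of CcCP4 described in this work.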
The gel purified fragments were then cloned into the TA cloning site of the pET-SUMO vector, as recommended by the vector manufacturer. Ligated plasmids were transformed into One Shot® Mach1™-T1R Chemically Competent Cells (Invitrogen). Clones with the inserts in the correct orientation were selected by PCR screening and the plasmid containing CcCP1 was named pNM17 and the plasmid containing CcCP4 was named pNM6.
For protein expression, BL21(DE3) One shot® Chemically Competent Cells (Invitrogen) were transformed with pNM17 and pNM6 plasmids as recommended in the manufacturer's protocol. Five ml overnight cultures of the selected transformants were used to inoculate 100 ml cultures of LB medium containing 50 μg/ml kanamycin (except for the control, i.e. untransformed BL21(DE3) cells). The cells were grown at 37°C with shaking at 200 rpm to an OD600 of 0.4-0.6. Then, 90 ml was taken and "induced" by addition of IPTG (1 mM final). Both "Induced" and "Not Induced" cultures were further incubated at 37°C (200 rpm shaking) for 5.5 h, followed by centrifugation at 6000 g for 10 min at 4°C. Cell pellets were resuspended at room temperature in BugBuster® Protein Extraction Reagent (Novagen) using 5 ml reagent per gram of wet cell paste. Then, 25 U benzonase nuclease (Novagen) and 1 KU rLysozyme solution (Novagen) were added per 1 mL BugBuster and the mixture was incubated for 25 min at 70 rpm, at room temperature, followed by centrifugation at 6000 g for 30 min at 4°C.
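The reagent ratios in the lysis step scale with the mass of the cell pellet, which a short helper can make explicit. This is only an arithmetic sketch of the ratios quoted above; the function name and the 2 g pellet are hypothetical.

```python
def lysis_reagents(wet_cell_paste_g):
    """Scale the lysis reagents quoted in the text to a given pellet mass."""
    if wet_cell_paste_g <= 0:
        raise ValueError("pellet mass must be positive")
    bugbuster_ml = 5.0 * wet_cell_paste_g  # 5 ml reagent per gram of wet cell paste
    return {
        "bugbuster_ml": bugbuster_ml,
        "benzonase_units": 25.0 * bugbuster_ml,     # 25 U per ml of BugBuster
        "rlysozyme_kilounits": 1.0 * bugbuster_ml,  # 1 KU per ml of BugBuster
    }

# Hypothetical example: a 2 g wet cell pellet.
example = lysis_reagents(2.0)
for name, amount in example.items():
    print(f"{name}: {amount:g}")
```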
The pellets obtained from the induced cultures, which contained the inclusion bodies, were again resuspended in BugBuster® solution using the same volume that was used to resuspend the initial cell pellet (5 mL per gram of initial wet cell pellet), 1 KU rLysozyme solution was added per 1 mL BugBuster®, and the mixture was incubated at room temperature for 5 min. Then, 6 volumes of 1/10 diluted BugBuster® solution were added and the tubes vortexed for 1 min. The resulting suspensions were centrifuged at 5000 g for 15 min at 4°C. The "washed" inclusion bodies collected were resuspended in 7 volumes of 1/10 diluted BugBuster® solution and centrifuged as previously. This wash step was repeated three more times to remove non-specific material associated with the inclusion bodies. The final pellets of the purified inclusion bodies (IBS) obtained were resuspended in 2 volumes of denaturing buffer A (8 M urea, 50 mM NaH2PO4, 10 mM Tris-HCl adjusted to pH8 with NaOH) and incubated at 28°C for 1 h as described by Zhang et al. Two volumes of buffer B (8 M urea, 50 mM NaH2PO4, 10 mM Tris-HCl adjusted to pH6.3 with HCl) were then added and purification was carried out with Ni-NTA Superflow Columns (QIAGEN). Briefly, the Ni-NTA slurry was mixed with the denatured proteins by shaking on a rotary shaker for 1 h at 70 rpm at room temperature, after which the slurry was loaded on an empty column and the flow-through collected. Two successive column washes were carried out with 1.2 volumes of buffer B, followed by elution of the recombinant proteins with 0.6 volumes of buffer C (8 M urea, 50 mM NaH2PO4, 10 mM Tris-HCl adjusted to pH5.9 with HCl), giving a fraction called El1. This was followed by 2 elutions of 0.6 volumes with buffer D (8 M urea, 50 mM NaH2PO4, 10 mM Tris-HCl adjusted to pH4.5 with HCl), giving fractions El2 and El3. Samples (20 μl) of the different fractions were analysed by SDS-PAGE and those containing recombinant proteins were pooled.
Names and composition of dialysis buffers
Name of dialysis buffer: Composition
Buffer 6 M: 50 mM potassium phosphate pH10.7, 5 mM EDTA, 1 mM reduced glutathione, 0.1 mM oxidized glutathione, 6 M urea
Buffer 4 M: 50 mM potassium phosphate pH10.7, 5 mM EDTA, 1 mM reduced glutathione, 0.1 mM oxidized glutathione, 4 M urea
Buffer 2 M: 50 mM potassium phosphate pH10.7, 5 mM EDTA, 1 mM reduced glutathione, 0.1 mM oxidized glutathione, 2 M urea
Buffer 0 M: 50 mM potassium phosphate pH10.7, 5 mM EDTA, 1 mM reduced glutathione, 0.1 mM oxidized glutathione, 0 M urea
Assay for cysteine protease activity
The assay for cysteine protease activity used here is a slight modification of the one developed by Zhang et al. and Troen et al. Protein samples made up to 10 μl with water were mixed with 20 μL of 50 mM sodium formate buffer (pH3), then incubated for 30 s at 37°C for activation. Parallel "non-activated" control reactions were set up in which 20 μL of milliQ pure water replaced the 20 μL of sodium formate buffer. This was immediately followed by the addition of 6.7 μl of the "reaction mix" (1% BSA, 1xPBS and 6 mM L-cysteine, pH7.5). The enzyme reactions were subsequently incubated at 37°C and 3 μL aliquots were taken at different times, added to sample loading buffer, heated for 7 min at 95°C, then run on SDS-PAGE gels and stained with Coomassie.
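The final concentrations implied by the assay volumes above follow from simple dilution arithmetic. The sketch below works them out; it assumes the volumes are additive and that the sample itself contributes no formate, BSA or cysteine. The variable names are illustrative only.

```python
def diluted(stock_concentration, added_volume, total_volume):
    """Concentration after diluting a stock into a larger total volume."""
    return stock_concentration * added_volume / total_volume

# Volumes from the text, in microlitres.
SAMPLE_UL = 10.0
FORMATE_UL = 20.0
REACTION_MIX_UL = 6.7

activation_volume = SAMPLE_UL + FORMATE_UL        # during the 30 s acid step
final_volume = activation_volume + REACTION_MIX_UL

# Sodium formate during activation (50 mM stock).
formate_mM = diluted(50.0, FORMATE_UL, activation_volume)
# Reaction mix components in the final reaction (6 mM L-cysteine, 1% BSA stock).
cysteine_mM = diluted(6.0, REACTION_MIX_UL, final_volume)
bsa_percent = diluted(1.0, REACTION_MIX_UL, final_volume)

print(f"formate during activation: {formate_mM:.1f} mM")
print(f"final L-cysteine: {cysteine_mM:.2f} mM")
print(f"final BSA: {bsa_percent:.3f} %")
```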
Identification of cysteine proteinase sequences expressed during coffee grain maturation
Coffee cDNA encoding cysteine proteinases were found by carrying out BLAST searches against the Nestlé/Boyce Thompson Institute coffee EST database (Coffea canephora build #3, located at http://solgenomics.net/) using the protein sequences of two biochemically characterized cysteine proteinases: NtCP56-KDEL from Nicotiana tabacum (GenBank accession number ACB70409), which is a peptidase from the C1A subfamily (MEROPS database nomenclature; http://merops.sanger.ac.uk), and SlCP from Solanum lycopersicum (GenBank accession number CAH56498), which is from the peptidase C13 family (asparaginyl endopeptidase, cysteine catalytic type). This analysis yielded 15 candidate unigene sequences (data not shown). As our main objective was to study genes highly and specifically expressed in the maturing grain, we examined the "in-silico" expression profiles associated with these unigenes. Three unigenes (SGN-U613831, SGN-U613447 and SGN-U620235) were found that exhibited multiple ESTs, all of which came from either grain or cherry EST libraries (data not shown). Further analysis indicated that two of the unigenes (SGN-U613447 and SGN-U620235) were probably different alleles of the same gene, thus giving only two clearly different unigenes with high expression in the grain for further study. Plasmids potentially containing the longest sequences for each unigene were then selected from our available EST libraries and fully sequenced to confirm the "in-silico" unigene sequences.
In the case of Unigene SGN-U613831, the plasmid pA4-43 was selected to characterize the first Coffea canephora CP cDNA, which we named CcCP1; the cDNA has a 1511 bp long insert, with a 1194 bp coding sequence (CDS) encoding a protein of 397 amino acids. Analysis of the protein sequence of CcCP1, performed using the SignalP 3.0 server (http://www.cbs.dtu.dk/services/SignalP/) and the "Conserved Domain Database" (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml), shows it is a member of the peptidase_C1 superfamily/peptidase_C1A subfamily (papain family, clan CA). It appears to have at least three distinct domains: a hydrophobic N-terminal signal peptide, followed by a predicted 56 amino acid long I29 inhibitor propeptide domain (with an ERFN(V/I)N like domain sequence) and a peptidase domain containing the catalytic triad of Cys, His, and Asn, as well as an active site Gln residue. The second coffee CP cDNA to be studied in detail (pcccs46w7n5) has a 1365 bp long insert encoding a protein of 359 amino acids, which we have called CcCP4. The protein sequence indicates that CcCP4 is also a member of the C1 peptidase superfamily (C1A subfamily) and has the three distinct domains already described for CP1, plus an obvious ERFNIN like sequence which is located within the 55 amino acid long predicted I29 inhibitor domain.
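The insert, CDS and protein lengths quoted above can be cross-checked with a few lines of arithmetic. The sketch assumes the reported CDS length includes the stop codon, which the CcCP1 figures support (1194 = 3 × 398). The CcCP4 CDS length and the untranslated totals are derived here and are not stated in the text.

```python
def protein_length_from_cds(cds_bp):
    """Amino acids encoded by a CDS whose length includes the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return cds_bp // 3 - 1

def cds_length_from_protein(n_amino_acids):
    """CDS length in bp, stop codon included (assumed convention)."""
    return 3 * (n_amino_acids + 1)

# CcCP1: values reported in the text.
cp1_protein = protein_length_from_cds(1194)
cp1_untranslated_bp = 1511 - 1194   # derived: 5' plus 3' non-coding insert sequence

# CcCP4: the CDS length is derived from the 359 amino acid protein.
cp4_cds_bp = cds_length_from_protein(359)
cp4_untranslated_bp = 1365 - cp4_cds_bp

print("CcCP1 protein length:", cp1_protein)
print("CcCP4 derived CDS length (bp):", cp4_cds_bp)
```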
Quantitative gene expression analysis for CP1 and CP4 in different coffee tissues and during grain development/germination
CP1 and CP4 gene expression was also measured in an independent set of Robusta samples (data not shown). The results from this second sample set were globally in agreement with the results presented in Figure 2, with a few minor differences. For example, for the second independent set of BP409 samples covering grain development, overall CP1 expression was lower (RQ = 1.54 versus RQ = 6.73 at mature red stage) and CP4 expression was found to be higher at an earlier stage in this second grain sample set. It is likely that some of these transcript level differences result from slight differences in the precise development stages of the various samples. Comparison with a second germination sample set (from Robusta variety FRT05) showed that CP1 and CP4 expression was broadly similar, except that the BP409 sample set showed increasing CP1 transcript levels from T1 to T3, whereas in the FRT05 sample set CP1 transcript levels were highest at T1 and then fell. Nonetheless, for both sample sets, CP1 transcript levels were very low at T5. In both germination sample sets, CP4 transcript levels were low at T1 and rose to a maximum at T5. CP1 and CP4 transcripts were barely detected at T6 for the second sample set. Interestingly, the CP4 transcript levels were significantly higher in all the Robusta FRT05 samples examined during pre-germination/germination/post germination (including T1 to T5 samples) versus the equivalent samples of Robusta BP409.
Production of recombinant CP1 and CP4 enzymes in E. coli: purification and activity testing
Identification of cysteine proteinase inhibitors expressed during grain maturation
To find cDNA encoding coffee cysteine proteinase inhibitors, we carried out BLAST searches using the biochemically characterized Helianthus annuus cysteine protease inhibitor HaCPI (accession number JE0308), the Dianthus caryophyllus cysteine protease inhibitor DcCPIn (accession number AAK30004) and a putative cystatin-like inhibitor sequence from Citrus × paradisi (accession number AAG38521) as the query sequences. Using these criteria, six distinct unigene sequences were found (data not shown). Four of these sequences were chosen for further study in this work. Plasmids potentially containing a complete ORF representing each unigene were isolated from the available EST libraries and fully sequenced to confirm the "in-silico" unigene sequences. The respective gene sequences have been named CcCPI-1, CcCPI-2, CcCPI-3, and CcCPI-4. The plasmid names and the sizes of their inserts are presented in Additional file 5.
Quantitative gene expression analysis for CPI-1, CPI-2, CPI-3 and CPI-4 in different Robusta tissues and during grain development/germination
Both the CPI-2 and CPI-3 genes were found to be expressed to a small extent in all the tissues examined (Figure 5). A second independent RNA set showed a similar expression profile (data not shown). Thus, these two CPI genes do not appear to have any clear tissue specificity. Detailed examination of the QRT-PCR results in Figure 5 shows, however, that some expression differences exist for these two genes. For example, CPI-2 transcript levels are higher at the SG grain stage (RQGSG = 1.71 for CPI-2 in Figure 5) and slightly higher in leaves (both Robusta sample sets). One can also observe that CPI-3 transcript levels are very low at the T5 germination stage (this was also seen in the second set of Robusta samples). Another difference is the relatively high CPI-3 expression associated with the mature red coffee pericarp tissue (RQPR = 2.29). Interestingly, the higher expression of CPI-3 is most noticeable in the sample used for Figure 5. As this sample could be more mature (and thus softer) than the other "mature" pericarp sample tested, it is possible that increased CPI-3 expression is coupled with pericarp age and fruit softening. It will be interesting in the future to explore this possibility further by carrying out more detailed expression studies on this gene at the end of pericarp maturation, and to explore whether CPI-1, whose expression also appears to rise somewhat at this stage, could be involved as well.
The QRT-PCR results in Figure 5 show that there is little or no expression of CPI-4 in the developing grain or in the pericarp at any maturation stage examined (0 < RQs < 0.02). This was also observed with the second Robusta sample set studied (data not shown). However, Figure 5 shows that significant levels of CPI-4 transcripts occur in Robusta roots (RQroots = 0.53), leaves (RQleaves = 0.20) and branches (RQbranches = 0.08). In contrast, few CPI-4 transcripts were detected in the roots and branches of the second Robusta sample (data not shown), suggesting that maturity or other tissue sampling related differences could be involved. In the case of leaves, relatively low CPI-4 expression was seen in the leaves of the BP409 Robusta sample set used for Figure 5, but very high levels were seen for this gene in the leaves of the BP409 Robusta sample used for the second sample set (data not shown). This expression difference, which was hypothesized to be due to leaf maturity, is explored in more detail below. Another surprising aspect of the CPI-4 expression data obtained is the extremely high level of CPI-4 expression detected during the last post-germination stages examined (Figure 5: T5 stage, RQ = 4.59, and T6 stage, consisting primarily of the first two young leaves/cotyledons, RQ = 31.97). There was no significant expression of CPI-4 from germination stages T1 to T4 (14DAI, grain showing its first sign of radicle emergence). The same results were seen in the second Robusta sample set analysed (FRT05; data not shown). This suggests CPI-4 could play an important role during this period of plantlet growth/emergence.
Quantitative gene expression analysis of the cysteine protease inhibitor genes CPI-1 to CPI-4 at different stages of leaf development
Cysteine proteases and their inhibitors have been studied in detail for many plants, but, to date, little work has been done on these genes or proteins from coffee. Here we describe full length cDNA representing two cysteine proteinase genes, called CcCP1 and CcCP4, which show relatively exclusive expression during grain maturation and germination. We also present the characterization of full length cDNA representing four cysteine proteinase inhibitor genes (CPI-1 to CPI-4) and describe the quantitative expression of these genes in coffee.
Sequence comparisons indicated that the two Robusta cysteine proteinases described represent different proteinases. BLAST analysis against the protein database indicates that CP1 is very closely related to a papain type CP called VsCPR4 from Vicia sativa (GenBank Accession CAB16316), as well as a putative CP of Arabidopsis (AT3G54940, GenBank Accession NP_567010). Examination of Figure 1A shows the protein sequences of CcCP1 and its putative homologues contain a partial ERFNIN box (ERFNAQ) within an N-terminal cathepsin propeptide inhibitor domain, indicating that these polypeptides fall within the CP subfamily having an I29 domain that is found at the N-terminus of some C1 peptidases, like Cathepsin L, where it acts as a propeptide (http://merops.sanger.ac.uk; http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml). The I29 domains of CcCP1 and its putative homologues are located just upstream of a clear peptidase C1 superfamily domain, which has strong homology with peptidase C1A_cathepsins_B/C/X. Several other CP specific elements are also completely conserved in the three highly similar protein sequences presented in Figure 1A. The CPR4 polypeptide of Vicia sativa has been shown to be expressed during seed maturation and during the early part of seedling germination/growth in both the embryonic axis and the cotyledons. Although the functional activity of this protein has not been proven in a recombinant form, these authors nevertheless implied that the processed, and thus presumably activated, polypeptide was involved in seed storage protein mobilization. Microarray expression analysis of the potential Arabidopsis CP1 homologue (AT3G54940) indicates this gene has significant expression in developing endosperm and in the embryos of developing and germinating seedlings (https://www.genevestigator.com), supporting the idea that these highly related polypeptides play a role in storage protein modification and/or mobilization.
Little expression was seen for the Arabidopsis gene in other tissues under normal conditions. The expression profile of CcCP1 transcripts (Figure 2) mirrors the expression of the candidate homologues VsCPR4 and AT3G54940, i.e., CcCP1 is expressed in the later stages of grain development and during germination, implying that CcCP1 performs a similar function to the putative homologues of the other two plants. To date, it has apparently not been possible to express and/or correctly activate any of the recombinant CcCP1 homologues in order to confirm their function. The most likely explanation for this inability to verify the activity of these proteins is that the precise conditions needed to process/activate these proteins have not yet been identified.
The alignment of CcCP4 with two of the closest well characterized plant sequences (NtCP56 and SlCysEP) indicates that CcCP4 and its proposed homologues have a clear ERFNI/VN box within an N-terminal I29 type propeptide domain, followed by a peptidase C1 superfamily domain containing conserved cysteine proteinase specific sequence elements (see Figure 1B for details). All three polypeptides contain N-terminal signal sequences, and the two homologues have a C-terminal endoplasmic reticulum retention sequence (KDEL). Interestingly, the CcCP4 cDNA sequence characterized contains the C-terminal sequence KDDL. In order to explore whether the KDDL sequence was unusual in coffee, we examined the sequences of other cDNA in the coffee unigene (SGN-U613447). This unigene, which is the only clear hit obtained when the Coffea canephora Unigene set at http://solgenomics.net is searched by BLAST with the CcCP4 protein sequence, has 22 ESTs. Analysis of these ESTs showed that 14 have sequence data for the C-terminal end and, interestingly, 2 of these end with the KDEL sequence. This observation suggests that Robusta may have both KDDL and KDEL alleles of CcCP4. A preliminary PCR analysis of genomic fragments from this region of the CcCP4 gene in Coffea eugenioides and Coffea arabica suggests that the KDEL allele is more prominent in these species (data not shown). The significance of CP4 alleles with a C-terminal KDDL sequence is currently unclear. However, the fact that several other plant sequences in the protein database related to CcCP4 also have C-terminal KDEL or RDEL sequences (data not shown) suggests this is the more prominent form found in plants. Also, several groups have proposed that unprocessed CP-KDEL proteins are retained in the endoplasmic reticulum (ER) after synthesis, and are only processed/transported further upon specific signalling [34–36], and another group suggested that C-terminal KDDL proteins could be poorly retained in the ER.
These observations raise the possibility that an expressed CcCP4-KDDL protein might be poorly retained in the ER and thus could exist in unintended compartments of developing coffee grain cells, with unknown consequences. Future experiments comparing physiological or other differences between seeds of Robusta trees homozygous for the CcCP4-KDDL or CcCP4-KDEL genes could be illuminating.
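The EST tally described above (14 ESTs with C-terminal sequence data, 2 of them ending in KDEL) amounts to classifying translated sequences by their last four residues. The sketch below illustrates that step only. The peptide strings are invented placeholders, and treating the remaining 12 ESTs as KDDL is an assumption that the text implies but does not state.

```python
from collections import Counter

def c_terminal_motif(protein, length=4):
    """Return the last `length` residues, ignoring a trailing stop symbol."""
    return protein.rstrip("*")[-length:].upper()

def motif_counts(proteins):
    """Count C-terminal motifs across a set of translated sequences."""
    return Counter(c_terminal_motif(p) for p in proteins)

# Invented placeholder translations, NOT real EST data: 12 assumed KDDL
# endings and the 2 KDEL endings reported in the text.
toy_translations = ["AAPKDDL*"] * 12 + ["AAPKDEL*"] * 2

counts = motif_counts(toy_translations)
kdel_fraction = counts["KDEL"] / sum(counts.values())
print(dict(counts))
print(f"fraction ending in KDEL: {kdel_fraction:.2f}")
```

With so few ESTs, such a fraction is only a rough indication that both forms are present, not an estimate of allele frequency.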
Both the proposed tobacco and tomato CcCP4 homologues are known to be involved in pollen development. Zhang et al. confirmed that the tobacco gene (NtCP56) encoded a functional, acid activated cysteine proteinase and then went on to show that anti-sense suppression of this gene can disrupt normal pollen development and cause male sterility. The tomato SlCysEP gene product was also shown to be an acid activated cysteine proteinase and an important component of the tomato ricinosome, which is a subcellular structure believed to orchestrate the final processing/recycling of cellular proteins during plant programmed cell death. In each case, the recombinant CP polypeptides produced in E. coli were insoluble and, as shown here for the coffee CcCP4, needed to be refolded to demonstrate auto-cleavage and cysteine protease activity. Analysis of SlCysEP transcripts showed that they could be detected in flowers at a specific period, and that this expression was primarily limited to the stamens. While the expression of NtCP56 or SlCysEP was not studied in seeds, our examination of ESTs encoding SlCysEP (http://solgenomics.net) confirmed that cDNA representing this gene can be found in Solanum lycopersicum EST banks from fruit, seeds, young leaves, as well as flowers (with seed libraries having the highest number of ESTs). Tomato database analysis indicates the SlCysEP gene has three introns, and that two other highly related KDEL containing "unigene" sequences can be found which potentially represent other members of this specific CP gene family. Three potential CcCP4 homologues were also identified in the Arabidopsis genome (AT5G50260, AT3G48350, and AT3G48340). Examination of the expression patterns for these genes using microarray data showed that AT5G50260 expression was limited to seeds, silique and stamen/anther, although lower levels could also be found in roots, but not in stems, from plants subjected to osmotic stress.
Interestingly, some induction of AT5G50260 also appeared in nematode infested roots. No significant expression was seen for this gene in other tissues; in contrast, low levels of expression were found for the Arabidopsis CcCP4 like AT3G48350 gene in many tissues, suggesting this gene may play a more general role in plant cells. The only two situations that appeared to increase AT3G48350 transcript levels were treatment with uv and dramatic changes in light conditions (dark/light shifts). No probe sets were identified for the gene sequence AT3G48340, so the expression of this gene is not known.
Overall, the CcCP4 expression data presented are consistent with our proposal that CcCP4 may be involved in the PCD associated with coffee grain germination and post-germination stages. Although no significant CcCP4 expression was detected in the Robusta BP409 flower sample tested (data not shown), this may be due to the limited developmental time frame we analysed. A new analysis using several different stages of flower development is clearly needed to clarify the expected participation of CcCP4 in coffee pollen development. It is interesting to note that the Vicia sativa Proteinase A (a CcCP4 homologue) was not detected during vetch seed development, but was detected in the cotyledons, and not in the seedling axis, during the later stages of germination and post-germination [29, 38]. These observations, together with the fact that purified Vicia sativa Proteinase A was capable of completely digesting the vetch storage proteins vicilin and legumin, led the authors to propose that Proteinase A was not involved in seed development or in the early part of storage protein mobilization, but was important for the later stages of germination, which involve much more extensive proteolysis. This contrasts with coffee, where there appear to be two periods of CcCP4 expression: one in the developing grain (which may continue into the first part of germination), and a new burst of transcription beginning around the T2 stage of germination and continuing up to the T5 stage. We currently do not know the significance of the low CcCP4 expression in the developing grain, although we do note that the "absence" of Proteinase A in developing Vicia sativa seeds could be due to the less sensitive detection method used earlier (northern blotting versus QRT-PCR here).
The quantitative expression analysis of the four CPI genes (Figure 5) showed that CcCPI-2 and CcCPI-3 are expressed in most tissues and that their levels of expression do not vary widely. In contrast, CcCPI-1 expression increased as grain development progressed (> 100 fold increase from the immature to the mature stage) and was also relatively strong during the T2 to T5 stages of grain germination and post-germination. Little CcCPI-1 expression was detected in the other tissues tested. For CcCPI-4, extremely high levels of transcripts were seen exclusively in the T5 and T6 stages of post-germination, corresponding to stages in which the cotyledons are forming. The significance of this observation is not known, but one interesting line of future investigation will be to determine whether CcCPI-4 expression contributes to insect tolerance/resistance at this delicate stage of plantlet development. By examining gene expression at different stages of leaf development, we also found that while CcCPI-4 is weakly expressed in young leaves, its expression increases dramatically in mature leaves (Figure 6). No significant expression of CcCPI-4 was found in the other tissues tested, except for a low level in the roots from the Robusta BP409 cDNA set used in Figure 5 (RQ = 0.53), which raises the interesting possibility that higher levels of one or more CcCPI proteins could reduce damage by root pests such as nematodes. Overall, the coffee CPI gene expression data suggest that CcCPI-2 and CcCPI-3 could be CP inhibitors with mostly "house-keeping" functions, while CcCPI-1 may play an important role during grain development, and CcCPI-4 could contribute to reducing damage by insects during the early life of the plantlets (first cotyledons), and perhaps in mature/old leaves and roots.
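To make the RQ and fold-change figures quoted above more concrete, the sketch below shows one common way such values are derived from QRT-PCR threshold cycles. It assumes a comparative Ct calculation against a single reference gene with equal, perfect amplification efficiency; this section does not state the exact formula used, and all Ct values and function names in the example are hypothetical.

```python
# Minimal sketch of a comparative-Ct style calculation (an assumption; the
# section above does not specify the formula). All Ct values are hypothetical.


def relative_quantity(ct_target: float, ct_reference: float,
                      efficiency: float = 2.0) -> float:
    """Relative quantity (RQ) of a target transcript versus a reference gene.

    Assumes both amplicons amplify with the same efficiency
    (2.0 corresponds to perfect doubling per cycle).
    """
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


def fold_change(rq_sample: float, rq_baseline: float) -> float:
    """Ratio of two RQ values, e.g. mature versus immature grain."""
    if rq_baseline <= 0:
        raise ValueError("baseline RQ must be positive")
    return rq_sample / rq_baseline


if __name__ == "__main__":
    # Hypothetical Ct values for one target gene at two grain stages,
    # measured against the same reference gene.
    rq_immature = relative_quantity(ct_target=30.0, ct_reference=22.0)
    rq_mature = relative_quantity(ct_target=23.0, ct_reference=22.0)
    print(f"RQ immature: {rq_immature:.5f}")
    print(f"RQ mature:   {rq_mature:.5f}")
    print(f"fold change: {fold_change(rq_mature, rq_immature):.0f}")
```

Under these assumptions, a difference of seven cycles between stages corresponds to a 128-fold change, which illustrates how a "> 100 fold" increase can arise from a modest shift in Ct.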
Finally, as the peptide/amino acid profile of a coffee has an important impact on flavour and aroma generation during coffee grain roasting [24, 39], further research is warranted to investigate possible links that may exist between the allelic variation in genes encoding coffee cysteine proteinases and cysteine proteinase inhibitors and the flavour/aroma quality associated with the grain of different coffee varieties.
Several cysteine proteinase and cysteine proteinase inhibitor genes with strong, relatively specific expression during coffee grain maturation and germination are presented. The temporal expression of the CcCP1 gene suggests it is involved in modifying proteins during late grain maturation and germination. The expression pattern of CcCP4, and its close identity with KDEL-containing CP proteins, implies this proteinase may play a role in protein and/or cell remodelling during late grain germination, and that it is likely to play a strong role in the programmed cell death associated with post-germination of the coffee grain. Expression analysis of the cysteine proteinase inhibitor genes suggests that CcCPI-1 could primarily be involved in modulating grain CP activity, while CcCPI-4 may play roles in modulating grain CP activity and in protecting young coffee seedlings from insects and pathogens. CcCPI-2 and CcCPI-3, having lower and more widespread expression, could be more general "house-keeping" CPI genes. The data generated open up new avenues to explore the potential contribution of proteinases to coffee quality and facilitate new research into the possibility that coffee cysteine proteinase inhibitors may help reduce damage caused by some plant pests.
GenBank accession numbers
The Coffea canephora protein sequence data associated with this article have been deposited in GenBank under the following accession numbers: CcCPI-1 (AEQ54766), CcCPI-2 (AEQ54767), CcCPI-3 (AEQ54768), CcCPI-4 (AEQ54769), CcCP1 (AEQ54770), CcCP4 (KDDL-tailed) (AEQ54771) and CcCP4 (partial, KDEL-tailed) (AEQ54772).
CPI: Cysteine proteinase inhibitor
QRT-PCR: Quantitative reverse transcriptase-PCR
PCD: Programmed cell death
PSV: Protein storage vesicles
DAI: Days after imbibition
We wish to thank C. Lin and S. Tanksley for generating the Cornell EST clones described here, L. Mueller for bioinformatics and managing the coffee database within SGN, and Jérôme Spiral for providing grain samples for the germination-post germination gene expression analysis. We also thank Thomas Vinos Poyo for generating some of the QPCR data and Vincent Denis for the early work on CcCP1. Finally, we thank Vincent Pétiard and Pierre Broun for supporting the work.
- Rawlings ND, Morton FR: The MEROPS batch BLAST: a tool to detect peptidases and their non-peptidase homologues in a genome. Biochimie. 2008, 90: 243-259. 10.1016/j.biochi.2007.09.014.
- Rawlings ND, Barrett AJ, Bateman A: MEROPS: the peptidase database. Nucleic Acids Res. 2010, 38: D227-D233. 10.1093/nar/gkp971.
- Garcia-Lorenzo M, Sjödin A, Jansson S, Funk C: Protease gene families in Populus and Arabidopsis. BMC Plant Biol. 2006, 6: 1-24. 10.1186/1471-2229-6-1.
- Zhang XM, Wang Y, Lv XM, Li H, Sun P, Lu H, Li FL: NtCP56, a new cysteine protease in Nicotiana tabacum L., involved in pollen grain development. J Exp Bot. 2009, 60: 1569-1577. 10.1093/jxb/erp022.
- Senatore A, Trobacher CP, Greenwood JS: Ricinosomes predict programmed cell death leading to anther dehiscence in tomato. Plant Physiol. 2009, 149: 775-790.
- Tian Q, Olsen L, Sun B, Lid SE, Brown R, Lemmon B, Fosnes K, Gruis D, Opsahl-Sorteberg HG, Otegui M, Olsen OA: Subcellular localization and functional domain studies of DEFECTIVE KERNEL1 in maize and Arabidopsis suggest a model for aleurone cell fate specification involving CRINKLY4 and SUPERNUMERARY ALEURONE LAYER1. Plant Cell. 2007, 19: 3127-3145. 10.1105/tpc.106.048868.
- Johnson KL, Faulkner C, Jeffree CE, Ingram GC: The phytocalpain defective kernel 1 is a novel Arabidopsis growth regulator whose activity is regulated by proteolytic processing. Plant Cell. 2008, 20: 2619-2630. 10.1105/tpc.108.059964.
- Chichkova N, Kim S, Titova E, Kalkum M, Morozov V, Rubtsov Y, Kalinina N, Taliansky M, Vartapetian A: A plant caspase-like protease activated during the hypersensitive response. Plant Cell. 2004, 16: 157-171. 10.1105/tpc.017889.
- Bernoux M, Timmers T, Jauneau A, Brière C, de Wit P, Marco Y, Deslandes L: RD19, an Arabidopsis cysteine protease required for RRS1-R-mediated resistance, is relocalized to the nucleus by the Ralstonia solanacearum PopP2 effector. Plant Cell. 2008, 20: 2252-2264. 10.1105/tpc.108.058685.
- Bozhkov P, Suarez M, Filonova L, Daniel G, Zamyatnin A, Rodriguez-Nieto S, Zhivotovsky B, Smertenko A: Cysteine protease mcII-Pa executes programmed cell death during plant embryogenesis. Proc Natl Acad Sci USA. 2005, 102: 14463-14468. 10.1073/pnas.0506948102.
- Nakaune S, Yamada K, Kondo M, Kato I, Tabata S, Nishimura M, Hara-Nishimura I: A vacuolar processing enzyme, delta-VPE, is involved in seed coat formation at the early stage of seed development. Plant Cell. 2005, 17: 876-887. 10.1105/tpc.104.026872.
- Gruis F, Schulze J, Jung R: Storage protein accumulation in the absence of the vacuolar processing enzyme family of cysteine proteinases. Plant Cell. 2004, 16: 270-290. 10.1105/tpc.016378.
- Sreenivasulu N, Usadel B, Winter A, Radchuk V, Scholz U, Stein N, Weschke W, Strickert M, Close TJ, Stitt M, Graner A, Wobus U: Barley grain maturation and germination: metabolic pathway and regulatory network commonalities and differences highlighted by new MapMan/PageMan profiling tools. Plant Physiol. 2008, 146: 1738-1758. 10.1104/pp.107.111781.
- Martinez M, Cambra I, Carrillo L, Diaz-Mendoza M, Diaz I: Characterization of the entire cystatin gene family in barley and their target cathepsin L-like cysteine-proteases, partners in the hordein mobilization during seed germination. Plant Physiol. 2009, 151: 1531-1545. 10.1104/pp.109.146019.
- Wang J, Li Y, Lo S, Hillmer S, Sun S, Robinson D, Jiang L: Protein mobilization in germinating mung bean seeds involves vacuolar sorting receptors and multivesicular bodies. Plant Physiol. 2007, 143: 1628-1639. 10.1104/pp.107.096263.
- Arai S, Matsumoto I, Emori Y, Abe K: Plant seed cystatins and their target enzymes of endogenous and exogenous origin. J Agric Food Chem. 2002, 50: 6612-6617. 10.1021/jf0201935.
- Martinez M, Diaz I: The origin and evolution of plant cystatins and their target cysteine proteinases indicate a complex functional relationship. BMC Evol Biol. 2008, 8: 198-210. 10.1186/1471-2148-8-198.
- Schluter U, Benchabane M, Munger A, Kiggundu A, Vorster J, Goulet MC, Cloutier C, Michaud D: Recombinant protease inhibitors for herbivore pest control: a multitrophic perspective. J Exp Bot. 2010, 61: 4169-4183. 10.1093/jxb/erq166.
- Benchabane M, Schluter U, Vorster J, Goulet MC, Michaud D: Plant cystatins. Biochimie. 2010, 92: 1657-1666. 10.1016/j.biochi.2010.06.006.
- Urwin P, Green J, Atkinson H: Expression of a plant cystatin confers partial resistance to Globodera, full resistance is achieved by pyramiding a cystatin with natural resistance. Mol Breed. 2003, 12: 263-269. 10.1023/A:1026352620308.
- Abdeen A, Virgos A, Olivella E, Villanueva J, Aviles X, Gabarra R, Prat S: Multiple insect resistance in transgenic tomato plants over-expressing two families of plant proteinase inhibitors. Plant Mol Biol. 2005, 57: 189-202. 10.1007/s11103-004-6959-9.
- Martinez M, Abraham Z, Gambardella M, Echaide M, Carbonero P, Diaz I: The strawberry gene Cyf1 encodes a phytocystatin with antifungal properties. J Exp Bot. 2005, 56: 1821-1829. 10.1093/jxb/eri172.
- Ho CT, Hwang HI, Yu TH, Zhang J: An overview of the Maillard reactions related to aroma generation in coffee. Proceedings of the 15th International Conference on Coffee Science, ASIC, Montpellier, France. 1993, 519-527.
- Yeretzian C, Jordan A, Badoud R, Lindinger W: From the green bean to the cup of coffee: investigating coffee roasting by on-line monitoring of volatiles. Eur Food Res Technol. 2002, 214: 92-104. 10.1007/s00217-001-0424-7.
- Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA. 1977, 74: 5463-5467. 10.1073/pnas.74.12.5463.
- Simkin AJ, Qian T, Caillet V, Michoux F, Ben Amor M, Lin C, Tanksley S, McCarthy J: Oleosin gene family of Coffea canephora: quantitative expression analysis of five oleosin genes in developing and germinating coffee grain. J Plant Physiol. 2006, 163: 691-708. 10.1016/j.jplph.2005.11.008.
- Troen BR, Ascherman D, Atlas D, Gottesman MM: Cloning and expression of the gene for the major excreted protein of transformed mouse fibroblasts. A secreted lysosomal protease regulated by transformation. J Biol Chem. 1988, 263: 254-261.
- Lin C, Mueller LA, McCarthy J, Crouzillat D, Petiard V, Tanksley SD: Coffee and tomato share common gene repertoires as revealed by deep sequencing of seed and cherry transcripts. Theor Appl Genet. 2005, 112: 114-130. 10.1007/s00122-005-0112-2.
- Muntz K, Belozersky MA, Dunaevsky YE, Schlereth A, Tiedemann J: Stored proteinases and the initiation of storage protein mobilization in seeds during germination and seedling growth. J Exp Bot. 2001, 52: 1741-1752. 10.1093/jexbot/52.362.1741.
- Doi-Kawano K, Kouzuma Y, Yamasaki N, Kimura M: Molecular cloning, functional expression, and mutagenesis of cDNA encoding a cysteine proteinase inhibitor from sunflower seeds. J Biochem. 1998, 124: 911-916.
- Sugawara H, Shibuya K, Yoshioka T, Hashiba T, Satoh S: Is a cysteine proteinase inhibitor involved in the regulation of petal wilting in senescing carnation (Dianthus caryophyllus L.) flowers?. J Exp Bot. 2002, 53: 407-413. 10.1093/jexbot/53.368.407.
- Marchler-Bauer A, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, He S, Hurwitz DI, Jackson JD, Ke Z, Lanczycki CJ, Liebert CA, Liu C, Lu F, Lu S, Marchler GH, Mullokandov M, Song JS, Tasneem A, Thanki N, Yamashita RA, Zhang D, Zhang N, Bryant SH: CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res. 2009, 37: D205-D210. 10.1093/nar/gkn845.
- Hruz T, Laule O, Szabo G, Wessendorp F, Bleuler S, Oertle L, Widmayer P, Gruissem W, Zimmermann P: Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes. Adv Bioinformatics. 2008, 2008: 420747-420751.
- Gietl C, Schmid M: Ricinosomes: an organelle for developmentally regulated programmed cell death in senescing plant tissues. Naturwissenschaften. 2001, 88: 49-58. 10.1007/s001140000203.
- Schmid M, Simpson DJ, Sarioglu H, Lottspeich F, Gietl C: The ricinosomes of senescing plant tissue bud from the endoplasmic reticulum. Proc Natl Acad Sci USA. 2001, 98: 5353-5358. 10.1073/pnas.061038298.
- Okamoto T, Shimada T, Hara-Nishimura I, Nishimura M, Minamikawa T: C-terminal KDEL sequence of a KDEL-tailed cysteine proteinase (sulfhydryl-endopeptidase) is involved in formation of KDEL vesicle and in efficient vacuolar transport of sulfhydryl-endopeptidase. Plant Physiol. 2003, 132: 1892-1900. 10.1104/pp.103.021147.
- Denecke J, De Rycke R, Botterman J: Plant and mammalian sorting signals for protein retention in the endoplasmic reticulum contain a conserved epitope. EMBO J. 1992, 11: 2345-2355.
- Becker C, Senyuk VI, Shutov AD, Nong VH, Fischer J, Horstmann C, Muntz K: Proteinase A, a storage-globulin-degrading endopeptidase of vetch (Vicia sativa L.) seeds, is not involved in early steps of storage-protein mobilization. Eur J Biochem. 1997, 248: 304-312. 10.1111/j.1432-1033.1997.00304.x.
- Ludwig E, Lipke U, Raczek U, Jager A: Investigations of peptides and proteases in green coffee beans. Eur Food Res Technol. 2000, 211: 111-116. 10.1007/PL00005518.