Alignment of C. canephora CP1 and CP4 sequences with highly homologous plant sequences. The alignments were done using CLUSTAL W. The key conserved amino acid characteristics and motifs are noted. Amino acids shaded in grey indicate the most conserved sequences. Panel A: Database accession numbers are Coffea canephora CP1 (GenBank sequence #AEQ54770), Arabidopsis thaliana AtCP (AAL49820) and Vicia sativa VsCPR4 (CAB16316). The catalytic triad residues (Cys, His, and Asn) and the Gln active site residues are each indicated by an asterisk. The GCXGG motif is double underlined. The cathepsin propeptide inhibitor domain (I29) is shown in a rectangular box. Note: an ERFNIN-like sequence, shown in italics, exists in the propeptide region of CP1 (i.e., ERFNAQ). Panel B: Database accession numbers are Coffea canephora CP4-KDDL (GenBank sequence #AEQ54771), Nicotiana tabacum NtCP56-KDEL (ACB70409) and Solanum lycopersicum SlCysEP-KDEL (ABV22590). Symbols for key amino acids and motifs are as above; in addition, the ERFNIN motif is indicated in italics and the KDEL motif (K358-L361 for NtCP56) is shown in a double-lined rectangular box. Arrows indicate the site of auto-hydrolysis of NtCP56 reported by Zhang et al. (2009).