- Research article
- Open Access
Protease gene families in Populus and Arabidopsis
© García-Lorenzo et al; licensee BioMed Central Ltd. 2006
- Received: 14 June 2006
- Accepted: 20 December 2006
- Published: 20 December 2006
Proteases play key roles in plants, maintaining strict protein quality control and degrading specific sets of proteins in response to diverse environmental and developmental stimuli. Similarities and differences between the proteases expressed in different species may give valuable insights into their physiological roles and evolution.
We have performed a comparative analysis of protease genes in the two sequenced dicot genomes, Arabidopsis thaliana and Populus trichocarpa, by using genes coding for proteases in the MEROPS database for Arabidopsis to identify homologous sequences in Populus. A multigene-based phylogenetic analysis was performed. Most protease families were found to be larger in Populus than in Arabidopsis, reflecting recent genome duplication. Detailed studies on, for example, the DegP, Clp, FtsH, Lon, rhomboid and papain-like protease families showed that the pattern of gene family expansion and gene loss was complex. Finally, we show that different Populus tissues express unique suites of protease genes and that the mRNA levels of different classes of proteases change along a developmental gradient.
Recent gene family expansion and contractions have made the Arabidopsis and Populus complements of proteases different and this, together with expression patterns, gives indications about the roles of the individual gene products or groups of proteases.
- Leaf Senescence
- Tension Wood
- Protease Gene
- Protease Family
- Putative Protease
Proteolysis is a poorly understood aspect of plant molecular biology. Although proteases play crucial roles in many important processes in plant cells, e.g. responses to changes in environmental conditions, senescence and cell death, very little information is available on the substrate specificity and physiological roles of the various plant proteases. Even for the most abundant plant protein, ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco), neither the proteases involved in its degradation nor the cellular location of the process are known. In the Arabidopsis thaliana (hereafter Arabidopsis) genome, many genes with sequence similarities to known proteases have been identified; the MEROPS database (release 7.30) of Arabidopsis proteases contains 676 entries, corresponding to almost 3 % of the proteome. However, protease activity has only been demonstrated for a few of the entries. Most of these putative proteases are found in extended gene families and are likely to have overlapping functions, complicating attempts to dissect the roles of the different proteases in plant metabolism and development.
One scenario in which proteases play a very important role is senescence, although it is still debated whether they actually cause senescence or are involved purely in resource mobilization.
Senescence is the final stage of plant development and can be induced by a number of both external and internal factors such as age, prolonged darkness, plant hormones, biotic or abiotic stress and seasonal responses. An important function of senescence is to reallocate nutrients, nitrogen in particular, to other parts of the plant before the specific structure is degraded. The understanding of senescence is very important for biomass production. To understand more about the role of proteases during senescence, in this study we compare the nuclear genomes of Arabidopsis thaliana and Populus trichocarpa. The close relationship of these two species in the plant kingdom allows a direct comparison of an annual plant with a tree that has to cope with highly variable adaptations during its long life span. Recent research has shown that leaf senescence affects the chloroplast much earlier than the mitochondria or other compartments of the cell. We therefore chose to focus on protease families that have members in the chloroplast, as well as on the papain protease family, which consists of proteases that are well known to be involved in senescence.
In the chloroplast at least 11 different protease families are represented; however, several of them work as processing peptidases. Only 6 families possess members that are known to be involved in degradation: four of these families belong to the class of serine proteases and two are metalloproteases. The Deg proteases form one family (S1, chymotrypsin family) inside the serine clade and the ATP-dependent Clp proteases are grouped in the S14 family. The S16 family contains the so-called Lon proteases. Metalloproteases (MPs) are proteases with a divalent cation cofactor that binds to the active site; most commonly Zn2+ is ligated to two histidines in the sequence HEXXH. However, Zn2+ can be replaced by Co2+, Mn2+ or even Mg2+. The M41 family is the group of FtsH proteases and the EGY (ethylene-dependent gravitropism-deficient and yellow-green) proteases belong to the family of S2P proteases (M50).
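As an illustration of how the HEXXH zinc-binding motif mentioned above can be located in a protein sequence, the following minimal Python sketch scans a sequence with a regular expression. The function name `find_hexxh` and the two toy sequences are hypothetical and are not taken from this study; a real motif search would normally rely on curated domain models rather than a bare pattern.

```python
import re

# His-Glu-X-X-His, where X is any residue. The lookahead form reports
# overlapping occurrences as well as non-overlapping ones.
HEXXH = re.compile(r"(?=HE..H)")

def find_hexxh(sequence):
    """Return the 1-based start positions of HEXXH in a protein sequence."""
    seq = sequence.upper()
    return [match.start() + 1 for match in HEXXH.finditer(seq)]

# Hypothetical toy sequences, for illustration only.
with_motif = "MKTAVHEAGHALVGA"
without_motif = "MKTAVQEAGQALVGA"

print(find_hexxh(with_motif))     # [6]
print(find_hexxh(without_motif))  # []
```

A sequence returning an empty list would, under this simple criterion, be a candidate for a family member that has lost the zinc-binding site.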
Comparative genomics analyses could provide valuable insights into the conservation, evolution, abundance and roles of the various plant protease families. For instance, such analyses should facilitate the detection of protein sequences that are conserved in different species, and thus are likely to have common functions in them, and recent expansions of gene families, which should help elucidate issues concerning non-functionalization, neofunctionalization and subfunctionalization. Thus, as reported here, we undertook a comparative analysis of protease gene families in the two sequenced dicot genomes, those of the annual plant Arabidopsis and the tree Populus trichocarpa (hereafter Populus), with special emphasis on proteases which may play a role in senescence. The results should help to provide a framework for further elucidation of the nature and roles of these complex gene families.
Most protease gene families are larger in Populus than in Arabidopsis
We made an analysis of all protease genes of Arabidopsis and Populus. As noted above, conservation of a protein sequence in these two species indicates that it is likely to have a common function in them. Recent expansions of gene families, on the other hand, could provide indications of different adaptive requirements (and, possibly, of more general differences between annual plants and trees).
Comparison of numbers of protease genes in Arabidopsis and Populus. Families highlighted in bold are those that have been examined in most depth in this study.
Number of Genes in Arabidopsis
Number of Genes in Populus
Peptidase family T2
ubiquitin C-terminal hydrolase family
pyroglutamyl peptidase I family
ubiquitin-specific protease family
gamma-glutamyl hydrolase family
Peptidase family C44
Aut2 peptidase family
PfpI endopeptidase family
Peptidase family C65
Chymotrypsin family (Deg)
Prolyl oligopeptidase family
Peptidase family S10
D-Ala-D-Ala carboxypeptidase B family
ClpP endopeptidase family
Lon protease family
Signal peptidase I family
Peptidase family S28
Peptidase family S33
C-terminal processing peptidase family
protease IV family (SppA)
Peptidase family S59
Peptidase family M1
Peptidase family M3
Peptidase family M10
carboxypeptidase A family
leucyl aminopeptidase family
Peptidase family M20
Peptidase family M22
Peptidase family M24
Aminopeptidase Y family
Beta-aspartyl dipeptidase family
FtsH endopeptidase family
Ste24 endopeptidase family
S2P protease family
Peptidase family M67
Copia transposon endopeptidase family
To confirm the findings described above, case studies were performed in more detail, focusing on proteases that are known to be present in the plant plastids and mitochondria, partly because we have a special interest in organellar biology and partly because these proteases generally belong to the best characterized plant protease families. The "organellar protease subfamilies" chosen for detailed comparisons were: the Deg/HtrA family (chymotrypsin family, S1), Lon protease family (S16), rhomboid protease family (S54) and the Clp endopeptidase family (S14), all belonging to the serine-type class, and the metallo-type FtsH endopeptidase family (M41). In addition, we examined the papain-like cysteine protease family (C1), as certain members are known to play an important role in leaf development, forming part of the machinery that the leaf needs to respond to different kinds of stress or to undergo senescence.
The FtsH protease family
FtsHs are ATP-dependent proteases that, based on X-ray crystallographic analysis, form a homo-oligomeric hexameric ring. E. coli FtsH has two transmembrane domains towards the N-terminus that anchor it in the plasma membrane, while the protease domain and the C-terminus face the cytoplasm. Four isomers of FtsH have been identified in Synechocystis sp. PCC 6803 and 12 in Arabidopsis. Of the nine FtsHs that reside in the chloroplast, five have been shown to be involved in the degradation of photosynthetic proteins during light acclimation [12, 13] or after high light damage [14–17].
In Arabidopsis the FtsH family is encoded by 16 homologous sequences. Four of these sequences lack the Zn-binding motif and are therefore thought to have lost proteolytic activity. However, they might be involved in chaperone functions instead. In this work we focused on the 12 presumably active proteases. FtsH proteases are thought to be membrane integral, as has been shown experimentally for FtsH1. This protease is inserted into the thylakoid membrane with the Zn-binding and ATPase motifs facing the stroma. Gene comparison studies showed that of the 12 ftsH genes potentially coding for fully functional proteases, 10 are found in highly homologous pairs. While the pairs AtFtsH1/5, AtFtsH2/8 and AtFtsH7/9 are targeted to the chloroplast, AtFtsH3/10 and AtFtsH4 have been identified in mitochondria [18, 19]. AtFtsH11, which contains only one transmembrane domain, was recently suggested to be located in both chloroplasts and mitochondria [19, 20]. AtFtsH12 and AtFtsH6, both localized in the chloroplast [12, 21], have no pair-partners. The proteins in a pair very likely work in concert and have overlapping functions, as shown for FtsH1/5 and FtsH2/8. These pairs of proteases are the most strongly expressed FtsHs in plants. Deletion mutants of these genes lead to a variegated leaf type; the names Var1 and Var2 were therefore given to them (reviewed by Sakamoto et al.). The only FtsH protein for which a function has been established, apart from these four proteases, is FtsH6.
Arabidopsis (At) and Populus (Pt) FtsH protease gene models (M41 family in MEROPS) corresponding to the names given in the FtsH phylogenetic tree.
Populus Gene model
The Var2 group, represented by AtFtsH2 and AtFtsH8 in Arabidopsis, has the most Populus representatives (PtFtsH2.1, PtFtsH2.2, PtFtsH2.3, PtFtsH2.4 and PtFtsH2.5), all of which are very closely related and appear to have originated from a recent gene family expansion. The Var1 group comprises AtFtsH1, AtFtsH5, PtFtsH1.1 and PtFtsH1.2. A more distant relative of this group is PtFtsH1.3, which has no close Arabidopsis homologue. AtFtsH6 and its Populus ortholog, PtFtsH6, are closely related to the Var1/Var2 groups, and clearly separated from the FtsH4/11, FtsH3/10, FtsH7/9 and FtsH12 groups. Interestingly, while in the pairs FtsH1 and 5, FtsH2 and 8, FtsH3 and 10 and FtsH7 and 9 the duplication of the genes seems to have occurred after the separation of Populus and Arabidopsis, in the pair FtsH4 and FtsH11 each of the Arabidopsis proteases has at least one distinct orthologue in Populus. Here subfunctionalization seems to have occurred, as is evident from the fact that AtFtsH4 is found in mitochondria, while AtFtsH11 can also be located in the chloroplast [19, 20].
Some Deg subfamilies are more expanded in Arabidopsis
The Deg proteases form the first family (S1, chymotrypsin family) inside the serine clade. DegP (or HtrA for high temperature requirement) was the first Deg protease identified in E. coli. As determined from its crystal structure, it functions as a homotrimeric oligomer, the catalytic center consisting of the residues His-Asp-Ser typical for most serine proteases (SPs). HtrA also functions as a chaperone at low temperature. While cyanobacteria, like E. coli, possess 3 members of this family, in the Arabidopsis genome 16 homologues were found. Deg1, 2, 5 and 8 have been identified in the chloroplast [26, 27]. In plants and cyanobacteria the Deg proteases are thought to be involved in cell growth, stress responses, PCD and senescence [28, 29].
The Deg protease family in Arabidopsis consists of 16 proteins that are localized in different cellular compartments and in many cases have unknown functions. AtDeg1, AtDeg2, AtDeg5 and AtDeg8 are the plastidic members of the AtDeg group. AtDeg1, AtDeg5 and AtDeg8 have been localized in the thylakoid lumen of the plant chloroplast [26, 30, 31]. AtDeg2 has been identified at the stromal side of the thylakoid membrane and seems, at least in higher plants, to be responsible for the degradation of the reaction center D1 protein of Photosystem II (PSII).
Arabidopsis (At) and Populus (Pt) Deg protease gene models (S1 family in MEROPS) corresponding to the names given in the Deg phylogenetic tree.
Populus Gene model
The Deg17 group consists exclusively of Populus sequences. These genes code for three proteases that are not closely related to any Arabidopsis protein, but clearly belong to the chymotrypsin family and have a Deg structure, perhaps representing a subfamily that was lost during Arabidopsis evolution (Figure 3).
The Clp family
Clp proteases are multi-subunit enzymes in which the catalytic domain and the ATPase domain are split between different subunits. Structurally they are very similar to the 26S proteasome in eukaryotes, suggesting that these ATP-dependent proteases are evolutionarily related. Proteins in the plant Clp family, consisting of chaperones and proteases involved in the degradation of misfolded proteins, have been grouped in two different subclasses. The proteolytically active protease is designated ClpP, but there are also many genes coding for similar proteins lacking the Ser and His amino acid residues of the catalytic triad, and thus representing an inactive form, named ClpR, with unknown function. The regulating subunits work as chaperones that unfold the targeted proteins for degradation, but may also be involved in protein folding independent of proteolysis. Class I chaperones contain two ATP-binding sites, like the ClpCs and ClpBs, while the class II chaperones contain only one ATP-binding site, like ClpD, ClpF and ClpXs [11, 36]. Crystallisation studies have shown that the protease unit, ClpP, forms a tetradecameric barrel-like structure. On one or both ends, complexes of ATPase subunits, in E. coli either ClpA or ClpX, form homo-hexameric rings. In the absence of ClpP these units can act as chaperones. In chloroplasts, homologues of ClpB and ClpC, but not ClpA, form a complex with ClpP. Chloroplast genomes of algae and higher plants contain a gene potentially encoding ClpP, and only recently ClpP was also discovered in the nuclear genome.
Arabidopsis (At) and Populus (Pt) Clp protease gene models (S14 family in MEROPS) corresponding to the names given in the Clp phylogenetic tree.
Populus Gene model
The ClpR2 sequences from Arabidopsis and Populus are most similar to the ClpP1 proteins, probably representing a successful case of horizontal gene transfer from the chloroplast to the nucleus that happened before the split of the lineages leading to Arabidopsis and Populus. AtClpP1 is encoded in the chloroplast. We found five homologous sequences in the Populus nuclear genome, illustrating the flux of genetic material from the chloroplast to the nuclear genome. However, we did not find signs of expression (i.e. associated ESTs) for any of these putative genes, and some of them also appeared not to code for full-length proteins, suggesting that they represent non-functional DNA inserted into the nuclear genome; they are therefore not considered further here. AtClpP2 has four Populus homologs; most of the remaining catalytic AtClp proteins have two or more orthologs in Populus, but ClpP3, ClpR2 and ClpR4 each have only one.
The lower part of the MPT in Fig. 4 shows the relationships of the regulatory subunits. Ten well-supported subgroups can be identified: the ClpC3, ClpS, ClpD, ClpC1/C2, ClpF, ClpT, ClpX groups, two ClpB groups, and the ClpN57710 group, containing one Arabidopsis and three Populus genes. The separation of the ClpB1-4, ClpC, ClpD and ClpF branches is well supported, with ClpC and ClpF being more closely related to each other than to the other members. The main difference between the ClpD and ClpC groups is that they have specific signature sequences, but they have also been shown to have different expression profiles, ClpDs being specifically expressed in dehydration and senescence [40, 41]. The presence of two different ClpB groups is an interesting feature, which can be explained by the fact that At1g07200 (AtClpB5) is grouped by TAIR as a ClpB-related protein. As the nomenclature for ClpB1-4 has already been established, we decided to name this Arabidopsis/Populus class ClpB5.
Similar to the situation in the other protease families, many Arabidopsis Clp genes have two close homologs in Populus, but the ClpD and ClpB5 families are more heavily extended in Populus, both having five Populus genes compared to a single Arabidopsis gene. There are two ClpC members in each organism. However, both of the Populus ClpCs seem to be more closely related to AtClpC1 than to AtClpC2. The ClpX group is predicted to be localized in the mitochondrial matrix in Arabidopsis and is formed by three proteases in each organism. AtClpX2 seems to have a clear ortholog in Populus, while the other two Populus ClpX proteases are more closely related to AtClpX1.
Lon proteases (S16 family) are responsible for the degradation of abnormal, damaged and unstable proteins. They have no membrane-spanning domain and contain the AAA (ATPases associated with various cellular activities) and protease domains in one polypeptide. Instead of the Ser-His-Asp of "classical" serine proteases, in Lon proteases the catalytic site is suggested to be formed by a Ser-Lys dyad [45–47]. A crystal structure of Lon in E. coli was determined recently and shown to form a hexameric ring. Lon proteases have been described as mitochondrial proteases. However, recent studies have predicted their presence in chloroplasts and peroxisomes [41, 48] and Lon4 was shown to be targeted to both chloroplasts and mitochondria.
Arabidopsis (At) and Populus (Pt) Lon protease gene models (S16 family in MEROPS) corresponding to the names given in the Lon phylogenetic tree.
Populus Gene model
The rhomboid family (S54) is a relatively poorly investigated family. It has been widely detected in bacteria, archaea and, recently, eukaryotic organisms – initially in Drosophila melanogaster [49, 50], then plants. Rhomboid proteases are membrane proteins with six or seven transmembrane domains that cleave their substrates within the substrate's transmembrane domain. This so-called regulated intramembrane proteolysis (RIP) has been shown to be very important for signal transduction. In recent studies of Arabidopsis rhomboids a catalytic dyad has been suggested to be the active site, formed by Ser-His residues [51, 52]. The overall structure and sequence of the rhomboid proteases, widely conserved throughout all kingdoms, is very different from that of the other serine proteases, suggesting that they have become serine proteases by convergent evolution. Today, 15 members are annotated in Arabidopsis. Another Arabidopsis gene (At5g25640) has high sequence homology to this family, but it is predicted to code for a protein with only two membrane-spanning helices and was therefore not considered in this study. Two rhomboids (AtRbl1 and 2) have been shown to be localized in the Golgi apparatus; the subcellular localization of most of the others is predicted to be in mitochondria. Only AtRbl9 and 10 were predicted to be located in the chloroplast using the programs TargetP and Predotar. However, the Meta Analysis of the Arabidopsis rhomboid genes in Genevestigator suggests that some of them may play important roles in leaf development and senescence.
Arabidopsis (At) and Populus (Pt) rhomboid protease gene models (S54 family in MEROPS) corresponding to the names given in the rhomboid phylogenetic tree.
Populus Gene model
The EGY proteases belong to the family of S2P proteases (M50), which are ATP-independent metalloproteases. EGY1 has recently been characterized as a protease required for chloroplast development. With 8 putative transmembrane domains and the intramembrane Zn2+-binding domain, these proteases might have a similar structure and function to the rhomboids, even though they belong to the class of metalloproteases. The Arabidopsis genome possesses 3 EGYs. EGY1, which has been identified in the chloroplast, has one possible orthologue in Populus. EGY2 shows homology to one closer and one more distant relative in Populus. EGY3 shares less homology with the other two Arabidopsis proteases and also has one orthologue in Populus (not shown).
In animals, the most representative family of this group is the group of caspases (Cys-Asp-specific proteases, family C14), which play an important role in programmed cell death (PCD) and hypersensitive response (HR) controlling the so-called apoptosis cascade. Closely related proteases in plants are the metacaspases (C14), which have been found to be involved in HR and to act through a caspase-like mechanism .
Populus gene models whose ESTs are specific to a unique library and comparative numbers of the corresponding genes in Arabidopsis. Libraries: (I) senescing leaves, (F) flower buds, (T) shoot meristem, (V) male catkins, (AB) cambial zone, (UB) active cambium, (G) tension wood, (X) wood cell death.
Number of Genes in family of Arabidopsis
Number of Genes in family of Populus
Proteasome subunit beta type 2-2
RD21 Papain-Like cysteine protease
Proteasome subunit beta type 2-2
RD21 Papain-Like cysteine protease
20S proteasome alpha subunit F
Proteasome subunit alpha type 6-1
20S proteasome beta subunit.
similar to SAG12
Metallopeptidase M24 family protein
subtilase family protein
Proteasome subunit beta type 3-2
Different Populus tissues express unique repertoires of proteases
The extensive Populus EST resource compiled in PopulusDB allows indications of the expression patterns of Populus genes to be rapidly obtained. Of the 951 genes classified above as putative proteases, 382 had associated ESTs in PopulusDB, suggesting that these genes, at least, are expressed. Since there are correlations, albeit imperfect, between the abundance of ESTs and the levels of corresponding mRNAs and proteins in particular tissues, we wanted to identify the tissues/treatments in which the mRNAs of different types of proteases are most strongly represented. To see if other proteases show similar specificity we examined their digital expression profiles, applying two criteria to reduce the numbers of false positives due to limited information (i.e. the presence of low numbers of ESTs) (Table 7). These criteria were (i) more than four ESTs had to be associated with the candidate gene and (ii) more than twice as many ESTs had to be detected in one library than in any other. Only nineteen genes fulfilled these criteria for specific expression. Interestingly, members of the Deg, FtsH and papain-like protease families were all highly expressed in senescing leaf tissue. In addition to proteases with particularly high EST frequencies in the senescing leaf and wood cell death libraries, we identified proteases that appeared to be highly expressed in flower buds (four), male catkins (two), the cambial zone (two) and the shoot apical meristem, tension wood, roots and dormant cambium (one in each case). Tissue-specific expression may be the result of a subfunctionalization process, stabilizing both copies of a duplicated gene. To assess the likelihood that such a process has occurred in Populus, we sought evidence indicating that unusually high numbers of these genes have undergone recent duplications. We found that the overwhelming majority of the gene families appear to have expanded recently, from one copy in Arabidopsis to two or three copies in Populus.
This is consistent with the hypothesis that subfunctionalization is one of the forces that has maintained the high proportion of duplicated genes in Populus.
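The two criteria for library-specific expression described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `library_specific`, its parameters and the EST counts are hypothetical, and criterion (i) is interpreted here as the total number of ESTs across all libraries, which the text does not state explicitly. The library codes follow the table legend (I, senescing leaves; F, flower buds; G, tension wood; X, wood cell death).

```python
def library_specific(est_counts, min_total=4, fold=2):
    """Return the library a gene appears specific to, or None.

    est_counts maps a library code to the number of ESTs for one gene.
    Criterion (i): more than `min_total` ESTs in total (an assumption
    about how the threshold is counted).
    Criterion (ii): the best library holds more than `fold` times as
    many ESTs as any other library.
    """
    total = sum(est_counts.values())
    if total <= min_total:
        return None
    ranked = sorted(est_counts.items(), key=lambda item: item[1], reverse=True)
    top_library, top_count = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return top_library if top_count > fold * runner_up else None

# Hypothetical EST counts, for illustration only.
print(library_specific({"I": 6, "F": 1, "G": 2}))  # I
print(library_specific({"I": 4, "F": 2}))          # None (4 is not more than 2 x 2)
print(library_specific({"X": 3}))                  # None (too few ESTs)
```

Both thresholds are strict inequalities, matching the wording "more than four" and "more than twice as many".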
Patterns of protease gene expression during Populus leaf development
The genes in cluster 1 are the truly senescence-associated genes. Their mRNA levels did not notably increase until September, but their expression then continued to increase in successive samples, including the last sample from which RNA could be prepared, collected on September 21. This expression pattern was exhibited by genes encoding protease classes C1 (2 genes), C13, C19, M41, M48, S14, S33 (three genes each) and T2 (two genes), i.e. a number of the classes with previously indicated roles during leaf senescence (such as papain-like proteases and FtsH). Cluster 2 had a similar pattern, but the changes were less pronounced, so these genes were only moderately induced during leaf senescence. This cluster contained genes from classes C1, M16, M50, S1, S9 and S14. Cluster 3 consisted of genes that had a fairly stable expression throughout the growing season, but with low mRNA levels during both bud burst and leaf senescence. Pattern 4 was only represented by a S8 (subtilisin) protease gene, which had a pronounced peak during the cell wall biosynthesis phase in the leaf and decreased to low levels in older leaves. Cluster 5 genes were mainly expressed during the first two weeks of leaf development (during the phases mainly characterized by cell division and cell expansion) whereas cluster 6 genes showed the opposite pattern, i.e. they were much more strongly expressed after, rather than during the first two weeks. Cluster 6 was a major cluster, including four genes in the C1 class, seven in the S14 (Clp) class, two in the M1 class, and four other classes. Almost half of the genes coding for proteins in the Clp family appeared to be specifically down regulated when the leaf expanded, suggesting that they have no important function in this stage of leaf development. Clusters 7, 8 and 9 all contain proteases of many different classes, and all showed essentially constitutive expression patterns, except that cluster 7 had lower mRNA levels in the middle of the summer. 
Clusters 10 and 11, containing mainly serine proteases, both showed high mRNA levels in the first week of leaf development, but cluster 10 seemed to be induced later in the season. Almost all proteasome subunits exhibited expression pattern 11, indicating that the proteasome is most important at the very first stages of aspen leaf development from winter buds. Finally, cluster 12 showed high expression levels only in very young leaves and during late stages of senescence. Taken together, these data indicate that there are several "waves" of protease gene expression during leaf development; consistent with the idea that proteases are important during all stages of the lifecycle of the leaf.
We here present a comparative analysis of the gene families coding for putative proteases of Arabidopsis and Populus. The patterns for the copy numbers of most families and subfamilies were quite consistent – the Populus families were generally larger, as an apparent result of the fairly recent genome duplication [4, 5]. Some families were considerably more heavily represented in Populus, but a few were more abundant in Arabidopsis. It seems reasonable to expect, for example, a tree like Populus to show relatively strong retention of families like RD21 and SAG12, which are involved in the response to dehydration and leaf senescence, respectively – traits that would intuitively require more elaborate regulation in a tree than in an annual plant. Surprisingly, however, the RD21 family was one of the few gene families that was larger in Arabidopsis than in Populus. This supports the view that a considerable element of chance has influenced the size of the gene families in Populus, and that stochastic events as well as subfunctionalization and neofunctionalization are important determinants of whether genes are lost or retained in a duplicated genome. Therefore, in most cases, the presence of higher numbers of genes in one plant species than in another cannot be explained simply by their adaptive "needs". However, subfunctionalization and neofunctionalization should not be neglected – in fact, we have shown that they have affected the evolution of the Populus genome, and our analysis of genes with tissue-specific expression patterns supports this notion.
Unfortunately, of the 723 and 955 proteases identified in Arabidopsis and Populus, respectively, the function(s), localization and substrate(s) of most of the proteases remain enigmatic. The Var1/Var2/FtsH6 proteases comprise one of the few protease groups for which mutant phenotypes in Arabidopsis have been carefully examined and placed in a phylogenetic perspective. Their function in photoprotection seems to have evolved at a very early stage, in the cyanobacterial progenitors of modern cyanobacteria, algae and plants. Later, the Var1 and Var2 functions appear to have separated, and there seems to be an overlap in the substrate specificity of the proteases and the phenotypes of the mutants. The var1 and var2 mutants are more sensitive than wild type to PSII photoinhibition [15, 16]. The duplication within each of these groups appears to have happened after the separation of Arabidopsis and Populus (see Fig. 2). However, in the lineage leading to higher plants, FtsH6 evolved within this group through neofunctionalization; this protease degrades the antenna rather than reaction center proteins. A clear ortholog of AtFtsH6 can also be found in Populus. Based on this very limited information we raise the following hypothesis. If there is a one-to-one relationship between the Populus and Arabidopsis sequences, we assume that these genes are functional orthologs, i.e. they degrade the same substrate(s) under the same conditions. However, if the gene duplication happened after the split between the Arabidopsis and Populus lineages, neofunctionalization has probably not yet occurred, so the functions of these proteases are overlapping. Experiments to verify this hypothesis are in progress.
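The working hypothesis above can be illustrated with a crude copy-number proxy. The Python sketch below labels a well-supported clade only by how many Arabidopsis and Populus genes it contains; it is an assumption-laden simplification, since the timing of a duplication can only be inferred from the tree itself, and the function name `classify_clade` and the label strings are hypothetical. The example counts are those given in the text for the FtsH6, Var2, ClpD and Deg17 groups.

```python
def classify_clade(n_arabidopsis, n_populus):
    """Label a clade by its Arabidopsis and Populus copy numbers.

    This is only a proxy: it does not inspect tree topology, so it
    cannot show when a duplication occurred.
    """
    if n_arabidopsis < 0 or n_populus < 0 or n_arabidopsis + n_populus == 0:
        raise ValueError("a clade needs at least one gene")
    if n_arabidopsis == 1 and n_populus == 1:
        # Candidate functional orthologs under the hypothesis.
        return "one-to-one"
    if n_arabidopsis == 0 or n_populus == 0:
        # Present in one species only, e.g. after loss in the other.
        return "lineage-specific"
    # Several copies in at least one species: possibly overlapping functions.
    return "expanded"

# Copy numbers as stated in the text.
print(classify_clade(1, 1))  # FtsH6 group -> one-to-one
print(classify_clade(2, 5))  # Var2 group  -> expanded
print(classify_clade(1, 5))  # ClpD group  -> expanded
print(classify_clade(0, 3))  # Deg17 group -> lineage-specific
```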
Our analysis shows that different tissues express fairly unique sets of genes putatively coding for proteases. Furthermore, in the developmental gradient from bud burst to leaf senescence different waves of protease gene expression occur. However, expression analysis does not always give clear evidence of function. For example, AtFtsH6 has been shown to degrade LHCII only during high light acclimation and senescence; although this protease is essentially constitutively expressed in leaves, its proteolytic activity is regulated by the availability of the substrate. Forward or reverse genetics will be needed to obtain clear information on the involvement of various proteases in different biological processes. To make reverse genetics efficient, however, comparative genomics data such as those presented in this paper can facilitate selection of the best candidates. A simple comparative analysis can also provide explanations for experimental data. Since the AtFtsH1/FtsH5 and AtFtsH2/FtsH8 pairs separated after the split of the lineages leading to Populus and Arabidopsis, it is not surprising that the members of each pair have overlapping and partially redundant functions. This means that mutant analysis, either by forward or reverse genetics, will not always provide clear answers; in many cases, biochemical analysis of protease substrate specificities will probably be needed to assign functions to the individual members of the large protease gene families.
In summary, we have identified 951 genes in the Populus genome potentially coding for proteases and comparatively analyzed the protease composition of Populus and Arabidopsis.
The databases searched for annotated proteases were TAIR (The Arabidopsis Information Resource) and TrEMBL (a computer-annotated supplement to Swiss-Prot). The data were grouped according to the MEROPS protease database families.
In addition, a blastp search was used to collect the Populus gene models that were not clustered with any of the Arabidopsis genes. To confirm that these new gene models from Populus corresponded to protease genes, a protease-motif search was made in SMART 4.0 and InterProScan. Protein sequences that did not have a typical protease family motif were discarded.
Protein alignment and phylogenetic trees
Protein alignment was performed with ClustalX 1.81. Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 2.1. The FtsH, Deg, Lon and rhomboid trees were derived using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) with 1000 bootstrap replicates. The trees for the Clp and papain-like proteases are maximum parsimony trees (MPT) with 1000 bootstrap replicates.
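For readers unfamiliar with UPGMA, the clustering step can be illustrated with a minimal sketch. This is not the MEGA implementation and it omits bootstrapping; the function name, the labels and the toy distance matrix are assumptions chosen only to show how the closest clusters are joined and how distances to a merged cluster are averaged.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Build a UPGMA tree from a symmetric distance matrix.

    Returns nested tuples of the form (left, right, height); leaves are
    the label strings. Ties are broken by dictionary insertion order.
    """
    # Active clusters: id -> (subtree, number of leaves in the subtree).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    dist = {frozenset((i, j)): matrix[i][j]
            for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    while len(clusters) > 1:
        # Join the closest pair of active clusters.
        pair = min(dist, key=dist.get)
        i, j = sorted(pair)
        height = dist[pair] / 2
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance to the merged cluster is the size-weighted mean, which
        # equals the arithmetic mean over all leaf pairs.
        for k in clusters:
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = (
                (n_i * d_ik + n_j * d_jk) / (n_i + n_j))
        del dist[pair]
        clusters[next_id] = ((tree_i, tree_j, height), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Toy distances between four hypothetical sequences A-D.
toy_labels = ["A", "B", "C", "D"]
toy_matrix = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
toy_tree = upgma(toy_labels, toy_matrix)
print(toy_tree)  # ('D', ('C', ('A', 'B', 1.0), 3.0), 5.0)
```

UPGMA assumes a constant rate of change along all branches, which is one reason why comparing its output with trees from a second method, as done here with maximum parsimony, is useful.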
All families were analysed with both algorithms, and with several different gap penalties. The choice of trees to display was driven by a desire to keep known or suspected orthologous gene clusters in the same branch of the tree, and to produce figures with size and shape suitable for printing. Trees produced with other algorithms and settings are available on request.
The Arabidopsis nomenclature used in this article follows that proposed by Adam et al. and further developed by Sokolenko et al. As in this nomenclature, protein names were given for Populus proteases according to their clustering or proximity in the tree, allowing an intuitive association between the Populus proteins and the closest Arabidopsis proteins. We have organized the proteins into groups based on their sequence homology in order to facilitate the new nomenclature proposed for Populus proteases.
For the rhomboid proteases in Arabidopsis, we followed the nomenclature initiated by Kanaoka et al., naming the closest to DmRho-1 (the first rhomboid protease described from Drosophila melanogaster) AtRbl1. Since the previously named AtKOM is the 8th member of the family in Kanaoka's article we continued at AtRbl9; higher numbers indicate increasingly distant relationships to DmRho-1.
Digital expression profiles were obtained from PopulusDB and analysed in UPSC-BASE. The similarity between the expression profiles of gene models (rows) or cDNA libraries (columns) was estimated according to Ewing et al., with some modifications. Briefly, similarity between gene model or cDNA library expression profiles was estimated by Pearson's correlation coefficient. From the gene model correlations a pairwise Manhattan distance matrix was calculated, and the dendrogram was created with the average agglomeration method. The order of gene models and libraries in their respective dendrograms was used to reorder the original data table. All calculations and plotting were done in the programming language R.
DNA microarray data from Andersson et al. and Sjödin et al. (submitted) were merged and processed in UPSC-BASE according to the default analysis pipeline. The normalised data were hierarchically clustered with Euclidean distance and average linkage in the TIGR MultiExperiment Viewer (MeV). The dataset was divided into 12 clusters (see Additional file 1) and the average log ratio for each cluster was plotted.
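The two-step similarity measure described above can be sketched as follows. The original calculations were done in R; this Python version is an illustration only, and the function names and the toy EST counts are assumptions. Each profile is first correlated with every other profile, and the rows of the resulting correlation matrix are then compared by Manhattan distance.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    A constant profile has zero variance and would raise
    ZeroDivisionError; such rows would need filtering beforehand.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(profiles):
    """All-against-all Pearson correlations between profiles."""
    return [[pearson(p, q) for q in profiles] for p in profiles]


def manhattan_matrix(rows):
    """Pairwise Manhattan (city-block) distances between rows."""
    return [[sum(abs(a - b) for a, b in zip(r, s)) for s in rows]
            for r in rows]


# Toy EST counts for three hypothetical gene models in four libraries.
toy_profiles = [
    [1, 2, 3, 4],   # gene model 1
    [2, 4, 6, 8],   # gene model 2, same shape as gene model 1
    [4, 3, 2, 1],   # gene model 3, opposite trend
]
toy_corr = correlation_matrix(toy_profiles)
toy_dist = manhattan_matrix(toy_corr)
for row in toy_dist:
    print([round(value, 3) for value in row])
```

In this toy example the first two gene models end up at a distance of approximately zero because they correlate identically with every profile, while the third is far from both. The distance matrix would then be passed to an average-linkage clustering to obtain the dendrogram order.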
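The last step, averaging the log ratios within each cluster, is simple enough to show as a minimal sketch. The clustering itself was done in MeV and is taken as given here; the function name, gene identifiers, cluster labels and values are hypothetical.

```python
from collections import defaultdict


def cluster_means(log_ratios, assignment):
    """Average log ratio per cluster at each sampling point.

    `log_ratios` maps gene -> list of log ratios (one per sampling point);
    `assignment` maps gene -> cluster label.
    """
    groups = defaultdict(list)
    for gene, profile in log_ratios.items():
        groups[assignment[gene]].append(profile)
    # zip(*profiles) regroups the values by sampling point.
    return {
        cluster: [sum(column) / len(column) for column in zip(*profiles)]
        for cluster, profiles in groups.items()
    }


# Hypothetical log ratios at two sampling points for three genes.
toy_log_ratios = {
    "gene_a": [1.0, 2.0],
    "gene_b": [3.0, 4.0],
    "gene_c": [-1.0, 0.0],
}
toy_assignment = {"gene_a": 1, "gene_b": 1, "gene_c": 2}
toy_means = cluster_means(toy_log_ratios, toy_assignment)
print(toy_means)  # {1: [2.0, 3.0], 2: [-1.0, 0.0]}
```

Plotting one averaged profile per cluster, rather than every gene, is what makes successive waves of expression along the developmental gradient easy to compare.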
Financial sources: The Swedish Foundation for Strategic Research, the Swedish Research Council and the Carl Tryggers Foundation
- Rawlings ND, Morton FR, Barrett AJ: MEROPS: the peptidase database. Nucleic Acids Res. 2006, 34: D270-2. 10.1093/nar/gkj089.
- Brunner AM, Busov VB, Strauss SH: Poplar genome sequence: functional genomics in an ecologically dominant plant species. Trends Plant Sci. 2004, 9: 49-56. 10.1016/j.tplants.2003.11.006.
- Hortensteiner S, Feller U: Nitrogen metabolism and remobilization during senescence. J Exp Bot. 2002, 53: 927-937. 10.1093/jexbot/53.370.927.
- Sterck L, Rombauts S, Jansson S, Sterky F, Rouze P, Van de Peer Y: EST data suggest that poplar is an ancient polyploid. New Phytol. 2005, 167: 165-170. 10.1111/j.1469-8137.2005.01378.x.
- Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, et al: The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006, 313: 1596-604.
- Lescot M, Rombauts S, Zhang J, Aubourg S, Mathe C, Jansson S, Rouze P, Boerjan W: Annotation of a 95-kb Populus deltoides genomic sequence reveals a disease resistance gene cluster and novel class I and class II transposable elements. Theor Appl Genet. 2004, 109: 10-22. 10.1007/s00122-004-1621-0.
- Kurepa J, Walker JM, Smalle J, Gosink MM, Davis SJ, Durham TL, Sung DY, Vierstra RD: The small ubiquitin-like modifier (SUMO) protein modification system in Arabidopsis. Accumulation of SUMO1 and -2 conjugates is increased by stress. J Biol Chem. 2003, 278: 6862-6872. 10.1074/jbc.M209694200.
- Murtas G, Reeves PH, Fu YF, Bancroft I, Dean C, Coupland G: A nuclear protease required for flowering-time regulation in Arabidopsis reduces the abundance of SMALL UBIQUITIN-RELATED MODIFIER conjugates. Plant Cell. 2003, 15: 2308-2319. 10.1105/tpc.015487.
- Krzywda S, Brzozowski AM, Verma C, Karata K, Ogura T, Wilkinson AJ: The crystal structure of the AAA domain of the ATP-dependent protease FtsH of Escherichia coli at 1.5 A resolution. Structure. 2002, 10: 1073-1083. 10.1016/S0969-2126(02)00806-7.
- Ito K, Akiyama Y: Cellular functions, mechanism of action, and regulation of FtsH protease. Annu Rev Microbiol. 2005, 59: 211-231. 10.1146/annurev.micro.59.030804.121316.
- Sokolenko A, Pojidaeva E, Zinchenko V, Panichkin V, Glaser VM, Herrmann RG, Shestakov SV: The gene complement for proteolysis in the cyanobacterium Synechocystis sp. PCC 6803 and Arabidopsis thaliana chloroplasts. Curr Genet. 2002, 41: 291-310. 10.1007/s00294-002-0309-8.
- Ostersetzer O, Adam Z: Light-stimulated degradation of an unassembled Rieske FeS protein by a thylakoid-bound protease: the possible role of the FtsH protease. Plant Cell. 1997, 9: 957-965. 10.1105/tpc.9.6.957.
- Zelisko A, Garcia-Lorenzo M, Jackowski G, Jansson S, Funk C: AtFtsH6 is involved in the degradation of the light-harvesting complex II during high-light acclimation and senescence. Proc Natl Acad Sci U S A. 2005, 102: 13699-13704. 10.1073/pnas.0503472102.
- Lindahl M, Spetea C, Hundal T, Oppenheim AB, Adam Z, Andersson B: The thylakoid FtsH protease plays a role in the light-induced turnover of the photosystem II D1 protein. Plant Cell. 2000, 12: 419-431. 10.1105/tpc.12.3.419.
- Bailey S, Thompson E, Nixon PJ, Horton P, Mullineaux CW, Robinson C, Mann NH: A critical role for the Var2 FtsH homologue of Arabidopsis thaliana in the photosystem II repair cycle in vivo. J Biol Chem. 2002, 277: 2006-2011. 10.1074/jbc.M105878200.
- Sakamoto W, Tamura T, Hanba-Tomita Y, Murata M: The VAR1 locus of Arabidopsis encodes a chloroplastic FtsH and is responsible for leaf variegation in the mutant alleles. Genes to Cells. 2002, 7: 769-780. 10.1046/j.1365-2443.2002.00558.x.
- Silva P, Thompson E, Bailey S, Kruse O, Mullineaux CW, Robinson C, Mann NH, Nixon PJ: FtsH is involved in the early stages of repair of photosystem II in Synechocystis sp PCC 6803. Plant Cell. 2003, 15: 2152-2164. 10.1105/tpc.012609.
- Leonhard K, Herrmann JM, Stuart RA, Mannhaupt G, Neupert W, Langer T: AAA proteases with catalytic sites on opposite membrane surfaces comprise a proteolytic system for the ATP-dependent degradation of inner membrane proteins in mitochondria. Embo J. 1996, 15: 4218-4229.
- Heazlewood JL, Tonti-Filippini JS, Gout AM, Day DA, Whelan J, Millar AH: Experimental analysis of the Arabidopsis mitochondrial proteome highlights signaling and regulatory components, provides assessment of targeting prediction programs, and indicates plant-specific mitochondrial proteins. Plant Cell. 2004, 16: 241-256. 10.1105/tpc.016055.
- Urantowka A, Knorpp C, Olczak T, Kolodziejczak M, Janska H: Plant mitochondria contain at least two i-AAA-like complexes. Plant Mol Biol. 2005, 59: 239-252. 10.1007/s11103-005-8766-3.
- Sakamoto W, Zaltsman A, Adam Z, Takahashi Y: Coordinated regulation and complex formation of yellow variegated1 and yellow variegated2, chloroplastic FtsH metalloproteases involved in the repair cycle of photosystem II in Arabidopsis thylakoid membranes. Plant Cell. 2003, 15: 2843-2855. 10.1105/tpc.017319.
- Zaltsman A, Ori N, Adam Z: Two Types of FtsH Protease Subunits Are Required for Chloroplast Biogenesis and Photosystem II Repair in Arabidopsis. Plant Cell. 2005, 17: 2782-2790. 10.1105/tpc.105.035071.
- Lipinska B, Fayet O, Baird L, Georgopoulos C: Identification, characterization, and mapping of the Escherichia coli htrA gene, whose product is essential for bacterial growth only at elevated temperatures. J Bacteriol. 1989, 171: 1574-1584.
- Clausen T, Southan C, Ehrmann M: The HtrA family of proteases: Implications for protein composition and cell fate. Molecular Cell. 2002, 10: 443-455. 10.1016/S1097-2765(02)00658-5.
- Spiess C, Beil A, Ehrmann M: A temperature-dependent switch from chaperone to protease in a widely conserved heat shock protein. Cell. 1999, 97: 339-347. 10.1016/S0092-8674(00)80743-6.
- Schubert M, Petersson UA, Haas BJ, Funk C, Schroder WP, Kieselbach T: Proteome map of the chloroplast lumen of Arabidopsis thaliana. J Biol Chem. 2002, 277: 8354-8365. 10.1074/jbc.M108575200.
- Haussuhl K, Andersson B, Adamska I: A chloroplast DegP2 protease performs the primary cleavage of the photodamaged D1 protein in plant photosystem II. Embo J. 2001, 20: 713-722. 10.1093/emboj/20.4.713.
- Kieselbach T, Funk C: The family of Deg/HtrA proteases: from Escherichia coli to Arabidopsis. Physiologia Plantarum. 2003, 119: 337-346. 10.1034/j.1399-3054.2003.00199.x.
- Huesgen PF, Schuhmann H, Adamska I: The family of Deg proteases in cyanobacteria and chloroplasts of higher plants. Physiologia Plantarum. 2005, 123: 413-420. 10.1111/j.1399-3054.2005.00458.x.
- Itzhaki H, Naveh L, Lindahl M, Cook M, Adam Z: Identification and characterization of DegP, a serine protease associated with the luminal side of the thylakoid membrane. J Biol Chem. 1998, 273: 7094-7098. 10.1074/jbc.273.12.7094.
- Chassin Y, Kapri-Pardes E, Sinvany G, Arad T, Adam Z: Expression and characterization of the thylakoid lumen protease DegP1 from Arabidopsis. Plant Physiol. 2002, 130: 857-864. 10.1104/pp.007922.
- Schuhman H HPF: Deg15 in Arabidopsis thaliana. FEBS Journal. 2005, 272: B3-046P.
- Horwich AL, Weber-Ban EU, Finley D: Chaperone rings in protein folding and degradation. Proc Natl Acad Sci U S A. 1999, 96: 11033-11040. 10.1073/pnas.96.20.11033.
- Kruger E, Witt E, Ohlmeier S, Hanschke R, Hecker M: The clp proteases of Bacillus subtilis are directly involved in degradation of misfolded proteins. J Bacteriol. 2000, 182: 3259-3265. 10.1128/JB.182.11.3259-3265.2000.
- Porankiewicz J, Wang J, Clarke AK: New insights into the ATP-dependent Clp protease: Escherichia coli and beyond. Mol Microbiol. 1999, 32: 449-458. 10.1046/j.1365-2958.1999.01357.x.
- Janska H: ATP-dependent proteases in plant mitochondria: What do we know about them today?. Physiologia Plantarum. 2005, 123: 399-405. 10.1111/j.1399-3054.2004.00439.x.
- Wang J, Hartling JA, Flanagan JM: The structure of ClpP at 2.3 A resolution suggests a model for ATP-dependent proteolysis. Cell. 1997, 91: 447-456. 10.1016/S0092-8674(00)80431-6.
- Clarke AK, MacDonald TM, Sjogren LLE: The ATP-dependent Clp protease in chloroplasts of higher plants. Physiologia Plantarum. 2005, 123: 406-412. 10.1111/j.1399-3054.2005.00452.x.
- Sokolenko A, Lerbs-Mache S, Altschmied L, Herrmann RG: Clp protease complexes and their diversity in chloroplasts. Planta. 1998, 207: 286-295. 10.1007/s004250050485.
- Nakabayashi K, Ito M, Kiyosue T, Shinozaki K, Watanabe A: Identification of clp genes expressed in senescing Arabidopsis leaves. Plant Cell Physiol. 1999, 40: 504-514.
- Adam Z, Adamska I, Nakabayashi K, Ostersetzer O, Haussuhl K, Manuell A, Zheng B, Vallon O, Rodermel SR, Shinozaki K, Clarke AK: Chloroplast and mitochondrial proteases in Arabidopsis. A proposed nomenclature. Plant Physiol. 2001, 125: 1912-1918. 10.1104/pp.125.4.1912.
- Dougan DA, Reid BG, Horwich AL, Bukau B: ClpS, a substrate modulator of the ClpAP machine. Mol Cell. 2002, 9: 673-683. 10.1016/S1097-2765(02)00485-9.
- Lupas AN, Koretke KK: Bioinformatic analysis of ClpS, a protein module involved in prokaryotic and eukaryotic protein degradation. J Struct Biol. 2003, 141: 77-83. 10.1016/S1047-8477(02)00582-8.
- Sakamoto W: Protein Degradation Machineries in Plastids. Annu Rev Plant Biol. 2006.
- Besche H, Zwickl P: The Thermoplasma acidophilum Lon protease has a Ser-Lys dyad active site. Eur J Biochem. 2004, 271: 4361-4365. 10.1111/j.1432-1033.2004.04421.x.
- Botos I, Melnikov EE, Cherry S, Tropea JE, Khalatova AG, Rasulova F, Dauter Z, Maurizi MR, Rotanova TV, Wlodawer A, Gustchina A: The catalytic domain of Escherichia coli Lon protease has a unique fold and a Ser-Lys dyad in the active site. J Biol Chem. 2004, 279: 8140-8148. 10.1074/jbc.M312243200.
- Rotanova TV, Melnikov EE, Khalatova AG, Makhovskaya OV, Botos I, Wlodawer A, Gustchina A: Classification of ATP-dependent proteases Lon and comparison of the active sites of their proteolytic domains. Eur J Biochem. 2004, 271: 4865-4871. 10.1111/j.1432-1033.2004.04452.x.
- Kikuchi M, Hatano N, Yokota S, Shimozawa N, Imanaka T, Taniguchi H: Proteomic analysis of rat liver peroxisome: presence of peroxisome-specific isozyme of Lon protease. J Biol Chem. 2004, 279: 421-428. 10.1074/jbc.M305623200.
- Lee JR US: Regulated intracellular ligand transport and proteolysis controls EGF signal activation in Drosophila. Cell. 2001, 107: 161-171. 10.1016/S0092-8674(01)00526-8.
- Urban S, Lee JR, Freeman M: Drosophila rhomboid-1 defines a family of putative intramembrane serine proteases. Cell. 2001, 107: 173-182. 10.1016/S0092-8674(01)00525-6.
- Koonin EV, Makarova KS, Rogozin IB, Davidovic L, Letellier MC, Pellegrini L: The rhomboids: a nearly ubiquitous family of intramembrane serine proteases that probably evolved by multiple ancient horizontal gene transfers. Genome Biol. 2003, 4: R19. 10.1186/gb-2003-4-3-r19.
- Kanaoka MM, Urban S, Freeman M, Okada K: An Arabidopsis Rhomboid homolog is an intramembrane protease in plants. FEBS Lett. 2005, 579: 5723-5728.
- Freeman M: Proteolysis within the membrane: rhomboids revealed. Nat Rev Mol Cell Biol. 2004, 5: 188-197. 10.1038/nrm1334.
- Zimmermann P, Hirsch-Hoffmann M, Hennig L, Gruissem W: GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox. Plant Physiol. 2004, 136: 2621-2632. 10.1104/pp.104.046367.
- Chen G, Bi YR, Li N: EGY1 encodes a membrane-associated and ATP-independent metalloprotease that is required for chloroplast development. Plant J. 2005, 41: 364-375.
- Woltering EJ: Death proteases come alive. Trends Plant Sci. 2004, 9: 469-472. 10.1016/j.tplants.2004.08.001.
- Yamada K, Matsushima R, Nishimura M, Hara-Nishimura I: A slow maturation of a cysteine protease with a granulin domain in the vacuoles of senescing Arabidopsis leaves. Plant Physiol. 2001, 127: 1626-1634. 10.1104/pp.127.4.1626.
- Koizumi M, Yamaguchishinozaki K, Tsuji H, Shinozaki K: Structure and Expression of 2 Genes That Encode Distinct Drought-Inducible Cysteine Proteinases in Arabidopsis-Thaliana. Gene. 1993, 129: 175-182. 10.1016/0378-1119(93)90266-6.
- Moreau C, Aksenov N, Lorenzo MG, Segerman B, Funk C, Nilsson P, Jansson S, Tuominen H: A genomic approach to investigate developmental cell death in woody tissues of Populus trees. Genome Biol. 2005, 6: R34. 10.1186/gb-2005-6-4-r34.
- Beers EP, Woffenden BJ, Zhao C: Plant proteolytic enzymes: possible roles during programmed cell death. Plant Mol Biol. 2000, 44: 399-415. 10.1023/A:1026556928624.
- Beers EP, Jones AM, Dickerman AW: The S8 serine, C1A cysteine and A1 aspartic protease families in Arabidopsis. Phytochemistry. 2004, 65: 43-58. 10.1016/j.phytochem.2003.09.005.
- Gan S, Amasino RM: Inhibition of leaf senescence by autoregulated production of cytokinin. Science. 1995, 270: 1986-1988. 10.1126/science.270.5244.1986.
- Noh YS, Amasino RM: Regulation of developmental senescence is conserved between Arabidopsis and Brassica napus. Plant Mol Biol. 1999, 41: 195-206. 10.1023/A:1006389803990.
- Solomon M, Belenghi B, Delledonne M, Menachem E, Levine A: The involvement of cysteine proteases and protease inhibitor genes in the regulation of programmed cell death in plants. Plant Cell. 1999, 11: 431-444. 10.1105/tpc.11.3.431.
- Sterky F, Bhalerao RR, Unneberg P, Segerman B, Nilsson P, Brunner AM, Charbonnel-Campaa L, Lindvall JJ, Tandre K, Strauss SH, Sundberg B, Gustafsson P, Uhlen M, Bhalerao RP, Nilsson O, Sandberg G, Karlsson J, Lundeberg J, Jansson S: A Populus EST resource for plant functional genomics. Proc Natl Acad Sci U S A. 2004, 101: 13951-13956. 10.1073/pnas.0401641101.
- Ewing R, Poirot O, Claverie JM: Comparative analysis of the Arabidopsis and rice expressed sequence tag (EST) sets. In Silico Biol. 1999, 1: 197-213.
- Andersson A, Keskitalo J, Sjodin A, Bhalerao R, Sterky F, Wissel K, Tandre K, Aspeborg H, Moyle R, Ohmiya Y, Brunner A, Gustafsson P, Karlsson J, Lundeberg J, Nilsson O, Sandberg G, Strauss S, Sundberg B, Uhlen M, Jansson S, Nilsson P: A transcriptional timetable of autumn senescence. Genome Biol. 2004, 5: R24. 10.1186/gb-2004-5-4-r24.
- Wissel K PF: What affects mRNA levels in leaves of field-grown aspen? - A study of developmental and environmental influences. Plant Physiology. 2003, 133: 1190-1197. 10.1104/pp.103.028191.
- Segerman B, Jansson S, Karlsson J: Characterization of genes with narrow expression patterns in Populus. Tree Genetics & Genomes. 2006.
- Nixon PJ, Barker M, Boehm M, de Vries R, Komenda J: FtsH-mediated repair of the photosystem II complex in response to light stress. J Exp Bot. 2005, 56: 357-363. 10.1093/jxb/eri021.
- Yu F, Park S, Rodermel SR: Functional redundancy of AtFtsH metalloproteases in thylakoid membrane complexes. Plant Physiol. 2005, 138: 1957-1966. 10.1104/pp.105.061234.
- Populus trichocarpa DB. [http://genome.jgi-psf.org/Poptr1/Poptr1.home.html].
- Letunic I, Copley RR, Schmidt S, Ciccarelli FD, Doerks T, Schultz J, Ponting CP, Bork P: SMART 4.0: towards genomic data integration. Nucleic Acids Res. 2004, 32: D142-4. 10.1093/nar/gkh088.
- InterProScan. [http://www.ebi.ac.uk/InterProScan/].
- Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997, 25: 4876-4882. 10.1093/nar/25.24.4876.
- Kumar S, Tamura K, Jakobsen IB, Nei M: MEGA2: molecular evolutionary genetics analysis software. Bioinformatics. 2001, 17: 1244-1245. 10.1093/bioinformatics/17.12.1244.
- PopulusDB: [http://www.populus.db.umu.se].
- Sjodin A, Bylesjo M, Skogstrom O, Eriksson D, Nilsson P, Ryden P, Jansson S, Karlsson J: UPSC-BASE--Populus transcriptomics online. Plant J. 2006, 48: 806-817. 10.1111/j.1365-313X.2006.02920.x.
- Ihaka R, Gentleman R: R: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics. 1996, 5 (3): 299-314. 10.2307/1390807.
- Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, et al: TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 2003, 34: 374-378.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.