Assessment of allelic diversity in intron-containing Mal d 1 genes and their association to apple allergenicity

Background: Mal d 1 is a major apple allergen causing food allergic symptoms of the oral allergy syndrome (OAS) in birch-pollen sensitised patients. The Mal d 1 gene family is known to have at least 7 intron-containing and 11 intronless members that have been mapped in clusters on three linkage groups. In this study, the allelic diversity of the seven intron-containing Mal d 1 genes was assessed among a set of apple cultivars by sequencing or indirectly through pedigree genotyping. Protein variant constitutions were subsequently compared with Skin Prick Test (SPT) responses to study the association of deduced protein variants with allergenicity in a set of 14 cultivars.
Results: Of the seven intron-containing Mal d 1 genes investigated, Mal d 1.01 and Mal d 1.02 were highly conserved: nine out of ten cultivars coded for the same protein variant, while only one cultivar coded for a second variant. Mal d 1.04, Mal d 1.05 and Mal d 1.06 A, B and C were more variable, coding for three to six different protein variants. Comparison of Mal d 1 allelic composition between the high-allergenic cultivar Golden Delicious and the low-allergenic cultivars Santana and Priscilla, which are linked in pedigree, showed an association between the protein variants coded by the Mal d 1.04 and -1.06A genes (both located on linkage group 16) and allergenicity. This association was confirmed in 10 other cultivars. In addition, Mal d 1.06A allele dosage was associated with the degree of allergenicity based on prick-to-prick testing. Conversely, no associations were observed for the protein variants coded by the Mal d 1.01 (on linkage group 13), -1.02, -1.06B, -1.06C genes (all on linkage group 16), nor by the Mal d 1.05 gene (on linkage group 6).
Conclusion: Protein variant compositions of Mal d 1.04 and -1.06A and, in the case of Mal d 1.06A, allele doses are associated with the differences in allergenicity among fourteen apple cultivars. This indicates the involvement of qualitative as well as quantitative factors in allergenicity and warrants further research into the relative importance of quantitative and qualitative aspects of Mal d 1 gene expression for allergenicity. Results from this study have implications for medical diagnostics, immunotherapy, clinical research and breeding schemes for new hypo-allergenic cultivars.


Background
Many birch pollen sensitised patients (50-70%) in Central and Northern Europe suffer from oral allergy symptoms after eating fresh apples [1]. The prevalence of apple allergy amounts to ~3% in Central and Northern Europe. This type of apple allergy is caused by cross reactivity of IgE antibodies against the major and sensitizing birch pollen allergen Bet v 1 with Mal d 1, the major allergen of apple. Bet v 1 and Mal d 1 are both pathogenesis-related (PR) proteins. They belong to the PR-10 family and share a high degree of homology [2][3][4][5][6].
Patients have long known from experience that the severity of allergic reactions to apple is related not only to the specific sensitivity of the individual, but also, to a large extent, to the apple cultivar. This cultivar-dependent allergenicity has also been described in the literature. For instance, Mal d 1 from the cultivar Golden Delicious was found to be highly reactive to specific IgE antibodies from allergic patients' sera, whereas Mal d 1 from the cultivar Gloster generally showed much less reactivity [7,8]. In addition, skin prick testing (SPT) with 21 different apple cultivars, with confirmation for specific cultivars in double-blind placebo-controlled food challenges (DBPCFC) and oral challenges with whole apples, revealed a wide range of allergenic reactivity from very high to very low [9,10]. As a result of these studies, the new cultivar Santana was identified as hypo-allergenic for 75% of the patients with a mild apple allergy [10], which is usually assumed to be Mal d 1 based. In The Netherlands (where birch pollen-related apple allergy is by far the most common form of apple allergy), this cultivar has recently been marketed as 'suited for individuals with mild apple allergy' in order to meet the general desire of apple allergic persons to be able to add this common fruit to their daily diet.
The differences in allergenicity among cultivars raise a crucial question about the origin of this cultivar-specific degree of allergenicity. Allergenicity may depend on the total amount of Mal d 1 proteins, as suggested by Son et al. [11] from their observed ten-fold difference in Mal d 1 amount between the high-allergenic cultivar Golden Delicious and the low-allergenic cultivar Gloster. However, there is little evidence supporting this hypothesis because only very few cultivars have been studied and, for these, a linear response between Mal d 1 protein content and allergenicity estimates is lacking. On the other hand, qualitative characteristics of the Mal d 1 proteins could be involved too, as can be argued from the differences in binding capacity of birch pollen-specific IgE to two protein variants of Mal d 1 [11][12][13]. To address this latter issue, research on the genetic variation of Mal d 1 and its expression pattern in the different cultivars is required, and the results should be compared to allergenicity data. It is known that Mal d 1 is coded by a large gene family of 18 members mapped on three linkage groups of the apple genome [14,15]. Not all of these members are likely to be involved in allergenicity, since only a limited number of different Mal d 1 proteins and mRNAs have been traced back in apple fruit so far [16][17][18].
Research into the relative importance of the quality and quantity of Mal d 1 proteins for the allergenicity of apple cultivars is relevant for designing apple breeding programs for low-allergenic apple cultivars of high quality and healthiness. In this paper, we focused on the genetic diversity of Mal d 1 genes. The Mal d 1 gene family can be subdivided into two major categories: genes with and genes without an intron. Preliminary genetic analyses revealed that the genetic diversity was far larger in the intron-containing genes. Furthermore, the intron-containing genes cover all three linkage groups that carry Mal d 1 loci [15]. Therefore, this category of genes was chosen as a starting point in the search for putative qualitative effects of Mal d 1 proteins in cultivar-specific allergenicity.
Allelic diversity of the seven intron-containing Mal d 1 genes was assessed among a set of cultivars chosen for their importance in breeding programs and apple production. In order to find putative associations with allergenicity, the presence of alleles coding for different protein variants was subsequently compared with the degree of allergenicity for a subset of cultivars for which allergenicity data from SPT or DBPCFC tests were available.

Diversity of Mal d 1 genes and deduced proteins
The observed DNA polymorphisms in the 10 studied cultivars resulted in a total of 46 different Mal d 1 sequences over seven genes (Table 1, 2). These sequences were denoted according to the occurrence of 1) DNA polymorphisms in the coding region of the gene leading to different protein variants; 2) polymorphism in the coding region that did not affect the protein sequence (silent mutations), and 3) polymorphism in the intron (Table 1, 2). Although the latter two differences are of minor importance with respect to allergenicity, they provided additional landmarks for the development of sequence specific molecular markers.
Mal d 1.01 and -1.02 proved to be highly conserved at the protein level. Each of these genes coded for only two variants that differed in just a single amino acid (pos. 135 V/A for Mal d 1.01, pos. 56 N/K for Mal d 1.02), and the second variant was found only once. The other genes were more variable, coding for three (Mal d 1.05) up to six variants (Mal d 1.06C). Mal d 1.04 was exceptional in that two of its three sequences contained a stop codon in the coding region and were therefore regarded as pseudo alleles (ps1 and ps2). Interestingly, the pseudo alleles occurred frequently: in seven out of the ten cultivars at least one of the alleles was a pseudo allele, and the cultivars Priscilla and Fuji contained only pseudo alleles of Mal d 1.04 (Table 2).
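The distinction made above between protein variants, silent mutations and pseudo alleles can be illustrated with a minimal Python sketch. It is not the procedure used in this study (alignments were made with Seqman and GeneDoc); the function names and the toy sequences are hypothetical and are not real Mal d 1 alleles. The sketch translates coding sequences with the standard genetic code, groups alleles that encode the same protein, and flags alleles with a stop codon before the final codon.

```python
# Standard genetic code, built in TCAG order (a common compact construction).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def is_pseudo_allele(cds):
    """Return True if a stop codon occurs before the final codon."""
    return "*" in translate(cds)[:-1]


def group_by_protein(alleles):
    """Group allele names by encoded protein, so silent changes collapse."""
    groups = {}
    for name, cds in alleles.items():
        groups.setdefault(translate(cds), []).append(name)
    return groups


# Toy sequences for illustration only (hypothetical, not Mal d 1 data).
toy_alleles = {
    "allele_a": "ATGGGTGTTTAA",  # M G V stop
    "allele_b": "ATGGGCGTTTAA",  # silent change GGT -> GGC, same protein
    "allele_c": "ATGTGAGTTTAA",  # premature stop (TGA) in the second codon
}
```

In this toy set, `allele_a` and `allele_b` fall into one protein variant, while `allele_c` would be treated as a pseudo allele.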

Allergenicity scores of 14 apple cultivars by skin prick test (SPT)
Relative SPT responses of 14 apple cultivars are given in Table 3. Fiesta, Delblush, Pinova and Golden Delicious were ranked in the high-allergenic group (83-100%). Priscilla and Santana showed low SPT responses, with wheal areas 30-35% of that of Golden Delicious. Nine cultivars were intermediately allergenic (48-72%). Santana was also identified as low-allergenic in comparison to Golden Delicious in DBPCFC tests [9] and oral provocation tests [10].

General associations from the sequenced cultivars
From the cultivars used to sequence the intron containing Mal d 1 genes and to perform SPT on allergenicity, Golden Delicious was ranked as the highest allergenic cultivar whereas Priscilla was ranked as the lowest (

Association analysis by pedigrees: from Golden Delicious to Santana
The identity and origin of genomic alleles, and thus protein variants, in additional cultivars (not sequenced for Mal d 1) could be traced using the allele-specific SNAP and SSR markers that were developed, together with pedigree information [19] (see Methods). For instance, the deduced flow of protein variants over the pedigree of cultivar Santana is presented in Figure 1. Santana and Priscilla are low allergenic whereas Golden Delicious is high allergenic [9,10]. For Mal d 1.01, -1.02 and -1.05 the same protein variants were found for Golden Delicious and Santana. In contrast, Golden Delicious and Santana differ in their protein variant composition of Mal d 1.04, -1.06A and -1.06B (Fig. 1), indicating a possible involvement of these proteins in the observed difference in allergenicity between these cultivars. Santana, like Priscilla, has only pseudo alleles for Mal d 1.04 that do not result in protein production, while

Discussion
Birch pollen induced oral allergy to apple has been the subject of a considerable number of studies. One of the prominent results has been the presence of cultivar-specific differences in allergenicity. Unfortunately, evidence regarding the causes of cultivar-specific allergenicity is still lacking. One of the knowledge gaps concerned the number and identity of Mal d 1 genes and the amount of variation within these genes. Recently, Gao et al. [15] have shown that Mal d 1 genes are members of a large gene family by identifying 18 different loci that are located in three clusters. Based on sequence identity, these 18 genes could be subdivided into intron-containing and intronless genes. In order to create a basis for a better understanding of the genetics of Mal d 1 genes and their impact on allergenicity, we have studied the allelic diversity of the intron-containing genes in 10 cultivars that are often used in breeding. Development of sequence-specific markers and the use of pedigree information enabled the assessment of putative Mal d 1 constitutions of other cultivars. Using this information, we assessed the different Mal d 1 isoforms that cultivars are able to produce and found associations between their putative protein constitutions and SPT responses.

Allelic diversity and validity of database sequences
Cloning and sequencing of the seven intron- Because the examined cultivars are important in the breeding of many modern apple varieties, the set of alleles found in this study likely represents a considerable part of the total variation present in intron-containing Mal d 1 genes of common apple varieties.
Although other Mal d 1 sequences are known from public databases, we suspect that many of them are artefacts arising from strand switching and PCR mutations. When a group of closely related sequences such as the Mal d 1 gene family is amplified by PCR, artefacts can arise not only from PCR-induced single base pair mutations but also from in vitro strand switching or reannealing of incompletely amplified fragments, as was exemplified by Schenk et al. [20] for birch Bet v 1 sequences. For instance, for Mal d 1.01, one of the most studied Mal d 1 genes, over 13 DNA sequences from previous studies are known from public databases (Table 5), indicating the presence of 9 putative protein isoforms. Mal d 1.01 is known to be a single-locus gene with at most two alleles present in a cultivar [15], yet four sequences from Golden Delicious can be found in databases. Of these, only accession AF124830 was identical to one of our two sequences (Mal d 1.0105.01b). The other sequences may be artefacts. Firstly, accession AF126402 had one SNP at position 11 (G→A) compared to AF124830, which is due to the cloning primer used. Sequences from a number of other cultivars showed this 11A mutation too. In our study, the cloning primers were positioned in the 5'-untranslated region, thus avoiding this problem. Secondly,

a Alleles in bold are confirmed by our own sequences (see Table ). Numbers in brackets indicate GenBank accessions of previous sequences.
b Cultivar abbreviations additional to those in Table 2: GA-Gala, JB-Jamba, GL-Gloster, GD-seedling: seedling from Golden Delicious. Cultivar names in bold indicate material from this study.
c Position refers to the coding sequence and is presented vertically. SNP nucleotides given in bold are our own observations. SNPs in italic are identified errors after cross checking.

The occurrence of PCR recombination and mutations in sequences from gene families calls for careful scrutiny of sequences. The use of two independent PCR-cloning steps for each cultivar may effectively filter out most of these erroneous sequences before database submission, since the probability of isolating identical artefacts in independent PCRs is low [20]. Sequence-specific markers may be used to validate newly found isoforms that have passed this first sifting. Many of the sequences found in this study were confirmed by identical sequences retrieved from the other cultivars used, by identical sequences previously deposited in the databases, or through the use of sequence-specific markers [15, this study].

The associations described above could be found because of the allelic variation present among the examined apple cultivars, and because the complexity of human variation was reduced by analysing only patients with mild SPT responses, which limits the effect of variation among humans in sensitivity to different allergens. Studies with larger patient groups may benefit from further grouping to also account for genetically determined human variation in sensitivity to different (iso)allergen variants. Such grouping will probably have to be based on allergy responses, as nothing is known about the human genes involved or their allelic composition.

Cultivar specific allergenicity and its relation to quantitative and qualitative differences in
The finding that allergenicity depends on the presence and amount of some specific Mal d 1 isoforms is highly relevant for diagnostic tests and immunotherapy, and justifies additional research on a larger number of apple cultivars as well as atopic individuals. Since the first Mal d 2 and Mal d 4 genes have also recently been mapped [22] and the mapping of additional genes of these allergens is in progress, it will become possible to investigate the effects of the allelic composition of these Mal d allergens on the allergenicity of cultivars by association studies as well.

Location of amino acid polymorphism in a 3D structure model
For Mal d 1.06A, high-allergenic cultivars have two putative genotypes, homozygous variant 01 or heterozygous variant 01 together with variant 03, whereas low-allergenic cultivars are homozygous for variant 02. The intermediate-allergenic cultivars contained the low allergenic variant 02 in combination with one of the high allergenic variants 01 or 03.
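The genotype pattern described above amounts to a dosage rule on variant 02, which can be written down as a short Python sketch. The function name is hypothetical and the rule only summarises the observed pattern; it is not a validated predictor. Note that a genotype homozygous for variant 03 is not described in the text, so its classification here as "high" is an extrapolation of the rule.

```python
def mal_d_1_06a_class(genotype):
    """Classify a diploid Mal d 1.06A genotype by its dose of variant 02.

    genotype: pair of protein variant labels, e.g. ("01", "03").
    Zero copies of variant 02 -> "high", one -> "intermediate", two -> "low".
    """
    if len(genotype) != 2:
        raise ValueError("expected two alleles for a diploid genotype")
    dose_02 = sum(1 for variant in genotype if variant == "02")
    return {0: "high", 1: "intermediate", 2: "low"}[dose_02]


# Genotypes named in the text:
examples = {
    ("01", "01"): mal_d_1_06a_class(("01", "01")),
    ("01", "03"): mal_d_1_06a_class(("01", "03")),
    ("02", "01"): mal_d_1_06a_class(("02", "01")),
    ("02", "03"): mal_d_1_06a_class(("02", "03")),
    ("02", "02"): mal_d_1_06a_class(("02", "02")),
}
```

Expressing the rule as an allele count makes the dosage interpretation explicit: the class depends only on how many copies of variant 02 a cultivar carries.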
The three Mal d 1.06A variants differ at two amino acids: 13 V/I and 135 V/A. Considering the three-dimensional structure model of Mal d 1 [23], the first polymorphism is located in the first loop between the β1-strand and the α1-helix, and the second is located in the α3-helix structure motif. The amino acid changes are all between hydrophobic amino acids, but they have different side chains that may have an effect on the 3D structure of the protein and thus on epitope conformation.

Expression of Mal d 1 genes in fruit
For specific Mal d 1 genes to be involved in allergenicity, expression in apple fruit is a prerequisite. Until now, mRNA expression for five genes was observed in mature fruit through both RT-PCR [16,24] [17,18], the majority of Mal d 1 protein is Mal d 1.02 (Mal d 1b) and a minor part is Mal d 1.06A [15]. Interestingly, both genes are located on linkage group 16, where Mal d 1.04 is also located. These mRNA and protein data are thus compatible with an involvement of Mal d 1.06A in differences in allergenicity among cultivars. The current lack of support for the presence of Mal d 1.04 in fruit might indicate that the observed association is coincidental, but may equally be due to a lack of extensive expression studies.

Genotyping for Mal d 1 haplotypes
The Mal d 1 genes on LG 16 are tightly linked to each other [15]. This tight linkage can simplify the genotyping of additional cultivars, at least if their pedigree and the linkage phase of their parental alleles are known. In these cases, genotyping can be performed with a single representative, multi-allelic marker such as the Mal d 1.06A SSR marker. As linkage phases of the Mal d 1 genes of LG 16 are known for all 10 cultivars of our reference set except Discovery (Table 2), this simple and efficient approach was used in this study for certain cultivars (Figure 1).
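The idea of genotyping through one representative marker can be sketched in Python as a simple lookup. Everything in this sketch is hypothetical: the SSR allele sizes, the haplotype contents and the function names are invented for illustration and are not data from this study. The sketch also assumes that no recombination occurs between the linked LG 16 genes, so that an inherited SSR allele identifies the whole parental haplotype.

```python
# Hypothetical LG 16 haplotypes of one parent, keyed by the size (bp) of the
# Mal d 1.06A SSR allele that tags each haplotype. All values are invented.
PARENT_HAPLOTYPES = {
    154: {"Mal d 1.04": "01", "Mal d 1.06A": "01", "Mal d 1.06B": "02"},
    162: {"Mal d 1.04": "ps1", "Mal d 1.06A": "02", "Mal d 1.06B": "01"},
}


def infer_genotype(ssr_alleles, parents):
    """Infer the LG 16 genotype of a progeny cultivar.

    ssr_alleles: the SSR allele inherited from each parent, in parent order.
    parents: for each parent, a dict mapping SSR allele -> linked haplotype.
    Assumes known linkage phase and no recombination between the genes.
    """
    genotype = {}
    for allele, haplotypes in zip(ssr_alleles, parents):
        if allele not in haplotypes:
            raise KeyError(f"SSR allele {allele} not present in this parent")
        for gene, variant in haplotypes[allele].items():
            genotype.setdefault(gene, []).append(variant)
    return genotype


# A progeny that inherited SSR allele 154 from one parent and 162 from another
# parent carrying the same two (invented) haplotypes.
progeny = infer_genotype((154, 162), (PARENT_HAPLOTYPES, PARENT_HAPLOTYPES))
```

Raising an error for an SSR allele that is absent from the stated parent is a deliberate choice: such a case would point to a pedigree or scoring problem rather than something to be silently guessed.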

Conclusion
We have shown that differences in allergenicity among apple cultivars are associated with the allelic composition of two specific genes,

Cultivars for cloning and sequencing
Eight cultivars were used for cloning and sequencing of   Table 6). The PCR amplification, cloning and sequencing procedures were described previously [15,22]. For all 10 cultivars, eight to ten clones for each gene were sequenced in both directions. Next, sequences were aligned and putative Single Nucleotide Polymorphisms (SNPs) were identified using the Seqman program (DNAstar, Madison, WI). The coding sequences were deduced and translated into amino acid sequences for alignment and assessment of protein variants with the GeneDoc program (http://www.psc.edu/biomed/genedoc). New protein variants or gDNA alleles were named according to Gao et al. [15,22], following a modification of the allergen nomenclature guidelines [25].

Cultivars for association studies
Allergenicity data were available for 6 out of the 10 cultivars for which we assessed allelic diversity [9]. Besides these six cultivars, eight additional cultivars were included in the association study. For these eight cultivars allergenicity data were available [9] and their allelic constitutions of the intron-containing Mal d 1 genes could be assessed by their pedigree relationships to the set of 10 sequenced cultivars.
After an initial denaturation at 94°C for 2.5 min, the amplification was carried out for 34 cycles at 94°C for 30 s, 60°C for 30 s and 72°C for 1 min, and a final extension at 72°C for 5 min. PCR products were analysed on an ABI 377 (Applied Biosystems, Foster City, Calif.).
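As a small worked example, the cycling program above can be written as data and its programmed duration computed. This Python sketch only restates the temperatures and hold times given in the text; the variable and function names are hypothetical, and ramp times between steps, which depend on the thermocycler, are not included.

```python
# Cycling program as stated in the text: (temperature in °C, hold in seconds).
INITIAL_DENATURATION = (94, 150)             # 2.5 min
CYCLE_STEPS = [(94, 30), (60, 30), (72, 60)]  # denature, anneal, extend
N_CYCLES = 34
FINAL_EXTENSION = (72, 300)                  # 5 min


def programmed_minutes():
    """Total hold time of the program in minutes, excluding ramp times."""
    cycle_seconds = sum(hold for _, hold in CYCLE_STEPS)
    total_seconds = (
        INITIAL_DENATURATION[1]
        + N_CYCLES * cycle_seconds
        + FINAL_EXTENSION[1]
    )
    return total_seconds / 60


# 150 s + 34 * 120 s + 300 s = 4530 s, i.e. 75.5 minutes of programmed holds.
total = programmed_minutes()
```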
Using the SNAP markers, the pedigree structures allowed us to follow the flow of the Mal d 1 alleles over generations by applying the Identity by Descent principle in the genotyping of cultivars [19]. In total, 14 cultivars were thus available for association of protein variant composition with SPT responses.

Allergenicity data
In this study, Skin Prick Test (SPT) responses were used to evaluate allergenicity. The SPT procedure and the history of the patients have been described previously [9]. In short, patients were recruited from the outpatient clinic of the department of Dermatology/Allergology of the UMCU. They all had birch pollinosis manifesting with rhinoconjunctivitis during the birch pollen season (April and May), as well as a positive SPT to fresh apple of at least half the diameter of the positive histamine control. All patients had a typical history of apple allergy, with oral allergy syndrome (OAS) symptoms such as itching and mild swelling of the mouth and throat, and sometimes rhinoconjunctivitis, after eating an apple. SPTs were performed on the flexor surface of the forearm using the prick-to-prick technique according to Dreborg [28,29].
Histamine dihydrochloride (10 mg/ml) was used as a positive control, and the glycerol diluent of the SPT extracts was used as the negative control (ALK-ABELLO, Nieuwegein, The Netherlands). The wheal reaction (a small, itching elevation of the skin, as from the bite of an insect) was marked and transferred with transparent adhesive tape to a record sheet. The skin wheal areas were measured by computer scanning [30]. SPT responses for each cultivar were standardized by dividing the wheal area of the prick by that obtained for Golden Delicious, the reference cultivar for high allergenicity, and multiplying by 100. Data were derived from four experiments, three of which had been published previously [9]. For each experiment we used only a fraction of the data, namely the data of patients with mild symptoms, as preliminary experiments indicated that Mal d 1 is the major allergen for these patients, while other Mal d proteins seem to be major allergens for patients displaying more severe symptoms (Van de Weg, unpublished). Consequently, only 25%-50% of the patients of the previous experiments [9] and only 50% of the patients (4 out of 8) of the fourth, new experiment were included. In total, 11 different patients were involved. In order to combine data from these different experiments, responses of cultivars were expressed as a percentage of the response against the reference cultivar Golden Delicious. The final ranking was obtained by averaging the responses from the four experiments. This study was reviewed and approved by the Ethics Committee of the University Medical Center Utrecht under document number 01-050. All patients provided written informed consent before enrolment in the study.
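The standardization and averaging described above can be sketched in Python as follows. This is a minimal illustration, not the analysis code of this study: the function names and wheal areas are hypothetical, and the sketch assumes one summary wheal area per cultivar per experiment, whereas the level at which patient data were pooled is not specified in the text.

```python
from statistics import mean

REFERENCE = "Golden Delicious"


def relative_responses(wheal_areas, reference=REFERENCE):
    """Express each cultivar's wheal area as a percentage of the reference."""
    ref_area = wheal_areas[reference]
    if ref_area <= 0:
        raise ValueError("reference wheal area must be positive")
    return {
        cultivar: 100.0 * area / ref_area
        for cultivar, area in wheal_areas.items()
    }


def average_over_experiments(experiments):
    """Average the relative responses of each cultivar over all experiments
    in which it was tested."""
    per_cultivar = {}
    for wheal_areas in experiments:
        for cultivar, pct in relative_responses(wheal_areas).items():
            per_cultivar.setdefault(cultivar, []).append(pct)
    return {cultivar: mean(values) for cultivar, values in per_cultivar.items()}


# Invented wheal areas (mm^2) for two experiments, for illustration only.
toy_experiments = [
    {"Golden Delicious": 40.0, "Santana": 12.0},
    {"Golden Delicious": 50.0, "Santana": 20.0},
]
ranking = average_over_experiments(toy_experiments)
```

Because every experiment is scaled to its own Golden Delicious response, the reference cultivar scores 100% by construction, which is what allows data from separate experiments to be combined.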