As next-generation DNA sequencing becomes faster and less expensive, sequence data are being generated at an exponential rate and made publicly available, including large numbers of ESTs from different plant species. These sequences represent a potentially useful resource for mining SSR markers. In this study, we identified 4,609 date palm EST sequences containing SSRs from a total of 28,889 sequences. The frequency of SSRs in genic sequences of date palm (16%) was lower than in several other plant species. For instance, the frequency of SSRs detected was 33.3% in citrus, 28.4% in castor bean, and 24% in Iris. However, the frequency in date palm is greater than that detected in oil palm (6.1%). The SSR density in date palm is one per 2.4 kb, which is also lower than in other plant species such as citrus (one per 1.97 kb) and castor bean (one per 1.77 kb). However, the frequency of SSRs depends on the criteria used to identify SSRs in the EST sequence database, so these comparisons should be interpreted with caution.
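To illustrate why the detected frequency depends on the search criteria, the following is a minimal Python sketch of perfect-SSR detection. It is not the pipeline used in this study: the names `find_ssrs`, `percent`, and `MIN_REPEATS`, the example sequence, and both sets of minimum repeat thresholds are assumptions chosen for illustration, since the thresholds actually applied are not stated in this section.

```python
import re

# Hypothetical minimum repeat counts per motif length (di-, tri-, tetranucleotide).
# These are illustrative assumptions, not the thresholds used in the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "AGAG").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


def percent(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total


# Made-up example sequence containing (AG)7 and (AAG)5.
example = "TTT" + "AG" * 7 + "CCC" + "AAG" * 5 + "TT"

relaxed = find_ssrs(example)                       # default thresholds
strict = find_ssrs(example, {2: 8, 3: 6, 4: 5})    # stricter, also hypothetical

# Counts reported in the text: 4,609 SSR-containing ESTs out of 28,889.
frequency = percent(4609, 28889)
```

With the relaxed thresholds the example sequence yields two SSRs, whereas the stricter thresholds yield none, so the same data set can give quite different SSR frequencies under different criteria. The last line reproduces the roughly 16% frequency reported above from the two counts given in the text.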
The most common dinucleotide SSR motif was AG, which comprised 85.7% of dinucleotide motifs in date palm EST sequences. The AG motif is the most abundant and highly polymorphic in both annual and perennial plants, including apple and citrus [30, 34]. Mun et al. compared the frequency of the AG motif in ESTs and genomic sequences and found a higher frequency in ESTs than in genomic sequences for M. truncatula, soybean, L. japonicus, Arabidopsis, and rice. Among trinucleotide SSR motifs in date palm, AGG and AAG were more abundant than other types, while among tetranucleotide SSR motifs, AAAG (19.2%), AAGG (14.3%), and AGGG (13.3%) were more common than other types. Although the role of SSR motifs in the function of plant genes is poorly understood, there is evidence that the AG motif in the 5’ UTR of the waxy gene is related to amylose content in rice and that the CCG motif in the 5’ UTR of ribosomal protein genes is involved in the regulation of fertilization in maize. In date palm, the AG-rich content of genic SSRs and the role of these motifs in the function of SSR-containing genes need to be further investigated.
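Motif frequencies such as those above are usually reported per motif class, in which rotations and reverse complements of a repeat unit are counted together (for example AG, GA, CT, and TC). The sketch below shows one way to do this grouping in Python. It assumes that convention, which this section does not state explicitly, and the function name `canonical_motif` and the example motif list are hypothetical.

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return the alphabetically smallest rotation of a motif or its reverse complement."""
    motif = motif.upper()
    revcomp = motif.translate(COMPLEMENT)[::-1]
    rotations = [s[i:] + s[:i] for s in (motif, revcomp) for i in range(len(s))]
    return min(rotations)


# Made-up list of detected motifs, for illustration only.
detected = ["GA", "CT", "AG", "TC", "AT", "CTT", "GAA", "CCT"]
class_counts = Counter(canonical_motif(m) for m in detected)
```

Here GA, CT, AG, and TC all fall into the AG class, CTT and GAA into AAG, and CCT into AGG, so percentages computed from `class_counts` are comparable across studies that use the same grouping.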
Putative functional annotation and categorization of the SSR-containing EST sequences in this study revealed that these sequences are involved in various aspects of date palm development. The majority of transcripts were assigned to “cell” and “organelle” in the cellular component category, to “binding” and “catalytic activity” in the molecular function category, and to “cellular activity” and “metabolic activity” in the biological process category. Similar results were reported in citrus.
Trinucleotide SSRs were the most common in date palm EST sequences, followed by tetra- and dinucleotide SSRs, which is consistent with most cases in other plant species. The abundance of trinucleotide SSRs in EST sequences has been attributed to selection against frameshift mutations in coding regions, since length changes in trinucleotide repeats preserve the reading frame. There is evidence that EST-SSRs located in coding regions reveal levels of polymorphism equivalent to those located in UTRs. Thus, EST sequences are indeed an excellent resource for mining SSRs in date palm.
Although there are a few reports of SSR markers developed from genomic sequences in date palm, only 56 genomic SSR markers have been identified [27–29]. Increased availability of SSR markers would aid genetic and genomic studies in date palm, as they are better tools than RAPD markers because of their co-dominant inheritance, multi-allelic nature, and high reproducibility [5, 37–41]. In this study, we report the identification of a large number (4,967) of EST-SSR markers in date palm. Of 20 randomly selected markers, 6 (30%) detected polymorphism on a panel of 12 date palm cultivars. This approach may hold promise for developing a substantial number of informative, high-density EST-SSR markers in date palm, large enough to be of value in breeding. These novel markers will not only expand the repertoire of DNA markers and enrich the available genetic and genomic tools, but also facilitate further genomic research in date palm, such as comparative mapping, molecular breeding, and gene cloning, because they are derived from transcripts. Such expression profiling can also be used to identify agronomically relevant genes based on synteny relationships between plant genomes.
EST-SSR markers have proved useful in plants for assessing genetic diversity [33, 42] and for identifying gene-linked markers [43, 44]. In date palm, the lack of gene-related markers has so far limited the application of molecular breeding to this crop. Identification of markers related to gender is especially important in date palm farming, as such markers make it easier to eliminate male plants. Al-Dous et al. identified a large number of SNPs and a region of the date palm genome linked to gender. The EST-SSRs reported in this study can potentially be a useful genomic tool in addition to SNPs, as they provide a potential resource for association mapping of gender-related genes as well as other traits of interest. Because of their high transferability, this large number of gene-based markers can be used in comparative mapping to study the colinear order of genes and synteny among closely related date palm species. They can also be used to assess genetic diversity in different oases and populations of date palm for its conservation and sustainable use. Once molecular markers linked to desirable traits are identified, marker-assisted selection in breeding will facilitate genetic improvement of this valuable crop species.