The cultivated Brassica species are the group of crops most closely related to Arabidopsis thaliana. They are members of the Brassicaceae family (sometimes referred to as the Cruciferae). The species typically termed the "diploid" Brassica species, B. rapa (n = 10), B. nigra (n = 8) and B. oleracea (n = 9), contain the A, B and C genomes, respectively. Each pairwise combination has hybridized spontaneously to form one of the three allotetraploid species: B. napus (n = 19, comprising the A and C genomes), B. juncea (n = 18, comprising the A and B genomes) and B. carinata (n = 17, comprising the B and C genomes). The genome of B. rapa is the smallest, at ca. 500 Mb, and a genome sequencing project is under way, with both sequences and sequence annotations in the public domain (http://brassica.bbsrc.ac.uk/).
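The relationship between the diploid and allotetraploid chromosome numbers can be checked with simple arithmetic. The following is a minimal sketch, assuming only the haploid numbers and genome compositions quoted above; the names DIPLOID_N, ALLOTETRAPLOIDS and expected_n are hypothetical and introduced purely for illustration.

```python
# Haploid chromosome numbers (n) of the diploid genomes, as quoted in the text.
# Illustrative names only; nothing here is part of any published tool.
DIPLOID_N = {
    "A": 10,  # B. rapa
    "B": 8,   # B. nigra
    "C": 9,   # B. oleracea
}

# Allotetraploid species with their genome composition and reported n.
ALLOTETRAPLOIDS = {
    "B. napus": ("AC", 19),
    "B. juncea": ("AB", 18),
    "B. carinata": ("BC", 17),
}


def expected_n(genomes):
    """Sum the haploid numbers of the constituent genomes."""
    return sum(DIPLOID_N[g] for g in genomes)


# Confirm that each reported n equals the sum of its progenitor genomes.
for species, (genomes, reported_n) in ALLOTETRAPLOIDS.items():
    assert expected_n(genomes) == reported_n
    print(species, genomes, expected_n(genomes))
```

Each allotetraploid value is the sum of its two progenitor values, as expected if both progenitor chromosome sets are retained intact.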
The lineages of B. rapa and B. oleracea diverged ca. 3.7 Mya, and genetic mapping has confirmed that the overall organisation of their genomes is highly collinear. Their hybridisation to form B. napus probably occurred during human cultivation, i.e. less than 10,000 years ago. Comparative genetic mapping showed that the progenitor A and C genomes in B. napus have undergone little or no gross rearrangement during that time and also revealed extensive duplication within the Brassica genomes. Recent cytogenetic studies have shown that extensively triplicated genomes are a distinctive feature of the Brassiceae tribe, of which the Brassica species are members.
Even at the resolution of linkage maps, extensive collinearity can be identified between the genomes of Brassica species and A. thaliana. For example, a landmark study using sequenced RFLP markers demonstrated that 21 segments of the genome of A. thaliana, representing almost its entirety, could be replicated and rearranged to generate a structure approximating that of the B. napus genome. A study across the Brassicaceae subsequently identified 24 conserved chromosomal blocks, relating them to a proposed ancestral karyotype of n = 8. A number of genome analyses have been conducted in B. oleracea, B. rapa and B. napus using physical mapping techniques. The results have shown that the diploid Brassica genomes contain extensive triplication, consistent with their having evolved from a hexaploid ancestor [10–12]. Two sequence-level studies, one in B. oleracea and one in B. rapa, have provided further support for the hypothesis of hexaploid ancestry for the Brassica species. If this hypothesis were true, the duplicate genes we observe in the extant diploid genomes would formally be "paleo-homoeologues". Here, however, we use the more general term paralogue, which is free of this assumption, to distinguish them clearly from the recognisable homoeologues in B. napus that arise from the very recent hybridisation of the A and C genomes. The studies using physical mapping and sequencing approaches showed that, although sets of three related genome segments (paralogues) are often identifiable within the genomes of the diploid Brassica species, a proportion of the genes in these segments have been lost.
Brassica polyploids can be synthesised artificially. For example, B. napus can be resynthesised by hybridization of B. rapa and B. oleracea. However, such lines have been found to display genome instability, which can persist for many generations and is thought to involve homoeologous non-reciprocal translocations. These rearrangements have been shown to be correlated with qualitative changes in the expression of specific genes and with phenotypic variation.
Microarrays have become a widely used tool for transcriptome analysis in plants. Essentially, they consist of an immobilised array of DNA sequences (probes), which are hybridized in situ with fluorescently labelled sequences (targets) derived by reverse transcription of polyadenylated transcripts. Imaging of the hybridized array, followed by computational analysis of the signal intensity data, quantifies the transcript abundance, in the sampled tissue, of the genes represented by the probes on the array. Numerous microarray platforms are available and they have been applied to a wide range of studies in plant biology, reviewed by Galbraith.
As the Brassica species diverged from A. thaliana only ca. 17 Mya, exon sequences show a high level of conservation, ca. 85% at the nucleotide level. Some types of microarray designed for A. thaliana can therefore be used to analyse the related genes in Brassica. However, an analysis of ca. 100,000 Brassica EST sequences showed that ca. 9% had no similarity to any gene in A. thaliana. A. thaliana-based microarrays would therefore fail to measure the expression of a significant number of Brassica genes. In addition, Brassica genomes show extensive triplication, with the sub-genomes estimated to have diverged ca. 14 Mya [13, 14, 18]. A. thaliana-based microarrays would lack the capability to resolve the contributions that such families of paralogous genes make to the transcriptome. Consequently, a number of groups have developed Brassica cDNA-based microarrays, but these have been based upon relatively modest EST collections and none are available as community resources. We aimed to address this deficiency by developing a microarray based upon all public EST data, validating its utility for transcriptome analysis across multiple Brassica species, and placing it in the public domain. The validation experiment involved transcriptome analysis in two "resynthesised" B. napus lines and their B. rapa and B. oleracea progenitors. This experimental design enables the identification of both species-specific and genome-specific expression, whilst the long oligonucleotides used essentially eliminate the possible complications due to allelic variation (SNPs and small indels).