Overview of RNA-Sequencing gene expression patterns in male and female Populus tremula trees from the Umeå Aspen collection. (a) Principal component analysis (PCA) plot of the RNA-Sequencing (RNA-Seq) expression data, with samples classified by sex (male in blue, female in pink) and by year of sampling (2008 as squares, 2010 as triangles). The percentage of variance explained by each component is shown in parentheses on each axis. (b) Volcano plot of the negative log10 p-value (y-axis; referred to here as the log odds) against the log2 fold change (x-axis), showing the results of the RNA-Seq differential expression analysis comparing male to female trees. The statistical model included factors for year of sampling and sex, and the effect of sex was tested after removal of the year effect. Significant genes with higher expression in males are shown in blue. For the two genes significant at a 1% False Discovery Rate (FDR) cut-off, the obtained p-value was <1e-10 and was therefore set to 0. As a result, the log odds value was infinite and was replaced with the next-largest log odds value plus 1. Non-significant genes are coloured to indicate density, shaded from yellow (high) to blue (low). The dashed horizontal line marks the 1% FDR threshold. The four genes with the smallest p-values (regardless of significance) and the four genes with the highest and lowest non-significant fold change values are circled in red. These genes are shown in Additional file 2. The gene identifiers for the two statistically significant genes are shown (identifiers refer to V3 of the P. trichocarpa genome).
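
The handling of zero p-values described for panel (b) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used for the analysis: the function name `neg_log10_capped` and the example p-values are hypothetical and serve only to show how an infinite negative log10 p-value might be replaced with the next-largest finite value plus 1 so that the gene can still be plotted.

```python
import math


def neg_log10_capped(p_values):
    """Return -log10(p) for each p-value.

    A p-value of exactly 0 gives an infinite score; each infinite score is
    replaced with (largest finite score + 1) so the point remains plottable.
    """
    scores = [math.inf if p == 0 else -math.log10(p) for p in p_values]
    finite = [s for s in scores if math.isfinite(s)]
    if not finite:
        raise ValueError("at least one non-zero p-value is required")
    cap = max(finite) + 1
    return [cap if math.isinf(s) else s for s in scores]


# Hypothetical p-values for illustration only (not data from the study).
example_p = [0.0, 1e-4, 0.05, 0.0]
scores = neg_log10_capped(example_p)
```

With this approach, all genes whose p-value was set to 0 share the same y-axis position, one unit above the most significant gene that has a finite value.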