Pipeline for the assembly of Roche/454 sequence reads. After data generation [A], sequence (FASTA), quality (QUAL) and trace file information were extracted. Low-quality regions, vector sequences and adaptor sequences were removed from the raw reads [B]. Preprocessing was completed by subjecting the trimmed reads to line-specific assembly. To establish the SNP resource Sce_Assembly02 [C], only reads that had been assembled into contigs in the line-specific assemblies were passed to the second, merging assembly, which was computed with MIRA. To establish the EST resource Sce_Assembly03 [D], assemblies were computed separately for each of the five lines with CLC Assembly Cell, MIRA and Newbler, and were merged by CAP3 assembly. The consensus sequences of all lines were then passed to a second CAP3 assembly that combined sequences across lines. The resulting sequence set comprises contigs that were confirmed by consensus sequences from two to five lines (multi-line contigs) and contigs that contain reads originating from a single line (single-line contigs).
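The distinction between multi-line and single-line contigs in the final sequence set can be illustrated with a minimal sketch. It assumes that, for each contig of the second CAP3 assembly, the lines contributing sequences to it are already known. The function name `classify_contigs`, the contig identifiers and the line labels are hypothetical and are not taken from the original pipeline.

```python
from typing import Dict, Iterable


def classify_contigs(contig_lines: Dict[str, Iterable[str]]) -> Dict[str, str]:
    """Label each contig as 'multi-line' or 'single-line'.

    contig_lines maps a contig identifier to the line labels of the
    sequences it contains (hypothetical input format, for illustration).
    """
    labels = {}
    for contig_id, lines in contig_lines.items():
        # Count distinct lines; repeated labels from one line count once.
        n_lines = len(set(lines))
        if n_lines == 0:
            raise ValueError(f"contig {contig_id!r} has no member sequences")
        # Two or more contributing lines: multi-line contig.
        labels[contig_id] = "multi-line" if n_lines >= 2 else "single-line"
    return labels


# Hypothetical example data, not results from the study.
example = {
    "contig_0001": ["line_A", "line_B", "line_C"],
    "contig_0002": ["line_D"],
    "contig_0003": ["line_E", "line_E"],
}
labels = classify_contigs(example)
```

In this sketch a contig built from several sequences of the same line (such as `contig_0003`) is still labelled single-line, because only the number of distinct lines is counted.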