= Satellite meeting for Sequence analysis tools from the next generation sequencer =

== Topics ==

To share our knowledge about sequence analysis tools. Please write what you know about the tools.

Also see [http://www.oxfordjournals.org/our_journals/bioinformatics/nextgenerationsequencing.html Bioinformatics for Next Generation Sequencing virtual issue] -- yaskaz@ddbj (thanks to ichan) and [http://www.nature.com/nbt/journal/v26/n10/fig_tab/nbt1486_T3.html Table 3 of Jay Shendure & Hanlee Ji, Next-generation DNA sequencing, Nature Biotech.] (thanks to hinaichigo)

== Mapping tools ==

 * Maq
 * ELAND
 * [http://bowtie-bio.sourceforge.net/index.shtml BOWTIE] Ultra-short reads, Burrows-Wheeler index.
 * [http://www.dnastar.com/ Lasergene] Supports 454/Solexa. GUI for displaying alignments and contig organization (scaffolds).
 * [http://www.454.com/products-solutions/analysis-tools/gs-reference-mapper.asp GSMapper]
  * Map reads to any reference genome and generate a consensus sequence.
  * Easily view all differences compared to the reference sequence, with automatic output to separate files: insertions (blocks up to 50 bases), deletions (blocks up to 50 bases), SNPs.
  * Quickly identify high-confidence differences compared to the reference genome, which are singled out in a separate file.
  * Compare large, complex genomes of any size, including resequencing of whole genomes from humans, plants, yeasts, bacteria, fungi, viruses, YACs, BACs, fosmids.
  * Data outputs: fna.file (sequence of contigs, FASTA format), qual.file (corresponding Phred-equivalent quality scores), ace.file (consensus alignment of the reads against a given reference sequence).
 * [http://compbio.cs.toronto.edu/shrimp/ SHRiMP] Supports letter-space (454/Solexa), color-space (SOLiD) and Helicos 2-pass space. Spaced seeds followed by Smith-Waterman (S-W).
 * [http://bx.psu.edu/miller_lab/dist/README.lastz-1.01.50/README.lastz-1.01.50.html Lastz] Blast(z) variant.
   Has parameter settings tuned for 454 and Solexa mapping, for both reference-based assembly and variant discovery.
 * [http://emboss.open-bio.org/wiki/Next-Generation_Sequencing_Data EMBOSS tools for NGS]
 * [http://last.cbrc.jp/ LAST]

Many more exist, but we have never used them.

== Assemble tools ==

 * [http://www.454.com/products-solutions/analysis-tools/gs-de-novo-assembler.asp GSAssembly]
  * Performs whole genome shotgun assembly of genomes with or without paired-end data.
  * Order contigs into scaffolds using supported paired-end reads.
  * Assemble larger, more complex genomes up to 400 megabases in size with a 64-bit assembler, or up to 20 megabases with a 32-bit assembler.
  * Co-assemble with Sanger sequencing reads.
  * Select the read files you want to assemble by browsing in the GUI.
  * Data outputs: fna.file (sequence of contigs, FASTA format), qual.file (corresponding Phred-equivalent quality scores), ace.file (alignment of the reads to the contig sequence).
 * [http://www.dnastar.com/ Lasergene] Supports 454/Solexa. GUI for displaying alignments and contig organization (scaffolds).
 * [http://www.ebi.ac.uk/~zerbino/velvet/ Velvet]
 * [http://emboss.open-bio.org/wiki/Next-Generation_Sequencing_Data EMBOSS tools for NGS]

Many more exist, but we have never used them.

== Platforms ==

 * [http://www.454.com/ Roche, 454]
 * [http://www.illumina.com/pages.ilmn?ID=203 Illumina, Solexa]
 * [http://www.appliedbiosystems.co.jp/website/jp/home/index.jsp AB, SOLiD]

== Notes ==

 * Every new piece of software must handle this "huge" amount of data.
 * Some variations (SNPs, deletions, insertions) can be identified during the mapping/assembly process; other things, more "new biology", need more work by bioinformaticians.
 * Support from biologists is required in order to discover more new things.

== TODOs ==

 * Connected to SatelliteBigData: a LIMS infrastructure is suggested for tracking history and inferring problems coming from the wet lab.
 * From SatelliteVisualization: these tools could be very useful for exploring, at any level, all the information collected, from the alignment (lower level) up to annotation/comparison (higher level). The information may be local (flat files/databases/web services) or remote (databases/web services).
 * Editing in real time, providing alternatives for hypotheses.
 * Easy tools to integrate information coming from different platforms (Illumina, 454, SOLiD).
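
As one possible starting point for the integration item above, here is a minimal Python sketch that pairs the FASTA (fna) and quality (qual) outputs listed for GSMapper and GSAssembly. It assumes the common layout in which the qual file repeats the FASTA headers and holds whitespace-separated integer Phred-equivalent scores. The function names `read_records` and `pair_fna_qual` and the sample contig IDs are hypothetical and are not part of any 454 tool.

```python
import io


def read_records(handle):
    """Yield (record_id, body_lines) from a FASTA-like stream."""
    record_id, lines = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, lines
            # The ID is the first whitespace-delimited token after '>'.
            header = line[1:].split()
            record_id, lines = (header[0] if header else ""), []
        elif record_id is not None:
            lines.append(line)
    if record_id is not None:
        yield record_id, lines


def pair_fna_qual(fna_handle, qual_handle):
    """Return {id: (sequence, scores)}, checking that lengths agree."""
    seqs = {rid: "".join(body) for rid, body in read_records(fna_handle)}
    paired = {}
    for rid, body in read_records(qual_handle):
        # Scores may be wrapped over several lines; join before splitting.
        scores = [int(tok) for tok in " ".join(body).split()]
        if rid not in seqs:
            raise KeyError(f"no sequence for quality record {rid!r}")
        if len(seqs[rid]) != len(scores):
            raise ValueError(f"length mismatch for record {rid!r}")
        paired[rid] = (seqs[rid], scores)
    return paired


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    fna = io.StringIO(">contig00001 length=8\nACGT\nACGT\n")
    qual = io.StringIO(">contig00001 length=8\n40 40 30 30\n20 20 10 10\n")
    for rid, (seq, scores) in pair_fna_qual(fna, qual).items():
        print(rid, seq, scores)
```

Keeping each sequence and its scores together in one structure is only one design choice; reads from other platforms would need their own readers before they could be compared in the same way.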