Satellite meeting on sequence analysis tools for next-generation sequencers ¶
Topics ¶
To share our knowledge about sequence analysis tools. Please write what you know about the tools.
See also the Bioinformatics for Next Generation Sequencing virtual issue -- yaskaz@ddbj (thanks to ichan) and Table 3 of Jay Shendure & Hanlee Ji, Next-generation DNA sequencing, Nature Biotech. (thanks to hinaichigo)
Mapping tools ¶
- Maq
- ELAND
- Bowtie: ultra-short reads, Burrows-Wheeler index.
- Lasergene: supports 454/Solexa. GUI for displaying alignments and contig organization (scaffolds).
- GSMapper
  - Map reads to any reference genome and generate a consensus sequence.
  - Easily view all differences compared to the reference sequence, with automatic output to separate files: insertions (blocks up to 50 bases), deletions (blocks up to 50 bases), SNPs.
  - Quickly identify high-confidence differences compared to the reference genome, which are singled out in a separate file.
  - Compare large, complex genomes of any size, including resequencing of whole genomes from humans, plants, yeasts, bacteria, fungi, viruses, YACs, BACs, fosmids.
  - Data outputs: fna.file (sequence of contigs, FASTA format), qual.file (corresponding Phred-equivalent quality scores), ace.file (consensus alignment of the reads against a given reference sequence).
- SHRiMP: supports letter-space (454/Solexa), color-space (SOLiD) and Helicos 2-pass space. Spaced seeds followed by Smith-Waterman (S-W).
- Lastz: Blast(z) variant. Has parameter settings tuned for 454 and Solexa mapping, for both reference-based assembly and variant discovery.
- EMBOSS tools for NGS
- LAST
Many more... but never used.
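To make the "spaced seeds followed by S-W" idea noted for SHRiMP more concrete, here is a minimal Python sketch of the general seed-and-extend strategy. It is not SHRiMP's actual implementation: the function names, the seed mask `1101`, the scoring values and the `pad` window margin are all assumptions chosen for illustration.

```python
from collections import defaultdict


def spaced_key(s, mask):
    """Keep only the characters of s at positions where mask has '1'."""
    return "".join(c for c, m in zip(s, mask) if m == "1")


def build_index(ref, mask):
    """Map each spaced-seed key to the reference positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(ref) - len(mask) + 1):
        index[spaced_key(ref[i:i + len(mask)], mask)].append(i)
    return index


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, end index in b), linear gap cost.

    The scoring values are illustrative assumptions, not tuned parameters.
    """
    prev = [0] * (len(b) + 1)
    best, best_j = 0, 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            if cur[j] > best:
                best, best_j = cur[j], j
        prev = cur
    return best, best_j


def map_read(read, ref, index, mask, pad=2):
    """Seed with spaced k-mers, then score each candidate window with S-W.

    Returns (best score, implied read start on the reference), or (0, None)
    when no seed hits the index.
    """
    candidates = set()
    for off in range(len(read) - len(mask) + 1):
        key = spaced_key(read[off:off + len(mask)], mask)
        for pos in index.get(key, []):
            candidates.add(pos - off)  # implied start of the read on the reference
    best = (0, None)
    for start in sorted(candidates):
        lo = max(0, start - pad)
        hi = min(len(ref), start + len(read) + pad)
        score, _ = smith_waterman(read, ref[lo:hi])
        if score > best[0]:
            best = (score, start)
    return best
```

The "don't care" position in the mask lets a seed still hit the index when the read carries a mismatch at that position, which is the main attraction of spaced seeds over contiguous k-mers. The Smith-Waterman step is then run only on the small windows suggested by the seeds rather than on the whole reference.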
Assembly tools ¶
- GSAssembly
  - Performs whole-genome shotgun assembly of genomes with or without paired-end data.
  - Order contigs into scaffolds using supported paired-end reads.
  - Assemble larger, more complex genomes up to 400 megabases in size with a 64-bit assembler, or up to 20 megabases with a 32-bit assembler.
  - Co-assemble with Sanger sequencing reads.
  - Select the read files you want to assemble by browsing in the GUI.
  - Data outputs: fna.file (sequence of contigs, FASTA format), qual.file (corresponding Phred-equivalent quality scores), ace.file (alignment of the reads to the contig sequence).
- Lasergene: supports 454/Solexa. GUI for displaying alignments and contig organization (scaffolds).
- Velvet
- EMBOSS tools for NGS
Many more... but never used.
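The contig FASTA and quality outputs listed for GSAssembly can be combined with a few lines of Python. The sketch below is only an illustration: it assumes the quality file uses FASTA-style `>` header lines followed by whitespace-separated integer Phred scores, and the function names and the threshold of 20 are hypothetical choices, not part of any vendor tool.

```python
def parse_fasta(lines):
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def parse_qual(lines):
    """Yield (name, list of int scores) from a FASTA-like quality file.

    Assumption: scores are whitespace-separated integers under '>' headers.
    """
    name, scores = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, scores
            header = line[1:].split()
            name, scores = (header[0] if header else ""), []
        else:
            scores.extend(int(x) for x in line.split())
    if name is not None:
        yield name, scores


def low_quality_positions(seq, quals, threshold=20):
    """Return 0-based positions whose Phred-equivalent score is below threshold."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    return [i for i, q in enumerate(quals) if q < threshold]
```

Both parsers accept any iterable of lines, so an open file handle can be passed directly, for example `dict(parse_fasta(open(path)))` with `path` pointing at a contig file.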
Platforms ¶
Notes ¶
- Every new piece of software must handle this "huge" amount of data.
- Some variations (SNPs, deletions, insertions) can be identified during the mapping/assembly process; other, more "new biology" findings need more work by bioinformaticians.
- Support from biologists is required in order to discover more new things.
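As a rough illustration of how SNPs, deletions and insertions fall out of the mapping step, the following Python sketch walks a gapped pairwise alignment and reports each difference. The function name, the tuple layout and the use of `-` as the gap character are assumptions made for this example; real mappers report variants in their own formats.

```python
def call_variants(ref_aln, read_aln):
    """List differences between two aligned, gapped strings of equal length.

    Returns tuples (kind, ref_pos, ref_allele, read_allele), where ref_pos is
    the 0-based position on the ungapped reference. For an insertion, ref_pos
    is the position of the reference base that follows the inserted block.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have equal length")
    variants = []
    ref_pos = 0
    i, n = 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], read_aln[i]
        if r == "-" and q == "-":
            i += 1  # a column gapped in both rows carries no information
        elif r == "-":
            # Gap in the reference row: the read has extra bases (insertion).
            j = i
            while j < n and ref_aln[j] == "-" and read_aln[j] != "-":
                j += 1
            variants.append(("insertion", ref_pos, "", read_aln[i:j]))
            i = j
        elif q == "-":
            # Gap in the read row: reference bases are missing (deletion).
            j = i
            while j < n and read_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            variants.append(("deletion", ref_pos, ref_aln[i:j], ""))
            ref_pos += j - i
            i = j
        else:
            if r != q:
                variants.append(("SNP", ref_pos, r, q))
            ref_pos += 1
            i += 1
    return variants
```

For example, `call_variants("ACGT-ACGTAC", "ACCTTAC--AC")` reports one SNP, one single-base insertion and one two-base deletion. Anything beyond such simple differences (for instance structural rearrangements) is not covered by a column-by-column scan like this one.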
TODOs ¶
Connected to SatelliteBigData: a LIMS infrastructure is suggested for tracking history and inferring problems coming from the wet lab.
From SatelliteVisualization: these tools could be very useful for exploring, at any level, all the information collected, from the alignment (lower) up to annotation/comparison (higher). The information is intended as local (flat files/databases/web services) or remote (databases/web services).
Editing in real time, providing alternatives for hypotheses.
Easy tools to integrate information coming from different platforms (Illumina, 454, SOLiD).