Version 18 (modified by bonnalraoul, 15 years ago)

--

Satellite meeting on sequence analysis tools for next-generation sequencers

Topics

To share our knowledge of sequence analysis tools. Please write what you know about the tools.

See also the Bioinformatics for Next Generation Sequencing virtual issue -- yaskaz@ddbj (thanks to ichan) and Table 3 of Jay Shendure & Hanlee Ji, Next-generation DNA sequencing, Nature Biotech. (thanks to hinaichigo)

Mapping tools

Maq
ELAND
 Bowtie Ultra-short reads, Burrows-Wheeler index.
 Lasergene Supports 454/Solexa. GUI for displaying alignments and contig organization (scaffolds).
 GSMapper

  • Map reads to any reference genome and generate a consensus sequence.
  • Easily view all differences compared to the reference sequence with automatic output to separate files: Insertions (blocks up to 50 bases), Deletions (blocks up to 50 bases), SNPs
  • Quickly identify high-confidence differences compared to the reference genome, which are singled out in a separate file.
  • Compare large, complex genomes of any size including: resequencing of whole genomes from humans, plants, yeasts, bacteria, fungi, viruses, YACs, BACs, fosmids
  • Data outputs: fna.file (sequence of contigs, FASTA format), qual.file (corresponding Phred equivalent quality score), ace.file (consensus alignment of the reads against a given reference sequence)


 SHRiMP Supports letter-space (454/Solexa), color-space (SOLiD) and Helicos 2-pass space. Spaced seeds followed by Smith-Waterman (S-W) alignment.
 Lastz Blast(z) variant. Has parameter settings tuned for 454 and Solexa mapping, for both reference-based assembly and variant discovery.
EMBOSS
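
To illustrate what a Burrows-Wheeler index (as listed for Bowtie above) makes possible, here is a minimal Python sketch of the transform and of exact-match counting by backward search. It is a toy under stated assumptions: the function names are hypothetical, the transform is built by sorting all rotations, and the rank lookup rescans the string, whereas real mappers use compressed, precomputed structures and also tolerate mismatches. It does not reproduce Bowtie's actual implementation.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (toy inputs only)."""
    text += "$"  # sentinel, assumed to sort before every other symbol
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_text, pattern):
    """Count exact occurrences of pattern using backward search on the BWT."""
    # first[c] = number of symbols in the text that sort before c
    counts = Counter(bwt_text)
    first, total = {}, 0
    for symbol in sorted(counts):
        first[symbol] = total
        total += counts[symbol]

    def rank(symbol, i):
        # Occurrences of symbol in bwt_text[:i]; a real index precomputes this.
        return bwt_text[:i].count(symbol)

    lo, hi = 0, len(bwt_text)  # half-open range of matching sorted suffixes
    for symbol in reversed(pattern):
        if symbol not in first:
            return 0
        lo = first[symbol] + rank(symbol, lo)
        hi = first[symbol] + rank(symbol, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = bwt("ACGTACGTTACG")  # hypothetical toy reference
    print(count_occurrences(index, "ACG"))
```

The pattern is consumed from its last base to its first, and each step narrows a range of sorted suffixes, so the cost depends on the read length rather than on the genome length once rank lookups are precomputed.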

Assembly tools

 GSAssembly

  • Performs whole genome shotgun assembly of genomes with or without paired-end data.
  • Order contigs into scaffolds using supported paired-end reads.
  • Assemble larger, more complex genomes up to 400 megabases in size with a 64-bit assembler or up to 20 megabases with a 32-bit assembler.
  • Co-assemble with Sanger Sequencing reads.
  • Select the read files you want to assemble by browsing in the GUI.
  • Data outputs: fna.file (sequence of contigs, FASTA format), qual.file (corresponding Phred equivalent quality score), ace.file (alignment of the reads to contig sequence)
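
As a rough illustration of how the contig outputs listed above could be consumed, the following Python sketch pairs each contig sequence from an fna file with its scores from the qual file and converts a Phred score Q to an error probability using P = 10^(-Q/10). It assumes the qual file mirrors the FASTA layout, with a ">" header line followed by whitespace-separated integer scores; the function names and the example header are hypothetical.

```python
def read_records(lines):
    """Parse FASTA-style lines into {name: [body lines]} (fna or qual)."""
    records, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # identifier up to the first space
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return records


def load_contigs(fna_lines, qual_lines):
    """Return {contig name: (sequence, list of Phred scores)}."""
    sequences = read_records(fna_lines)
    qualities = read_records(qual_lines)
    contigs = {}
    for name, chunks in sequences.items():
        sequence = "".join(chunks)
        scores = [int(q) for chunk in qualities.get(name, []) for q in chunk.split()]
        if len(scores) != len(sequence):
            raise ValueError(f"{name}: {len(sequence)} bases but {len(scores)} scores")
        contigs[name] = (sequence, scores)
    return contigs


def error_probability(phred):
    """Convert a Phred quality score to the estimated base-call error probability."""
    return 10 ** (-phred / 10)


if __name__ == "__main__":
    # Hypothetical two-line contig; real files would be opened and passed in.
    fna = [">contig00001 length=8", "ACGT", "ACGT"]
    qual = [">contig00001 length=8", "40 40 30 30", "20 20 10 10"]
    sequence, scores = load_contigs(fna, qual)["contig00001"]
    for base, q in zip(sequence, scores):
        print(base, q, error_probability(q))
```

Both functions accept any iterable of lines, so an open file handle can be passed directly.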


Velvet
EMBOSS

Platforms

 Roche, 454
 Illumina, Solexa
 AB, SOLiD

Date

Room

Presentations

Notes

All new software must handle this "huge" amount of data. Some variations (SNPs, deletions, insertions) can be identified during the mapping/assembly process; other, more "new biology" findings need more work by bioinformaticians. Support from biologists is required in order to discover more new things.
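
To make the point about variations concrete, here is a small Python sketch that reads a gapped pairwise alignment of a read against the reference and reports SNPs, insertions and deletions. It assumes the aligner has already produced two equal-length strings with "-" as the gap character; the function name and the tuple layout are hypothetical and do not correspond to the output of any tool listed above.

```python
def call_variations(ref_aln, read_aln, gap="-"):
    """List (type, ref_position, ref_bases, read_bases) from a gapped alignment.

    ref_position is 0-based and counts only non-gap reference bases.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have the same length")
    variations = []
    ref_pos, i, n = 0, 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], read_aln[i]
        if r == gap and q == gap:
            i += 1  # column with no information
        elif r == gap:
            # Insertion: extend over consecutive gap columns in the reference.
            j = i
            while j < n and ref_aln[j] == gap and read_aln[j] != gap:
                j += 1
            variations.append(("insertion", ref_pos, "", read_aln[i:j]))
            i = j
        elif q == gap:
            # Deletion: extend over consecutive gap columns in the read.
            j = i
            while j < n and read_aln[j] == gap and ref_aln[j] != gap:
                j += 1
            variations.append(("deletion", ref_pos, ref_aln[i:j], ""))
            ref_pos += j - i
            i = j
        else:
            if r != q:
                variations.append(("SNP", ref_pos, r, q))
            ref_pos += 1
            i += 1
    return variations


if __name__ == "__main__":
    for variation in call_variations("ACGT-ACGTAC", "ACCTTAC--AC"):
        print(variation)
```

A single read is not enough to call a variation with confidence; in practice the calls from all reads covering a position would be pooled and weighted by base quality before being reported.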

Results

Ideas

Connected to SatelliteBigData: a LIMS infrastructure is suggested for tracking history and inferring problems coming from the wet lab. From SatelliteVisualization: these tools could be very useful for exploring, at any level, all the information collected, from the alignment (lower level) up to annotation/comparison (higher level). The information is intended to be local (flat files/databases/web services) or remote (databases/web services). Editing in real time, providing alternatives for hypotheses.

TODOs

Easy tools to integrate information coming from different platforms (Illumina, 454, SOLiD).