Satellite meeting on sequence analysis tools for next-generation sequencers


The goal is to share our knowledge of sequence analysis tools. Please write what you know about the tools.

Also see the Bioinformatics for Next Generation Sequencing virtual issue -- yaskaz@ddbj (thanks to ichan) and Table 3 of Jay Shendure & Hanlee Ji, Next-generation DNA sequencing, Nature Biotech. (thanks to hinaichigo)

Mapping tools

  • Maq
  •  Bowtie: ultra-short reads, Burrows-Wheeler index.
  •  Lasergene: supports 454/Solexa. GUI for displaying alignments and contig organization (scaffolds).
  •  GSMapper
    • Map reads to any reference genome and generate a consensus sequence.
    • Easily view all differences compared to the reference sequence with automatic output to separate files: Insertions (blocks up to 50 bases), Deletions (blocks up to 50 bases), SNPs
    • Quickly identify high-confidence differences compared to the reference genome, which are singled out in a separate file.
    • Compare large, complex genomes of any size including: resequencing of whole genomes from humans, plants, yeasts, bacteria, fungi, viruses, YACs, BACs, fosmids
    • Data outputs: .fna file (sequence of contigs, FASTA format), .qual file (corresponding Phred-equivalent quality scores), .ace file (consensus alignment of the reads against a given reference sequence)
  •  SHRiMP: supports letter-space (454/Solexa), color-space (SOLiD) and Helicos 2-pass space. Spaced seeds followed by Smith-Waterman (S-W).
  •  Lastz: Blast(z) variant. Has parameter settings tuned for 454 and Solexa mapping, for both reference-based assembly and variant discovery.
  •  EMBOSS tools for NGS
  •  LAST
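
The "spaced seeds followed by S-W" strategy noted for SHRiMP can be illustrated with a small sketch. This is not SHRiMP's actual implementation: the seed mask `1101`, the scoring values and the function names are all assumptions chosen for illustration. A spaced seed only requires the positions marked `1` to match, so a seed can still hit across a mismatch; each candidate location is then scored with a plain Smith-Waterman local alignment.

```python
def spaced_seed_hits(read, ref, mask="1101"):
    """Return (read_offset, ref_offset) pairs where every '1' position of mask matches."""
    care = [k for k, m in enumerate(mask) if m == "1"]
    span = len(mask)
    # Index every reference window by the bases at the "care" positions only.
    index = {}
    for j in range(len(ref) - span + 1):
        key = "".join(ref[j + k] for k in care)
        index.setdefault(key, []).append(j)
    hits = []
    for i in range(len(read) - span + 1):
        key = "".join(read[i + k] for k in care)
        for j in index.get(key, []):
            hits.append((i, j))
    return hits


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a against b (linear gap penalty, illustrative values)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def map_read(read, ref, mask="1101", flank=2):
    """Seed with spaced seeds, then score a window around each candidate diagonal with S-W.

    Returns (best_score, implied_read_start_in_ref), or (0, None) if no seed hits.
    """
    best = (0, None)
    seen = set()
    for i, j in spaced_seed_hits(read, ref, mask):
        diag = j - i  # where the read would start in the reference
        if diag in seen:
            continue
        seen.add(diag)
        start = max(0, diag - flank)
        end = min(len(ref), diag + len(read) + flank)
        score = smith_waterman_score(read, ref[start:end])
        if score > best[0]:
            best = (score, diag)
    return best


# The read differs from ref[4:12] ("ACGTACCA") by one mismatch.
print(map_read("ACGAACCA", "GGGGACGTACCAGGGG"))
```

Real mappers use longer seeds, several masks and vectorized S-W; the sketch only shows how the two stages fit together.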

Many more exist, but we have never used them.
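
For readers unfamiliar with the Burrows-Wheeler index mentioned for Bowtie, here is a minimal sketch of exact matching by backward search over such an index. It is a teaching toy under stated assumptions, not how Bowtie is implemented: the suffix array is built by naive sorting, the occurrence table is stored in full, and the names `build_fm_index` and `find_exact` are made up for this example.

```python
from collections import Counter


def build_fm_index(text):
    """Build a naive Burrows-Wheeler (FM) index: suffix array, C table and Occ table."""
    text += "$"  # terminator, sorts before A/C/G/T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # character preceding each sorted suffix
    counts = Counter(text)
    # c_table[ch] = number of characters in text that are smaller than ch
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i] = number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if ch == b else 0)
    return sa, c_table, occ


def find_exact(read, index):
    """Return sorted 0-based reference positions where read occurs exactly."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # current suffix-array interval [lo, hi)
    for ch in reversed(read):  # extend the match one base to the left
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = build_fm_index("ACGTACGTGA")
print(find_exact("ACG", index))  # positions of the read in the reference
```

The appeal for ultra-short reads is that each read is matched in time proportional to its own length, independent of the reference size, once the index is built.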

Assembly tools

  •  GSAssembly
    • Performs whole genome shotgun assembly of genomes with or without paired-end data.
    • Order contigs into scaffolds using supported paired-end reads.
    • Assemble larger, more complex genomes up to 400 megabases in size with a 64-bit assembler or up to 20 megabases with a 32-bit assembler.
    • Co-assemble with Sanger Sequencing reads.
    • Select the read files you want to assemble by browsing in the GUI.
    • Data outputs: .fna file (sequence of contigs, FASTA format), .qual file (corresponding Phred-equivalent quality scores), .ace file (alignment of the reads to contig sequence)

Many more exist, but we have never used them.
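
As a rough illustration of working with the contig (FASTA) and quality outputs listed above, the sketch below pairs each contig with its Phred-equivalent scores and reports the mean quality. It assumes the quality file uses the common FASTA-like layout (a `>` header followed by space-separated integer scores); the contig names and the helper functions are hypothetical, and the input is given as in-memory lines rather than real files.

```python
def parse_fasta_like(lines):
    """Yield (name, body_lines) for each '>' record in a FASTA-like file."""
    name, body = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, body
            name, body = line[1:].split()[0], []
        else:
            body.append(line)
    if name is not None:
        yield name, body


def read_contigs(fna_lines):
    """Map contig name -> sequence string."""
    return {name: "".join(body) for name, body in parse_fasta_like(fna_lines)}


def read_quals(qual_lines):
    """Map contig name -> list of integer quality scores."""
    return {name: [int(q) for row in body for q in row.split()]
            for name, body in parse_fasta_like(qual_lines)}


def phred_to_error_prob(q):
    """Phred score Q corresponds to an error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Hypothetical example data standing in for the .fna and .qual outputs.
fna = [">contig00001 length=8", "ACGTACGT", ">contig00002 length=4", "ACGT"]
qual = [">contig00001 length=8", "40 40 30 30", "20 20 10 10",
        ">contig00002 length=4", "40 40 40 40"]

contigs, quals = read_contigs(fna), read_quals(qual)
for name, seq in contigs.items():
    scores = quals[name]
    assert len(scores) == len(seq), "one quality score per base is expected"
    print(name, len(seq), sum(scores) / len(scores))
```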



  • Every new piece of software must handle this "huge" amount of data.
  • Some variations (SNPs, deletions, insertions) can be identified during the mapping/assembly process; other, more "new biology" questions need more work by bioinformaticians.
  • Support from biologists is required in order to discover more new things.
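
To make the point about variations concrete, here is a minimal sketch of how SNPs, insertions and deletions can be read off a gapped pairwise alignment produced during mapping. The function name and the tuple layout are assumptions for illustration; the sketch also assumes gaps are written as `-` and that no column has a gap in both sequences. Real variant callers additionally weigh coverage and base quality.

```python
def call_differences(ref_aln, read_aln):
    """Return (kind, ref_pos, ref_bases, read_bases) tuples from a gapped alignment.

    ref_pos is the 0-based position in the ungapped reference.
    Consecutive gap columns are merged into one insertion or deletion block.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have equal length")
    events = []
    ref_pos = 0
    i = 0
    while i < len(ref_aln):
        r, q = ref_aln[i], read_aln[i]
        if r == "-":  # read has extra bases: insertion
            j = i
            while j < len(ref_aln) and ref_aln[j] == "-":
                j += 1
            events.append(("insertion", ref_pos, "", read_aln[i:j]))
            i = j
        elif q == "-":  # read lacks reference bases: deletion
            j = i
            while j < len(ref_aln) and read_aln[j] == "-":
                j += 1
            events.append(("deletion", ref_pos, ref_aln[i:j], ""))
            ref_pos += j - i
            i = j
        else:
            if r != q:  # aligned but different base: SNP
                events.append(("SNP", ref_pos, r, q))
            ref_pos += 1
            i += 1
    return events


for event in call_differences("ACGT-ACGTAC", "ACCTGAC--AC"):
    print(event)
```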


In connection with SatelliteBigData, a LIMS infrastructure is suggested for tracking sample history and inferring problems that originate in the wet lab.

From SatelliteVisualization: these tools could be very useful for exploring all the collected information at any level, from the alignment (lower level) up to annotation/comparison (higher level). The information may be local (flat files/databases/web services) or remote (databases/web services).

Editing in real time, with alternatives offered for each hypothesis.

Easy tools to integrate information coming from different platforms (Illumina, 454, SOLiD).