Version 23 (modified by shuichi, 15 years ago)

--

Satellite meeting for Use-cases development and documentation

Topics

Participants

  • Toshiaki Katayama
  • Young Joo Kim
  • Keun-Joon Park
  • Yunsun Nam
  • Arek Kasprzyk
  • Syed Haider
  • Shuichi Kawashima
  • Takeshi Kawashima
  • Raoul JP Bonnal
  • Tatsuya Nishizawa
  • Oswaldo Trelles
  • José M. Fernández
  • Paul Gordon
  • Vachiranee Limviphuvadh
  • Tobias Gattermayer
  • Riu Yamashita
  • Fumikazu Konishi

Targets

Date

  • 2009/3/19 15:00-18:00
  • anytime

Room

  • 3F

Notes

Data for testing

1) SNP data containing 262,338 SNPs (stroke patients vs. normal) from an Affymetrix GeneChip (provided by Prof. Kim)
2) Amino acid sequences of 53 genes (multifasta format) located at the disease map locus of one form of epilepsy (provided by Vachiranee)
3) Nucleotide sequences (multifasta format) (provided by Riu)
4) Genome data (provided by Takeshi)

Software

1) BioMart
2) Galaxy
3) jORCA
4) ANNOTATOR

Results

Question
- Takeshi asked how to annotate in-house data by comparing it with public genome data using BioMart.
- For example, how to annotate Halocynthia roretzi or Molgula tectiformis ESTs by comparison with Ciona intestinalis and Ciona savignyi using BioMart.

Answer
- Convert the Halocynthia data into BioMart format, install a BioMart server locally, then merge the public DB and the local data.
- BioMart format is simple.

Takeshi presented an example
- Minor animals (e.g. H. roretzi, the animal closest to Ciona)
- How can we analyze Halocynthia roretzi and M. tectiformis, which have little annotation?
- Types of data: ESTs, assembled EST clusters, etc.
- Potential flow: BioMart -> ANNOTATOR -> TogoDB

*SNP data

ANNOTATOR currently cannot analyse SNP data directly, but it can analyse data for genes that contain SNPs of interest.
Galaxy (http://galaxyproject.org) and RGenetics (http://rgenetics.org/)

- provide quality control, ancestry, case-control analysis, TDT (transmission disequilibrium test) and other statistical tests

- can link dbSNP to Galaxy for further analysis
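As a rough illustration of what a case-control analysis of a single SNP involves, the sketch below computes a Pearson chi-square statistic on a 2x2 table of allele counts. It is not the Galaxy or RGenetics implementation; the function name and the allele counts are hypothetical and are not taken from the stroke data set.

```python
def allele_chi_square(case_counts, control_counts):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table.

    case_counts and control_counts are (allele_A, allele_B) count pairs.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table sums to zero")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical allele counts for one SNP (illustration only).
statistic = allele_chi_square(case_counts=(10, 20), control_counts=(20, 10))
print(f"chi-square = {statistic:.3f}")
```

With 262,338 SNPs, such a per-SNP statistic would need a multiple-testing correction before any SNP is called significant.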

*Multifasta format (amino acid sequences)

ANNOTATOR accepts uploads of amino acid sequences in multifasta format -> Prim-seq-an algorithm
BioMart accepts gene IDs, not sequences, as input and retrieves the information associated with each gene ID
jORCA provides a list of the analyses that can be run on FASTA-format data
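To make the gene-ID-based BioMart lookup concrete, here is a minimal sketch that builds a BioMart XML query for a list of gene IDs. The dataset, filter and attribute names follow the Ensembl mart and are assumptions, as is the example gene ID; a local BioMart server would expose its own names. The function only builds the query text; sending it to a server's martservice endpoint is left out.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filter_name, gene_ids, attributes):
    """Return a BioMart XML query that filters a dataset by gene IDs."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    dataset_el = ET.SubElement(
        query, "Dataset", {"name": dataset, "interface": "default"}
    )
    # Several IDs are passed as one comma-separated filter value.
    ET.SubElement(
        dataset_el, "Filter",
        {"name": filter_name, "value": ",".join(gene_ids)},
    )
    for attribute in attributes:
        ET.SubElement(dataset_el, "Attribute", {"name": attribute})
    header = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return header + ET.tostring(query, encoding="unicode")


# Assumed Ensembl-style names; replace them with those of the mart in use.
print(build_biomart_query(
    dataset="hsapiens_gene_ensembl",
    filter_name="ensembl_gene_id",
    gene_ids=["ENSG00000139618"],
    attributes=["ensembl_gene_id", "external_gene_name", "description"],
))
```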

*Multifasta format (nucleotide sequences)

ANNOTATOR accepts uploads of nucleotide sequences in multifasta format -> Prim-seq-an algorithm

*Genome data

Download ESTs of Halocynthia roretzi from NCBI Taxonomy

TogoDB by Toshiaki - upload data in table format (the table services for uploading data will be accessible via WS)

*Other comments from the developer side

jORCA by Oswaldo

- mapped WABI services
- jORCA: the WABI WSDL application can run on a local machine after installation
- jORCA can tell which kinds of analysis can be run on our multifasta file, or on a file in any other format. For example, using Magallanes: INB*, putting "FASTA" in the Find box returns 23 tools that can be used for analysis.
- For analysis, use myexperiment.org

*Magallanes: INB (service discovery that finds which kinds of analysis are available for your data format)

TODOs

Requests from use cases to developers

*(TO: IntAct, Cytoscape)
It would be nice if users could retrieve PPIs that are expressed in any tissue by using an option. (From: Vachiranee)

*(TO: ANNOTATOR)
In the conversation with mostly biology-oriented participants of the Hackathon,
we found the following requests for ANNOTATOR to suit their needs:

- Ability to run batch jobs, i.e. to annotate a large number of proteins with our software (requested by many)
- ANNOTATOR should be downloadable and deployable locally
- We were offered a potential collaboration using the computing power of the Titech TSUBAME supercomputer. This means ANNOTATOR's jobs need a mechanism for submission to an external site.
- For better results, consistency and the ability to replicate the ANNOTATOR pipeline at a remote location, we need periodic, automatic updating of the databases underlying the algorithms.
- What are the differences between ANNOTATOR and InterProScan?
- Ability to save analysis results in XML format for further analysis
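The batch-job and XML-export requests can be pictured with the sketch below: it reads multifasta text, splits the records into fixed-size batches, and serialises per-sequence results as XML. This is not ANNOTATOR code; the function names, the XML element names and the placeholder "annotation" (sequence length) are all hypothetical.

```python
import xml.etree.ElementTree as ET


def read_multifasta(text):
    """Parse multifasta text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batches(records, size):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def results_to_xml(results):
    """Serialise (sequence_id, annotation) pairs with made-up element names."""
    root = ET.Element("results")
    for seq_id, annotation in results:
        entry = ET.SubElement(root, "sequence", {"id": seq_id})
        entry.text = annotation
    return ET.tostring(root, encoding="unicode")


# Toy input; a real run would read the 53-sequence multifasta file instead.
example = ">seq1\nMKT\nAYI\n>seq2\nMSD\n>seq3\nMLV\n"
for batch in batches(read_multifasta(example), size=2):
    # Placeholder annotation: the sequence length stands in for a real result.
    results = [(header, f"length={len(seq)}") for header, seq in batch]
    print(results_to_xml(results))
```

Each batch could then be submitted as one job, locally or to an external site, and the XML output kept for downstream tools.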
