= Satellite meeting for Use-cases development and documentation =

== Topics ==
Users provided several examples of biological data. The service developers explained how to use their programs on the data. Finally, users described several potential workflows for the computational part of their research.

== Participants ==
 * Toshiaki Katayama
 * Young Joo Kim
 * Keun-Joon Park
 * Yunsun Nam
 * Arek Kasprzyk
 * Syed Haider
 * Shuichi Kawashima
 * Takeshi Kawashima
 * Raoul JP Bonnal
 * Tatsuya Nishizawa
 * Oswaldo Trelles
 * José M. Fernández
 * Paul Gordon
 * Vachiranee Limviphuvadh
 * Tobias Gattermayer
 * Riu Yamashita
 * Fumikazu Konishi
 * Takatomo Fujisawa

== Targets ==

== Date ==
 * 2009/3/19 15:00-18:00
 * anytime

== Room ==
 * 3F

== Notes ==
'''Data for testing'''

1) SNP data containing 262,338 SNPs (stroke patients vs. normal) from Affymetrix GeneChip (provided by Prof. Kim)[[BR]]
2) Amino acid sequences of 53 genes (multifasta format) located at the disease map locus of one form of epilepsy (provided by Vachiranee)[[BR]]
3) Nucleotide sequences (multifasta format) (provided by Riu)[[BR]]
4) Genome data (provided by Takeshi)[[BR]]

'''Software'''

1) BioMart[[BR]]
2) Galaxy[[BR]]
3) jORCA[[BR]]
4) ANNOTATOR[[BR]]

== Results ==
Question
 - Takeshi asked how to annotate in-house data by comparing it with public genome data using BioMart.
 - For example, how to annotate Halocynthia roretzi or Molgula tectiformis ESTs by comparing them with Ciona intestinalis and Ciona savignyi using BioMart.

Answer
 - Convert the Halocynthia data into BioMart format. Install a BioMart server locally. Then merge the public DB and the local data.
 - The BioMart format is simple.

Takeshi presented an example
 - a minor organism (e.g. H. roretzi, a close relative of Ciona)
 - how to analyze Halocynthia roretzi and M. tectiformis, which have little annotation
 - type of data: ESTs, assembled EST clusters, etc.
 - potential flow: BioMart -> ANNOTATOR -> TogoDB

[[Image(bh2009-usecase_1.pdf)]]

'''*SNPs data'''[[BR]]
ANNOTATOR currently cannot analyse SNP data itself, but it can analyse data for genes that contain the SNPs of interest.[[BR]]
Galaxy (http://galaxyproject.org) and RGenetics (http://rgenetics.org/): quality control, ancestry, case-control analysis, TDT, other statistical tests[[BR]]
 - dbSNP can be linked to Galaxy for further analysis

[[Image(bh2009-usecase_2.pdf)]]

'''*Multifasta format (amino acid sequences)'''[[BR]]
ANNOTATOR can upload amino acid sequences in multifasta format -> Prim-seq-an algorithm[[BR]]
BioMart can upload GeneIDs, but not sequences, to retrieve the information associated with each GeneID[[BR]]
jORCA provides a list of the analyses that can be run on FASTA-format data[[BR]]

'''*Multifasta format (nucleotide sequences)'''[[BR]]
ANNOTATOR can upload nucleotide sequences in multifasta format -> Prim-seq-an algorithm[[BR]]

'''*Genome data'''[[BR]]
Download ESTs of Halocynthia roretzi from NCBI/Taxonomy

TogoDB by Toshiaki
 - upload in table format (the table services for uploading data will be accessible via WS)

'''*Other comments from the developer side'''[[BR]]
jORCA by Oswaldo
 - mapped WABI services
 - jORCA: the WABI WSDL application can run on a local machine after installation
 - jORCA can tell which kinds of analysis are possible for our multifasta file or a file in any other format. For example, using Magallanes: INB*, entering "FASTA" in the Find box returns 23 tools that can be used for analysis. For analysis, use myexperiment.org

*Magallanes: INB (service discovery that finds which kinds of analysis are available for your data format)

== TODOs ==
1) How to combine in-house data with the public BioMart? => KAAS, blast2GO etc.[[BR]]
2) How easy is it to install BioMart locally?[[BR]]
3) How to modify the design of the TogoDB interface?[[BR]]
4) Users do not have their own data repository site
(for TogoDB)[[BR]]

== Requests from use cases to developers ==
*(TO: [http://www.ebi.ac.uk/intact IntAct], Cytoscape)[[BR]]
It would be nice if users could retrieve PPIs that are expressed in any given tissue by using an option. (From: Vachiranee)[[BR]]
At this moment, the only way that Bruno@!IntAct knows of doing this is to get the list of proteins from a specific tissue (using [http://www.ebi.ac.uk/pride PRIDE]) and then use the list of protein accessions to find PPIs in the molecular interaction databases, such as [http://www.ebi.ac.uk/intact IntAct]. If there is interest, we (!IntAct) could find a way to include this option in a future release.

*(TO: ANNOTATOR)[[BR]]
In conversation with the mostly biology-oriented participants of the Hackathon,[[BR]]
we found the following requests for ANNOTATOR to suit their needs:[[BR]]
- Ability to run batch jobs, i.e. to have a large number of proteins annotated by our software (requested by many)[[BR]]
- ANNOTATOR should be downloadable and deployable locally[[BR]]
- We were offered a potential collaboration using the computing power of the Titech TSUBAME supercomputer. This means that ANNOTATOR jobs need a mechanism for submission to an external site.[[BR]]
- For better results, consistency, and the ability to replicate the ANNOTATOR pipeline at a remote location, we need periodic, automatic updating of the databases underlying the algorithms.[[BR]]
- What are the differences between ANNOTATOR and InterProScan?[[BR]]
- Ability to save analysis results in XML format for further analysis[[BR]]

*(TO: Galaxy)[[BR]]
- I'd like to pass a list of IDs from Galaxy (such as Gene IDs or IPR numbers) into the text field of a BioMart filter.
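
The Galaxy request above amounts to turning a list of IDs into the value of a BioMart filter. A minimal sketch of that step follows, assuming the BioMart XML query format, in which an ID-list filter takes its IDs as one comma-separated value. The function name `build_biomart_query` is hypothetical, and the dataset, filter, and attribute names and the two gene IDs are placeholders that must be checked against the target mart.

```python
from xml.etree import ElementTree as ET


def build_biomart_query(dataset, filter_name, ids, attributes):
    """Build a BioMart XML query that filters `dataset` on a list of IDs.

    Hypothetical helper for illustration; it only builds the query text
    and does not contact any server.
    """
    # Strip whitespace and drop empty entries, e.g. blank lines in an ID file.
    clean_ids = [i.strip() for i in ids if i.strip()]

    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # The whole ID list goes into a single comma-separated filter value.
    ET.SubElement(ds, "Filter",
                  {"name": filter_name, "value": ",".join(clean_ids)})
    for attr in attributes:
        ET.SubElement(ds, "Attribute", {"name": attr})

    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body


# Example call; all names and IDs below are assumptions for illustration.
query_xml = build_biomart_query(
    dataset="hsapiens_gene_ensembl",
    filter_name="ensembl_gene_id",
    ids=["ENSG00000139618", "ENSG00000141510"],
    attributes=["ensembl_gene_id", "external_gene_name"],
)
print(query_xml)
```

The resulting text is what would be pasted into, or posted to, a BioMart query interface. How Galaxy would hand its ID list to such a step is left open here, as in the request itself.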