Version 1 (modified by akinjo, 14 years ago)


(The following is a draft of a part of a paper to be submitted.)

A DDBJ-KEGG-PDBj workflow: from pathways to protein-protein interactions

The objective of this working group is to examine the potential and obstacles of web services by implementing a real-life use case. The goal of the workflow is to enumerate possible physical protein-protein interactions among the proteins in a biochemical pathway. More specifically, the workflow proceeds as follows. (1) The user provides a KEGG pathway ID. (2) The amino acid sequence of each enzyme in the specified pathway is extracted. (3) For each amino acid sequence, a BLAST search is run against the Swiss-Prot database. (4) A phylogenetic profile (a species-by-enzyme matrix) is constructed by identifying the top hit for each protein in each species. (5) For each species in the phylogenetic profile, a BLAST search against PDB is run for each amino acid sequence. (6) If two amino acid sequences (of the same species) have homologs in the same PDB entry, they are inferred to be in possible contact and hence predicted to be an interacting pair.
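
Steps (4) and (6) can be illustrated with a minimal Python sketch. It is not the implementation used by the working group: it assumes the BLAST results have already been retrieved and parsed, and the record layouts and the function names build_phylogenetic_profile and predict_interactions are hypothetical, chosen only for illustration. Step (4) is taken here to mean keeping the highest-scoring Swiss-Prot hit for each species and enzyme, and step (6) is taken to mean intersecting the sets of PDB entries hit by two enzymes of the same species.

```python
from collections import defaultdict
from itertools import combinations


def build_phylogenetic_profile(hits):
    """Step (4): keep the best-scoring hit per (species, enzyme).

    `hits` is an iterable of (enzyme, species, subject_id, score) tuples;
    this layout is an assumption, not the output format of any service.
    Returns {species: {enzyme: subject_id}}, i.e. a sparse
    species-by-enzyme matrix.
    """
    best = {}
    for enzyme, species, subject_id, score in hits:
        key = (species, enzyme)
        if key not in best or score > best[key][1]:
            best[key] = (subject_id, score)

    profile = defaultdict(dict)
    for (species, enzyme), (subject_id, _score) in best.items():
        profile[species][enzyme] = subject_id
    return dict(profile)


def predict_interactions(pdb_hits):
    """Step (6): pair enzymes of one species that share a PDB entry.

    `pdb_hits` maps (species, enzyme) to the set of PDB entry IDs that
    contain a homolog of that sequence (again an assumed layout).
    Returns a list of (species, enzyme_a, enzyme_b, shared_entries).
    """
    by_species = defaultdict(dict)
    for (species, enzyme), entries in pdb_hits.items():
        by_species[species][enzyme] = set(entries)

    predictions = []
    for species, enzymes in sorted(by_species.items()):
        # Compare every unordered pair of enzymes within the same species.
        for a, b in combinations(sorted(enzymes), 2):
            shared = enzymes[a] & enzymes[b]
            if shared:
                predictions.append((species, a, b, sorted(shared)))
    return predictions
```

In this sketch, two enzymes are reported as a candidate interacting pair only when they belong to the same species and at least one PDB entry appears in both of their hit sets; how the homologs are thresholded (for example by E-value) is left to the caller.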

To implement the workflow outlined above, we used the SOAP and REST APIs of DDBJ (http://www.ddbj.nig.ac.jp/), KEGG (http://www.genome.jp/) and PDBj (http://www.pdbj.org/).