37 | | '''Examples of text mining approaches''' |
38 | | * Related PNE Japanese articles. |
39 | | PubMed provides "Related Articles" for each MEDLINE abstract. DBCLS is developing a tool that provides "Related Protein, Nucleic acid and Enzyme (PNE) Japanese Articles" for all MEDLINE abstracts. |
40 | | In this approach, we apply dictionaries to recognize biomedical terms and translate them into Japanese, and we use GALAXY to display the results. |
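The dictionary step described above can be illustrated with a minimal sketch. It is not the DBCLS implementation: the dictionary entries, the function names `build_pattern` and `recognize_terms`, and the longest-match strategy are all assumptions chosen for illustration.

```python
import re

# Hypothetical English-to-Japanese term dictionary (illustrative entries only;
# a real system would load a much larger curated dictionary).
TERM_DICT = {
    "transcription factor": "転写因子",
    "nucleic acid": "核酸",
    "protein": "タンパク質",
    "enzyme": "酵素",
}


def build_pattern(terms):
    """Compile one case-insensitive regex that matches any dictionary term."""
    # Longer terms come first so a multi-word term is preferred over a
    # shorter term that starts at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


def recognize_terms(text, term_dict=TERM_DICT):
    """Return (matched_text, japanese_translation) for each dictionary hit."""
    pattern = build_pattern(term_dict)
    return [
        (match.group(0), term_dict[match.group(0).lower()])
        for match in pattern.finditer(text)
    ]


if __name__ == "__main__":
    abstract = "The transcription factor binds a protein."
    for english, japanese in recognize_terms(abstract):
        print(english, "->", japanese)
```

This sketch matches exact word forms only, so an inflected form such as "proteins" is not recognized. A production system would need to handle plurals, spelling variants and ambiguous terms.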
41 | | |
42 | | * Prediction of Protein Subcellular Localization. |
43 | | The [http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/home/wiki.cgi?page=GENIA+corpus/ GENIA corpus] is a collection of biomedical literature compiled and annotated within the scope of the [http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/home/wiki.cgi?page=GENIA+Project/ GENIA project]. The goal of the project is to develop text mining (TM) systems for the domain of molecular biology, and the corpus has been developed to provide reference material for the development of bio-TM systems. The corpus currently contains 1,999 MEDLINE abstracts, which were collected using the three MeSH terms "human", "blood cells", and "transcription factors". It has been annotated with various levels of linguistic and semantic information. |
44 | | For the cellular components in the GENIA ontology, the Japan Biological Information Research Center (JBIRC) has constructed a new corpus that annotates protein subcellular locations, and has developed a machine learning-based prediction tool that recognizes the subcellular locations of proteins. |