Changes between Version 42 and Version 43 of SatelliteBigData

Timestamp:
2009/03/20 14:11:18 (15 years ago)
Author:
severin
Comment:

--

  • SatelliteBigData

    v42 v43  
    8383}}} 
    8484 
     85Currently, the available public resources such as SRA, GEO, and ArrayExpress provide query facilities only on the metadata of the experiments surrounding the data.  The data itself is available as files to download (often in the original format), but the archives do not provide facilities to explore the data externally and ask biological questions of it.  This forces anyone who wants to explore a dataset to download it into a local integration system before they can ask their biological questions.  
     86 
     87Working with existing big data 
     88 - SRA, GEO, ArrayExpress: today they provide only the metadata of a dataset, not the ability to explore the actual data 
     89 - for now, most of us pull the whole dataset down and then work with it locally 
     90 - sometimes it is hard even to send data in for submission; sometimes DVDs are used to move it around. not optimal. 
     91 - what are the queries we want to run? 
     92   - what are the reads in this region? definitely 
     93 - SRA may not want to do everything, but they will not turn data down and they want everyone to send them the published data 
     94 - maybe not all data will end up in ONE archive (because it is so big). we may need to query multiple centers to find all data (DDBJ, SRA, GEO, ArrayExpress, Korea? China?) 
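
A rough sketch of the "reads in this region" query from the list above, fanned out over several data centers. Everything here is an assumption made for illustration: the `Read` record, the `reads_in_region` and `query_archives` helpers, and the in-memory "archives" are hypothetical stand-ins, not an API offered by SRA, GEO, ArrayExpress, or DDBJ. Coordinates are assumed to be 0-based and half-open.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Read:
    """Hypothetical minimal record for one aligned read."""
    read_id: str
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive (assumed convention)


def reads_in_region(reads: Iterable[Read], chrom: str, start: int, end: int) -> List[Read]:
    """Return the reads that overlap [start, end) on the given chromosome."""
    return [
        r for r in reads
        if r.chrom == chrom and r.start < end and r.end > start
    ]


def query_archives(
    archives: Dict[str, Iterable[Read]], chrom: str, start: int, end: int
) -> Dict[str, List[Read]]:
    """Run the same region query against every archive and keep results per archive.

    In a real setting each archive would be a remote service; here each one
    is just an in-memory collection of reads.
    """
    return {
        name: reads_in_region(reads, chrom, start, end)
        for name, reads in archives.items()
    }


# Toy data standing in for two separate centers.
archives = {
    "archive_a": [Read("r1", "chr1", 100, 136), Read("r2", "chr1", 500, 536)],
    "archive_b": [Read("r3", "chr1", 120, 156), Read("r4", "chr2", 100, 136)],
}

hits = query_archives(archives, "chr1", 110, 130)
for name, found in hits.items():
    print(name, [r.read_id for r in found])
```

The linear scan is only meant to pin down what the query means. An archive holding data at this scale would presumably need an index over genomic position so that a region query does not touch every read.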
     95 
     96 
    8597==== Processing ==== 
    8698