Changes between Version 69 and Version 70 of SatelliteBigData

Timestamp: 2009/03/20 15:13:49 (15 years ago)
Author: severin
Comment: --

  • SatelliteBigData

    v69  v70
     98   98    - what are the reads in this region, definitely
     99   99   - SRA may not want to do everything either, but they will not turn data down, and they want everyone to send them the published data
         100   - SRA is still evaluating technology to do region-based access to short reads
         101
    100  102
    101  103  What are the common ways we would want to query?

    148  150
    149  151
    150       Data production centers
    151        - RIKEN OSC-LSA [http://www.osc.riken.jp/] is producing lots of data, but this data must be managed, manipulated, and mined for biology before it can be published and released to the public.  EdgeExpressDB (eeDB) was developed during the FANTOM4 project and is now being used for in-house big data management and visualization of big datasets.  eeDB is effectively an object database implemented as an API and webservices.  The current version of the eeDB API toolkit and webservices is written in Perl with a narrow/deep MySQL snowflake schema.  This first-generation system can manipulate short-read data for our internal research purposes and is proving to scale very well.  The system is currently being ported to C and file indexes, and based on the prototype code, we are expecting around a 20x-100x performance boost.  eeDB works very easily with node and network, sequence tag, mapping, and expression data at the level of billions of elements.  Queries can access individual objects and edges, and can work with streams or sets of objects queried by region, node, or network.
    152        - SRA is still evaluating technology to do region-based access to short reads
    153
    154
    155  152
    156  153  IN THE END the goal is to find biology. Having access to the individual data elements is critical, and they cannot just be locked away inside files that cannot be accessed internally.
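
The notes above ask what the common query patterns would be and name one of them: fetching the reads that fall in a given region. As a rough illustration of that single pattern, here is a minimal in-memory sketch. It is not eeDB or SRA code: the `Read` record, the `RegionIndex` class, and the 1-based inclusive coordinates are all assumptions made for this example, and a real system would keep the index on disk rather than in Python lists.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """A mapped short read (hypothetical record; 1-based inclusive coordinates)."""
    name: str
    chrom: str
    start: int
    end: int


class RegionIndex:
    """Hypothetical index: reads grouped by chromosome and sorted by start."""

    def __init__(self, reads):
        by_chrom = defaultdict(list)
        for read in reads:
            by_chrom[read.chrom].append(read)
        self._reads = {}
        self._starts = {}
        self._max_len = {}
        for chrom, items in by_chrom.items():
            items.sort(key=lambda r: (r.start, r.end))
            self._reads[chrom] = items
            self._starts[chrom] = [r.start for r in items]
            # Longest read on this chromosome bounds how far left we must look.
            self._max_len[chrom] = max(r.end - r.start + 1 for r in items)

    def query(self, chrom, start, end):
        """Return every read on chrom that overlaps the region [start, end]."""
        if chrom not in self._reads:
            return []
        starts = self._starts[chrom]
        # A read that overlaps the region cannot start before
        # start - max_len + 1, and cannot start after end.
        lo = bisect_left(starts, start - self._max_len[chrom] + 1)
        hi = bisect_right(starts, end)
        # Drop candidates that end before the region begins.
        return [r for r in self._reads[chrom][lo:hi] if r.end >= start]


# Made-up reads, for illustration only.
reads = [
    Read("r1", "chr1", 100, 135),
    Read("r2", "chr1", 130, 165),
    Read("r3", "chr1", 500, 535),
    Read("r4", "chr2", 120, 155),
]
index = RegionIndex(reads)
hits = [r.name for r in index.query("chr1", 134, 140)]
print(hits)  # ['r1', 'r2']
```

The binary search narrows the candidates to reads whose start lies close enough to the region, so the cost of a query depends on the local read density rather than on the total number of reads. Individual reads stay addressable as objects, which is the point the notes insist on.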