Version 14 (modified by ngoto, 16 years ago)

Satellite meeting for big datasets

Topics

As more and more projects produce large datasets, we may need new paradigms for working with them. The conventional approach of storing everything in a database and querying it with SQL has become challenging or simply impossible. Data are now more often kept in their original file formats.

The objective of this meeting is to exchange ideas on this topic and to discuss possible solutions and practices.

Topics:

  • Next-gen sequencing
  • Protein-interaction networks?
  • Cloud storage/compute?

Chairperson

  • Jan Aerts

Date

  • 18th March, 2009

Room

  • Meeting Room (1F)

Attendees

  • Jan Aerts
  • José María Fernández
  • Todd Harris
  • Yunsun Nam
  • Pierre Lindenbaum
  • Keun-Joon Park
  • Raoul Jean Pierre Bonnal
  • Yasukazu "yaskaz" Nakamura (yaskaz@…)

Notes

Data storage will not be a problem (if you have enough money). But downloading and/or manipulating big sequence data must be a problem for both the provider (us = DDBJ) and the user. Is there a possible solution? And I have a big question: do you really need and/or use the "Short Read Archive"? -- yaskaz

Results

TODOs

  • Design tests for BioSQL on different scenarios related to big data (BigD)
    • Can BioSQL be used for short reads?
  • Test BioSQL performance on BigD
