Changes between Version 56 and Version 57 of SatelliteBigData
- Timestamp: 2009/03/20 14:37:22
SatelliteBigData
Protein-protein interaction datasets typically consist of a relatively small number of small objects. This type of data requires no advanced storage systems or tweaks to common systems; a simple RDBMS will do.

Data like genome sequences and assembly data involve larger objects (1 genome sequence -> 3Gb), but are still easily manageable on a standard filesystem.

v56: Really big objects such as the data from simulations [IS THIS CORRECT?] require specialized storage systems such as [http://opensolaris.org/os/community/zfs/ ZFS], [http://wiki.lustre.org/index.php?title=Main_Page Lustre], [http://oss.sgi.com/projects/xfs/ XFS], [http://www.pvfs.org/ PVFS2] or the future Linux filesystem [http://btrfs.wiki.kernel.org/index.php/Main_Page Btrfs].

v57: Really big objects such as the data from simulations [IS THIS CORRECT?] require specialized storage systems such as [http://oss.sgi.com/projects/xfs/ XFS], [http://opensolaris.org/os/community/zfs/ ZFS], [http://wiki.lustre.org/index.php?title=Main_Page Lustre], [http://www.pvfs.org/ PVFS2] or the future Linux filesystem [http://btrfs.wiki.kernel.org/index.php/Main_Page Btrfs].

In contrast to the above, diffraction results, microarray results or next-gen sequencing reads involve a large number of objects, which makes them more difficult to query. They are typically still stored in an RDBMS, but may require a departure from the normalized relational database model, for example by moving to databases based on a key/value model (e.g. [http://www.oracle.com/technology/products/berkeley-db/index.html BerkeleyDB], [http://tokyocabinet.sourceforge.net/index.html Tokyo Cabinet], BigTable, [http://hadoop.apache.org/core/ Hadoop]).
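
To make the contrast between a relational table and a key/value model concrete, here is a minimal sketch in Python. It is an illustration only: the read identifiers, sequences and function names are hypothetical, and the standard-library modules `sqlite3` and `dbm.dumb` stand in for a full RDBMS and for a key/value store such as BerkeleyDB or Tokyo Cabinet.

```python
import dbm.dumb
import json
import os
import sqlite3
import tempfile

# Hypothetical sequencing reads: (read id, sequence, quality string).
READS = [
    ("read_0001", "ACGTACGT", "IIIIHHHH"),
    ("read_0002", "TTGACCAA", "IIHHGGFF"),
    ("read_0003", "GGCATCGA", "HHHHIIII"),
]


def store_relational(reads):
    """Store reads as rows with typed columns in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reads (id TEXT PRIMARY KEY, sequence TEXT, quality TEXT)"
    )
    conn.executemany("INSERT INTO reads VALUES (?, ?, ?)", reads)
    conn.commit()
    return conn


def lookup_relational(conn, read_id):
    """Fetch one sequence by primary key through SQL."""
    row = conn.execute(
        "SELECT sequence FROM reads WHERE id = ?", (read_id,)
    ).fetchone()
    return row[0] if row else None


def store_key_value(reads, path):
    """Store each read as one opaque JSON value under its id."""
    with dbm.dumb.open(path, "c") as db:
        for read_id, sequence, quality in reads:
            db[read_id] = json.dumps({"sequence": sequence, "quality": quality})


def lookup_key_value(path, read_id):
    """Fetch one sequence by key; the store knows nothing about the fields."""
    with dbm.dumb.open(path, "r") as db:
        raw = db.get(read_id)
    return json.loads(raw)["sequence"] if raw is not None else None


if __name__ == "__main__":
    conn = store_relational(READS)
    kv_path = os.path.join(tempfile.mkdtemp(), "reads")
    store_key_value(READS, kv_path)
    print(lookup_relational(conn, "read_0002"))
    print(lookup_key_value(kv_path, "read_0002"))
```

In this sketch the key/value variant gives up ad hoc queries over individual fields (the value is an opaque blob to the store) in exchange for a very simple lookup path by key, which is the kind of trade-off the paragraph above alludes to.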