- https://en.wikipedia.org/wiki/Document-oriented_database
- How to create an RDF DB, https://www.quora.com/How-do-I-create-a-RDF-database
This cheat sheet presents a basic blueprint for applying MapReduce to large-scale, unstructured data processing problems by showing how to deploy and use an Apache Hadoop computational cluster. http://bit.ly/StartedApache
It complements DZone Refcardz #43 and #103, which introduce high-performance computational scalability and high-volume data handling techniques, including MapReduce. Download here: http://opensourceuniverse.tradepub.com/free/w_dzon04/prgm.cgi
- Also pick up the Apache HBase: The NoSQL Database for Hadoop and Big Data cheat sheet here: http://opensourceuniverse.tradepub.com/free/w_dzon07/prgm.cgi
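The MapReduce pattern the Hadoop cheat sheets cover can be sketched in plain Python: a map phase that emits key/value pairs, a shuffle that groups values by key, and a reduce phase that aggregates each group. This is only an illustrative simulation of the three phases (the canonical word count), not Hadoop's actual API:

```python
from collections import defaultdict

def map_phase(documents):
    # Mapper: emit (word, 1) for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle/sort: group all emitted values by key, as Hadoop
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data big cluster", "data pipeline"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)  # {'big': 2, 'data': 2, 'cluster': 1, 'pipeline': 1}
```

In a real Hadoop job the mapper and reducer run as separate distributed tasks and the shuffle happens over the network; the data flow, however, is exactly this.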
- https://www.arangodb.com/2016/06/arangodb-3-0-a-solid-ground-to-scale/
- https://github.com/shekhargulati/52-technologies-in-2016/blob/master/13-arangodb/README.md
- Arango vs. MongoDB: https://www.arangodb.com/tutorials/mongodb-to-arangodb-tutorial/
- Cheat sheet, http://www.arangodb.org/2012/08/05/arangodb-shell-cheat-sheet
- Try Arango online, http://www.arangodb.org/try
- Cayley for RDF data, https://news.ycombinator.com/item?id=7946024
- https://github.com/google/cayley
- https://johngoodwin225.wordpress.com/2014/06/29/quick-play-with-cayley-graph-db-and-ordnance-survey-linked-data/
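Cayley-style graph stores model data as subject–predicate–object triples (Cayley itself stores quads) and answer queries by pattern matching with variables. A toy in-memory triple store makes the idea concrete; this is a hypothetical sketch, not Cayley's API (Cayley is queried with Gizmo or GraphQL):

```python
class TripleStore:
    """Toy in-memory RDF-style triple store for illustration only."""
    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, s=None, p=None, o=None):
        # None acts as a wildcard, like a variable in a SPARQL pattern.
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

g = TripleStore()
g.add("cayley", "written_in", "go")
g.add("cayley", "type", "graph_db")
g.add("neo4j", "type", "graph_db")

# "Which subjects have type graph_db?"
print(sorted(t[0] for t in g.match(p="type", o="graph_db")))
```

The same wildcard-matching idea, indexed and distributed, is what makes triple/quad stores efficient at link-following queries over linked data like the Ordnance Survey dataset above.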
- Time series in MongoDB: https://dev.to/riccardo_cardin/implementing-time-series-in-mongodb
- http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
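The classic approach to time series in MongoDB is the bucket pattern: instead of one document per reading, store one document per sensor per time window, with pre-aggregated fields. The sketch below builds such bucket documents as plain dicts (hypothetical field names; no pymongo connection, since the shaping logic is the interesting part):

```python
from collections import defaultdict
from datetime import datetime

def bucket_readings(readings):
    """Group raw (sensor_id, timestamp, value) readings into one
    document per sensor per hour -- the MongoDB bucket pattern."""
    buckets = defaultdict(lambda: {"measurements": [], "count": 0, "sum": 0.0})
    for sensor_id, ts, value in readings:
        # Truncate the timestamp to the hour to pick the bucket.
        key = (sensor_id, ts.replace(minute=0, second=0, microsecond=0))
        doc = buckets[key]
        doc["measurements"].append({"ts": ts, "value": value})
        doc["count"] += 1
        doc["sum"] += value
    # Shape each bucket like the document you would insert with pymongo.
    return [{"sensor_id": sid, "bucket_start": start, **doc}
            for (sid, start), doc in buckets.items()]

readings = [
    ("s1", datetime(2024, 1, 1, 10, 5), 21.0),
    ("s1", datetime(2024, 1, 1, 10, 35), 23.0),
    ("s1", datetime(2024, 1, 1, 11, 1), 22.0),
]
docs = bucket_readings(readings)
print(len(docs))  # 2 -- one bucket for 10:00, one for 11:00
```

Bucketing cuts document count and index size dramatically and makes per-window averages a single field read (`sum / count`) instead of an aggregation over thousands of documents.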
- Neo4j is a robust, high-performance, scalable graph NoSQL database for the complex, connected-data challenges that enterprises face today: http://neo4j.org/
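The "connected data" queries Neo4j excels at are multi-hop traversals, e.g. friends-of-friends, which in Cypher would be a `MATCH (p)-[:KNOWS]-()-[:KNOWS]-(f)` pattern. A pure-Python sketch over an adjacency map (not the Neo4j API) shows the shape of such a query:

```python
def friends_of_friends(graph, person):
    """Depth-2 neighbourhood query: people reachable in exactly
    two hops, excluding the person and their direct friends.
    Illustrative sketch of a graph traversal, not Neo4j's API."""
    direct = set(graph.get(person, []))
    fof = set()
    for friend in direct:
        fof.update(graph.get(friend, []))
    return fof - direct - {person}

graph = {
    "ann": ["bob"],
    "bob": ["ann", "cid"],
    "cid": ["bob"],
}
print(friends_of_friends(graph, "ann"))  # {'cid'}
```

In a relational store this query becomes self-joins that grow with the data; a native graph database follows pointers from node to node, so traversal cost depends on the neighbourhood size, not the total table size.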
An open-source artificial-intelligence framework (OpenCog) with a graph database, dubbed the AtomSpace, that holds terms, atomic formulas, sentences, and relationships as hypergraphs and gives them a probabilistic truth-value interpretation.
- Working with large datasets, http://news.ycombinator.com/item?id=3614706 and http://www.bigfastblog.com/how-to-get-experience-working-with-large-datasets
SciDB, the array database.
- http://www.datanami.com/2014/04/09/array_databases:_the_next_big_thing_in_data_analytics_/
- http://www.forbes.com/sites/petercohan/2014/02/07/paradigm4-the-next-big-thing-in-big-data/
- http://www.odbms.org/blog/2014/04/interview-mike-stonebraker-paul-brown/
- HDF5 to SciDB, https://groups.google.com/forum/#!topic/pydata/S3kLxyrizkI
- HDF5 vs. PyTables, http://stackoverflow.com/questions/7883646/exporting-from-importing-to-numpy-scipy-in-sqlite-and-hdf5-formats/7891137#7891137
- http://semanticommunity.info/AOL_Government/Data_Science_Visualizations_Past_Present_and_Future
- Check out SciDB, http://www.scidb.org/, a NoSQL database built specifically for scientific workloads, developed by Stonebraker and others. Here's a paper demonstrating how to use it: http://people.csail.mit.edu/pcm/papers/SciDB_Demo.pdf
- Data Duplication, Server Redundancy, and Master Failover, http://www.scidb.org/forum/viewtopic.php?f=6&t=1068
- https://github.com/jmeehan16/whitematter/blob/master/read_me.txt
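The core idea behind array databases like SciDB is that a multidimensional array is split into fixed-size chunks distributed across a cluster, and aggregates are pushed down to each chunk. A minimal pure-Python sketch of that partitioning (illustrative only; SciDB's real chunking is declared in its AFL/AQL schema):

```python
def chunk_array(data, chunk_rows, chunk_cols):
    """Split a dense 2-D list into fixed-size chunks keyed by their
    origin coordinates, mimicking how an array database partitions
    an array across nodes. Illustrative sketch, not SciDB's API."""
    n_rows, n_cols = len(data), len(data[0])
    chunks = {}
    for r0 in range(0, n_rows, chunk_rows):
        for c0 in range(0, n_cols, chunk_cols):
            chunks[(r0, c0)] = [row[c0:c0 + chunk_cols]
                                for row in data[r0:r0 + chunk_rows]]
    return chunks

def chunk_sums(chunks):
    # Per-chunk aggregate -- the kind of work pushed to each node.
    return {origin: sum(sum(row) for row in block)
            for origin, block in chunks.items()}

data = [[1, 2, 3, 4],
        [5, 6, 7, 8]]
chunks = chunk_array(data, 2, 2)
print(chunk_sums(chunks))  # {(0, 0): 14, (0, 2): 22}
```

Because chunks carry their own coordinates, operations like windowed aggregates or joins on dimensions can run chunk-locally and in parallel, which is what makes the array model attractive for large scientific datasets such as those exported from HDF5.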