This is a work in progress.
While it is probably not exhaustive, my intention is to provide a quick reference to BASE systems (Basically Available, Soft state, Eventually consistent, as opposed to ACID: Atomicity, Consistency, Isolation, Durability) that offers newcomers an overview of the existing projects in the field.
So far, I have been trying to fill in information about the following characteristics:
- Data model
- Partitioning
- Persistence
- Rebalancing (elasticity)
- Replication (clustering)
I have also included notes about the implementation language and the protocols that can be used with each solution.
If you think I should include other criteria, please let me know.
The projects included so far in the list: Cassandra, CloudBase, CouchDB, Dynomite, HBase, Hypertable, Kai, LightCloud, LucidDB, Memcached, MemcacheDB, MonetDB, MongoDB, Neptune, Redis, Ringo, Scalaris, ThruDB, Tokyo Cabinet + Tyrant, Voldemort.
Alternative Data Storages
Project | Data model | Partitioning | Persistence | Rebalancing | Replication |
---|---|---|---|---|---|
Cassandra | Column-family (BigTable[5], Dynamo[6]) | Y[n4] | disk | Y | Y |
CloudBase | HDFS/Hadoop[n3] | Y | disk | Y | Y |
CouchDB | Doc-oriented | ?[n2] | disk | ?[n2] | ?[n2] |
Dynomite | Blob (Dynamo[6]) | Y | pluggable | Y | Y |
HBase | Column-family (BigTable[5]) | Y | disk | Y | Y |
Hypertable | Column-family (BigTable[5]) | Y | DFS (HDFS) | ? | Y |
Kai | Blob | ? | disk | ? | ? |
LightCloud | check Tokyo Tyrant[n5] | ||||
LucidDB | Column-based | ? | disk | ? | N |
Memcached[n1] | Blob | Y | RAM | Y | N |
MemcacheDB | Blob | ? | BerkeleyDB | ? | Y |
MonetDB | |||||
MongoDB | Doc-oriented | Y | Y | ||
Neptune | |||||
Redis | |||||
Ringo | Blob | Y | disk | Y | Y |
Scalaris | Blob | Y | RAM | Y | |
ThruDB | Doc-oriented | ||||
Tokyo Cabinet + Tyrant | |||||
Voldemort | Structured / Blob / Text | Y | pluggable | N | Y |
Notes
- [n1] Memcached: a distributed memory object caching system
- [n2] CouchDB partitioning and replication: according to a 2009 Summer of Code proposal:
"While distributed deployments have been achieved with the help of proxies and smart external scripting, the core of CouchDB itself does not currently support distributing the database across multiple machines."
More references about CouchDB cluster:
- [n3] All other criteria for CloudBase have been deduced based on the HDFS/Hadoop capabilities
- [n4] Cassandra: Consistent hashing vs order-preserving partitioning in distributed databases
- [n5] LightCloud seems to be a set of management scripts (Python) for Tokyo Tyrant
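Note [n4] contrasts consistent hashing with order-preserving partitioning. The following is a minimal Python sketch of the two ideas, not the code of Cassandra or any other project in the table. The class names, the node names, and the use of MD5 with 64 ring points per node are assumptions made purely for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (MD5 is used only to spread keys)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Hash-based partitioner: a key goes to the first node point at or after its hash."""

    def __init__(self, nodes, points_per_node=64):
        self.points_per_node = points_per_node
        self._ring = []  # sorted list of (point, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node owns several points so that load is spread more evenly.
        for i in range(self.points_per_node):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


class OrderPreservingPartitioner:
    """Range-based partitioner: split points are raw keys, so key order is kept."""

    def __init__(self, split_points, nodes):
        if len(nodes) != len(split_points) + 1:
            raise ValueError("need exactly one more node than split points")
        self._splits = sorted(split_points)
        self._nodes = list(nodes)

    def node_for(self, key):
        return self._nodes[bisect.bisect_right(self._splits, key)]
```

With the hash ring, adding a node moves only the keys that now fall on the new node's points, which is what makes rebalancing cheap, but neighbouring keys end up scattered across nodes. With the range partitioner, neighbouring keys stay on the same node, so range scans are possible, at the cost of having to choose split points that keep the load balanced.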
Implementation details
Project | Impl. | Client protocol | Refs |
---|---|---|---|
Cassandra | Java | Thrift[4] | [1], [2], [3] |
CloudBase | Java | JDBC (Java) | |
CouchDB | Erlang | HTTP + JSON | [1], [2], [3] |
Dynomite | Erlang | Thrift[4] | [1], [3] |
HBase | Java | ||
Hypertable | C++ | C++ API, Thrift[4] | |
Kai | Erlang | ||
LightCloud | Python + Tokyo Tyrant | Python | |
LucidDB | Java/C++ | JDBC (Java) | |
Memcached | C | all* | |
MemcacheDB | C | all* (memcached protocol) | |
MonetDB | C | ||
MongoDB | C++ | API (Python, Java, Ruby, PHP, C++, Perl, Erlang) | |
Neptune | Java | ||
Redis | C | ||
Ringo | Erlang | HTTP | |
Scalaris | Erlang | ||
ThruDB | C | ||
Tokyo Cabinet + Tyrant | C | C, Perl, Ruby, Java, Lua | |
Voldemort | Java | Java |
Performance
I usually do not trust micro-benchmarks. I know that performance measuring is an art. But I also know that some are looking for this sort of data and sometimes even the smallest piece of information is more helpful than nothing.
Project | reads/s | writes/s | refs |
---|---|---|---|
Cassandra | |||
CloudBase | |||
CouchDB | |||
Dynomite | |||
HBase | |||
Hypertable | |||
Kai | |||
LightCloud | See: Tokyo Tyrant results + this | ||
LucidDB | |||
Memcached | here, 2007, here | ||
MemcacheDB | benchmark data | ||
MonetDB | |||
MongoDB | Performance testing | ||
Neptune | |||
Redis | |||
Ringo | |||
Scalaris | |||
ThruDB | |||
Tokyo Cabinet + Tyrant | |||
Voldemort |
Other projects
I have found a couple of other projects, but I couldn't decide whether they fit in. If you think I should include them, please let me know (a supporting argument would be highly appreciated).
I'd also like to mention FriendFeed's usage of MySQL, which, while not a new system in itself, was designed to behave like a BASE system.
17 comments:
Nice post. A pity that so many columns are empty (all in the perf. table, ~ 1/2 in the 1st table).
jakubholy,
I am also concerned about the performance table as it looks like there isn't much information out there. I have found some and I'll start filling it in immediately.
Anyways, I think the only option to get enough details would be to spread the link and hope that others will start sharing their information.
Note that for Tokyo Cabinet/Tyrant you can also use the memcache protocol but you don't get access to everything.
For performance see this post: http://anyall.org/blog/2009/04/performance-comparison-keyvalue-stores-for-language-model-counts/ but I don't think any benchmark can be relevant.
For me, facts like these are more important: Redis stores everything in memory and saves snapshots to disk, so your DB cannot exceed your memory size.
Hi Alex,
Can you update the CloudBase website link in your post: http://cloudbase.sourceforge.net
Thanks,
T
MongoDB: persistence: disk, rebalancing: Y
Other good potential columns would be:
Secondary indexes?
Sorting?
Anonymous: done.
dm:
1. can you point me to where MongoDB rebalancing is mentioned?
2. I'd be glad to add these new criteria, but can you please give more details about them?
1. the docs are incomplete at the moment - i will post something when they are updated.
2. what i was thinking is that some datastores are pure "key/value stores" where you can *only* query on the primary key. memcached is a good example of that. Some of the other databases let you query on any field or combination of fields. This is important for some use cases and would be good for folks to know which capabilities are in a given tool.
By secondary indexes, I mean one can create a DB index on a field that is not the primary key. This makes the "non primary key query" above fast.
Sorting is pretty clear -- a case where it is really helpful for the database to do the sorting is when it already has a btree in that order (then it is very fast). Also, for an "ORDER BY ... LIMIT ..." style operation, client sorting isn't efficient.
So i'll revise my previous comment and say good potential columns would be:
- query on non-primary key fields?
- secondary indexes?
- sorting?
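To make the three proposed columns concrete, here is a toy Python sketch of a store that supports primary-key lookups, a query on a non-primary-key field through a secondary index, and an "ORDER BY ... LIMIT ..." read served from the already sorted index. It is not modelled on any project in the list; `TinyStore`, its method names, and the sample records are hypothetical.

```python
import bisect


class TinyStore:
    """Toy in-memory store: primary-key lookups plus one sorted secondary index."""

    def __init__(self, indexed_field):
        self.indexed_field = indexed_field
        self._rows = {}   # primary key -> record (a dict)
        self._index = []  # sorted list of (field value, primary key)

    def put(self, key, record):
        if key in self._rows:
            # Drop the stale index entry before re-inserting the record.
            old = (self._rows[key][self.indexed_field], key)
            self._index.pop(bisect.bisect_left(self._index, old))
        self._rows[key] = record
        bisect.insort(self._index, (record[self.indexed_field], key))

    def get(self, key):
        """Pure key/value access: the only query a plain key/value store offers."""
        return self._rows[key]

    def find_by_field(self, value):
        """Query on a non-primary-key field via the index, without a full scan."""
        i = bisect.bisect_left(self._index, (value,))
        keys = []
        while i < len(self._index) and self._index[i][0] == value:
            keys.append(self._index[i][1])
            i += 1
        return keys

    def first_n_sorted(self, n):
        """ORDER BY indexed_field LIMIT n: read the first n index entries."""
        return [key for _, key in self._index[:n]]


store = TinyStore("age")
store.put("u1", {"name": "ann", "age": 31})
store.put("u2", {"name": "bob", "age": 25})
store.put("u3", {"name": "cy", "age": 31})
store.put("u4", {"name": "dee", "age": 19})
```

Here `store.find_by_field(31)` walks only the matching index entries, and `store.first_n_sorted(2)` never touches records beyond the first two. Without the index, both operations would require the client to fetch and inspect every record.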
Great list - thanks a bunch.
We are looking at a bunch of these technologies. They differ greatly in maturity, ability to deal with different sizes of data, failures, etc.
Our case: we have 180 million keys (32 chars in length). The values average 100 chars in length. I would say that reads are 100 times more frequent than writes. Which of the distributed key/value hash systems would you guys recommend? We care about redundancy, performance and ease of administration (who doesn't :). We run our platform on Java. Thanks.
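For a rough sense of scale, the figures in this comment can be turned into a raw payload size. The sketch below assumes one byte per character and ignores per-entry overhead, indexes, and replication, so a real store would need noticeably more space. The function name is hypothetical.

```python
def raw_payload_bytes(num_keys, key_bytes, avg_value_bytes):
    """Raw key + value bytes, assuming 1 byte per char and no storage overhead."""
    return num_keys * (key_bytes + avg_value_bytes)


# Figures from the comment: 180 million keys, 32-char keys, ~100-char values.
total = raw_payload_bytes(180_000_000, 32, 100)
total_gb = total / 10**9  # decimal gigabytes
```

This comes to 23,760,000,000 bytes, or about 23.76 GB of raw data per copy, which matters when weighing the RAM-only entries in the Persistence column against the disk-backed ones.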
A "License" column would be helpful. Great resource, thanks!
Neptune
- Data Model: Bigtable
- Partitioning: Y
- Persistence: DFS(HDFS)
- Rebalancing: Y
- Replication: Y
- Impl: Java
- Client Protocol: Java, RESTful, Thrift
- Performance: http://www.jaso.co.kr/neptune/performance.html
hadoop support? (or any other kind of mapreduce)
I think you are missing a very important category of NoSQL database: the object database. Versant was the sponsor of the NoSQL meetup in Berlin. Pretty interesting stuff for NoSQL when dealing with complex models and requiring transactions and large-scale distribution. Handles C, C++, Java, C#, Python. The Berlin presentation showed multi-terabyte systems for folks like the European Space Agency, plus there was some stuff on high-throughput txns, running a couple of the GDS systems for the airlines.
Why is db4o not in the list? I'd like to see how it compares to the rest.
Add Midgard2 to the list:
http://www.midgard-project.org/midgard2/
Written in C, API for most languages, P2P replication and data sharing support, Object Oriented storage (defined by XML schemas)
MonetDB
- Data Model: column-store
- Partitioning: Y
- Persistence: disk
- Replication: available
- Client protocol: All
- Performance: e.g. http://www.cwi.nl/~mk/ontimeReport
I'm rather new to this technology, so this comment might not make much sense, but should Katta be on this list?