
Idiots Abound

They're everywhere! They're everywhere!

Jan 23, 2010

How'd that Mongo get so fast??????

So we are working on a new large site at work, and we are evaluating different databases. To do some comparisons, I downloaded a few million documents from the web (thanks Wikipedia ;)) and wrote a little testing suite. The basic takeaway is: holy crap, MongoDB is fast.

So I set up MongoDB, CouchDB, MySQL and Memcached on a local VM. I opened a few hundred thousand documents and inserted them into the various dbs. I indexed the same fields on MySQL and Mongo, but since Memcached and Couch don't have indexing, the documents were just inserted there. Next I proceeded to insert 100,000 documents. All I can say is wow. No, really, wow. Mongo is actually getting faster writes than Memcached. That's just plain amazing. Tomorrow I will run the select test against this data.

+-----------+---------+---------+-----------+-----------+------------+
|                           100000 Inserts                           |
+-----------+---------+---------+-----------+-----------+------------+
|           | Average | Median  | Deviation | Lap Total | Total      |
+-----------+---------+---------+-----------+-----------+------------+
| MongoDb   | 0.00011 | 7.0E-5  | 0.00012   | 10.70719  | 1712.42024 |
| MysqlDb   | 0.00083 | 0.00041 | 0.00792   | 82.8752   | 1712.42737 |
| CouchDb   | 0.0064  | 0.00462 | 0.05921   | 639.94759 | 1712.41974 |
| Memcached | 0.00279 | 1.0E-5  | 0.00996   | 278.66401 | 1712.39775 |
+-----------+---------+---------+-----------+-----------+------------+
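
The testing suite itself isn't shown in this post, so here is only a minimal sketch of the kind of per-insert timing loop that could produce columns like the ones above. It assumes times are in seconds and that "Lap Total" is the sum of the individual insert times (which matches Average x 100000 in the table). The `time_inserts` helper and the in-memory `store` are hypothetical stand-ins, not the real suite or a real database client.

```python
import statistics
import time


def time_inserts(insert, documents):
    """Call insert(doc) once per document and return timing stats in seconds."""
    laps = []
    for doc in documents:
        start = time.perf_counter()
        insert(doc)
        laps.append(time.perf_counter() - start)
    return {
        "average": statistics.mean(laps),
        "median": statistics.median(laps),
        "deviation": statistics.pstdev(laps),  # population standard deviation
        "lap_total": sum(laps),                # time spent inside insert() only
    }


if __name__ == "__main__":
    store = {}  # hypothetical stand-in for a database client

    def insert(doc):
        store[doc["id"]] = doc

    docs = [{"id": i, "body": "x" * 100} for i in range(1000)]
    for name, value in time_inserts(insert, docs).items():
        print(f"{name}: {value:.6f}")
```

Note that a loop like this only measures how long each call takes to return, which is exactly the point the comments below argue about.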

 

Jan 23, 2010
LDashAndroid said...
That is super fast, but MySQL isn't too bad either. I guess the document test will be the difference maker.
Jan 24, 2010
janl said...
The reason MongoDB is so fast is likely that it doesn't actually write anything to disk while you are counting.

See http://jan.io/Gy8t and http://jan.io/SEq7 on common issues with one-off benchmarks.

Jan 24, 2010
Ben Brown said...
Yes and no. Across a small set of writes, yes, but across 100000 writes there are actual records going to disk at the same time writes are coming in to the server. Mongo still does write and forget. At the same time, look at the Memcached numbers. It never writes to disk, but Mongo is still fast in comparison.

Also you may want to look at the read numbers. They are just as impressive.


---
My spelling is perfect, my thumbs are not. Sent from my blackberry.

Jan 24, 2010
Ben Brown said...
janl: I missed some of your comments the first time. I've read both those articles, but these weren't one-off stats. I'm doing another set of tests with threaded requests, and I'll add some details on the test background.
Jan 24, 2010
mikeal said...
MongoDB isn't ACID compliant so I'm not sure what their rules are for write conditions.

MySQL and CouchDB are both ACID compliant, which means that they won't allow a response to return with success unless the write has actually finished and is guaranteed to be in the DB and cannot be lost if the DB falls over.

One good way to speed up writes, if you don't care about this transaction behavior, is to bulk all the writes you do per second together. So even if you get 400 write requests you're only actually doing one write and you're just automatically returning a success condition to every request.
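
A minimal sketch of that batching idea, for illustration only. The `WriteBatcher` class and the `bulk_write` callback are hypothetical names, not an API of any of the databases discussed here, and this version flushes on batch size or an explicit `flush()` call rather than on a once-per-second timer.

```python
class WriteBatcher:
    """Collect individual writes and hand them to storage as one bulk write."""

    def __init__(self, bulk_write, max_batch=400):
        self.bulk_write = bulk_write  # callable that takes a list of documents
        self.max_batch = max_batch
        self.pending = []

    def write(self, doc):
        self.pending.append(doc)
        if len(self.pending) >= self.max_batch:
            self.flush()
        # Success is reported right away, possibly before the document
        # has reached storage. That is the durability trade-off.
        return True

    def flush(self):
        if self.pending:
            batch, self.pending = self.pending, []
            self.bulk_write(batch)


if __name__ == "__main__":
    bulk_calls = []
    batcher = WriteBatcher(bulk_calls.append, max_batch=400)
    for i in range(400):
        batcher.write({"n": i})
    print(len(bulk_calls), "bulk write for", len(bulk_calls[0]), "requests")
```

Anything still sitting in `pending` when the process dies is lost even though the caller was told it succeeded.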

I know that jchris was working on an optimization a long time ago that allowed you to turn on a particular feature in CouchDB (via manual configuration) that would drop the transaction behavior to provide faster concurrent write conditions, but I don't know if it ever made it in.

Jan 24, 2010
Ben Brown said...
MongoDB does support atomic updates at the document level. In my opinion, if I needed a fully transactional datastore, I would use an RDBMS. MongoDB even says that right on their site. Check out http://tinyurl.com/yfh8gog for more details on their atomic operations.

I'm not evaluating it for a transaction-based system. I'm looking to store a huge amount of analytical data as well as several documents and user details. For me the likelihood of concurrent edits is pretty low. I'm not sold on Mongo, but so far I'm impressed with its speed in both reads and writes.

Jan 26, 2010
mikeal said...
I'm saying this not for a transactional vs. non-transactional argument.

You're comparing a bunch of databases that return a response when the write actually happens, when the transaction is finished, to a database that returns true no matter what. You're basically just testing the socket response time.

You could throw billions of documents at MongoDB with the same response time and as long as you don't fill up all the resources on the box and crash the server it'll just spend the next hour actually doing the writes.

Jan 27, 2010
mdirolf said...
By default MongoDB forces a full fsync every minute. That is also configurable, and there is a command to manually force an fsync. So as long as you do at least one blocking op (query, "safe" insert, etc.) on the socket after doing a series of inserts, you are guaranteed that those writes will be on disk within a minute of when the inserts finished, not an hour.
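
As a rough illustration of the distinction being argued in this thread, here is a sketch using a plain file rather than MongoDB's storage engine or driver API. The hypothetical `append_record` helper either returns as soon as the data has been handed off (fast, but it may still be sitting in a buffer) or forces it to disk with `os.fsync` before returning.

```python
import os
import tempfile


def append_record(path, data, durable):
    """Append bytes to a file, optionally forcing them to disk first."""
    with open(path, "ab") as f:
        f.write(data)
        if durable:
            f.flush()             # push Python's buffer to the operating system
            os.fsync(f.fileno())  # ask the OS to flush its cache to the device


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    append_record(path, b"fast write\n", durable=False)
    append_record(path, b"synced write\n", durable=True)
    with open(path, "rb") as f:
        print(f.read().decode(), end="")
    os.remove(path)
```

Timing the two modes separately is one way to see how much of an insert benchmark is the write itself and how much is just the call returning.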
Jan 27, 2010
mikeal said...
I actually wasn't referring to how long it takes between the insert call and the fsync. I was saying that if you throw gigs of data at MongoDB concurrently, then when it does do the fsync it's going to take a while for the write to finish.
 