Did you ever hear a claim that sounded too bad to be true?
So this past Tuesday at Velocity 2010, Brett Piatt gave a workshop on the Cassandra database. I was seated in the audience and quite interested in everything he had to say about running Cassandra, given that I've been working on adding Cassandra and other scalable data stores to Ubuntu.
With a 50GB table
Actually, it looked exactly like that, because it was copied from this page, which, as of this point, is only available in its original form in Google's cache.
The page linked has *no* explanation of this table. It's basically just "OH DAAAAMN MySQL you got pwned". But seriously, WTF?
I asked Brett where those numbers came from, and whether we could run the tests ourselves to compare our write performance to Cassandra's. By "our write performance" I don't mean MySQL's, as that statement implies, but rather our own results against the Cassandra team's. Brett claimed ignorance and just referred to the URL of the architecture page.
OK, fair enough. I figured I should investigate more, so I asked on #cassandra on freenode. People pointed me to various other slide decks with the same table in them, but none with any explanation.
At some point, somebody rightfully recognized that having these numbers with no plausible explanation is ridiculous, and removed them from the site. Another person did, in fact, offer an explanation of where numbers like these might come from.
Basically, with a 50G table, assuming small records, you will have *a giant* B-Tree for the primary key of that table (assuming you have one), which will take 30+ disk seeks to update. That means that at 10ms (meaning, HORRIBLE) per seek, a write will take 300ms. This is contrasted with Cassandra, which can just append, requiring at most one seek.
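To make the arithmetic in that explanation explicit, here is a tiny sketch in Python. It only multiplies the figures quoted above (30 seeks, 10ms per seek, at most one seek for an append); it does not measure anything, and the function name `estimated_write_ms` is just something I made up for illustration.

```python
def estimated_write_ms(seeks, seek_ms):
    """Time spent purely on disk seeks for one write, in milliseconds."""
    return seeks * seek_ms


SEEK_MS = 10  # the pessimistic per-seek cost quoted in the explanation

# 30 seeks is the figure from the explanation, not something measured here.
btree_update = estimated_write_ms(30, SEEK_MS)
# "At most one seek" for an append-only write.
append_only = estimated_write_ms(1, SEEK_MS)

print(f"B-Tree update: {btree_update} ms")
print(f"Append:        {append_only} ms")
```

So the 300ms number falls straight out of those two assumptions. Change either one and the result changes with it.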
So anyway, Cassandra team, thanks for the explanation, and kudos for righting this problem. Unfortunately the misinformation tends to be viral, so I'm sure there are people out there who will forever believe that MySQL takes 300ms to update a 50G table.