Enterprise Semantics - The Cambridge Semantics Blog


WhySQL? Evernote and Boring Old Reliable Architecture

In my last blog post I argued that Semantic Web databases combine the flexibility inherent in NoSQL systems with the transactional semantics of relational database systems, and that this is a major reason for their growing adoption by enterprises.

Hours later, Evernote’s blog had a post called “WhySQL?” outlining why they didn’t go with a NoSQL system.

For those of you who don’t know, Evernote is a hot web startup that’s been growing like crazy, with 8-figure annual revenue in 2011.  However, unlike (seemingly) every other web startup in the world, they are using a boring old SQL relational database (in their case, MySQL).  Not only that, but they’re not even using cloud hosting!  What is this, 1999, you ask?

I thought it was a fascinating case of a high-traffic web company going with traditional technology for exactly the reason I called out in my last piece: relational databases are reliable and predictable.  You know what you get with their transactional guarantees.  In the article, Evernote’s Dave Engberg says (emphasis mine):

Each of these coarse-grained API calls is implemented through single SQL transaction, which ensures that a client can completely trust any reply given by the server. The ACID-compliant database ensures…[Atomicity, Consistency, Durability]…
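To make the quoted idea concrete, here is a minimal sketch of one coarse-grained call wrapped in a single SQL transaction. It is not Evernote's code: it uses Python's built-in sqlite3 module in place of MySQL so that it runs anywhere, and the table names, columns, and the create_note function are all hypothetical. The point is only that the note and its attachments are either all stored or all discarded.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE resources ("
        "id INTEGER PRIMARY KEY, "
        "note_id INTEGER NOT NULL REFERENCES notes(id), "
        "filename TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def create_note(conn, title, filenames):
    """Hypothetical coarse-grained API call: one note plus all of its resources.

    Using the connection as a context manager commits if the block finishes
    and rolls back if any statement raises, so a caller never sees a note
    that is missing some of its resources.
    """
    with conn:
        cur = conn.execute("INSERT INTO notes (title) VALUES (?)", (title,))
        note_id = cur.lastrowid
        for name in filenames:
            conn.execute(
                "INSERT INTO resources (note_id, filename) VALUES (?, ?)",
                (note_id, name),
            )
    return note_id
```

If one of the resource inserts fails (for example, a missing filename violates the NOT NULL constraint), the note row inserted a moment earlier is rolled back along with it. That is the "client can completely trust any reply" property in miniature.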

It gets even more interesting.  Evernote holds tons of data, much of it multimedia.  The data is not very structured.  It serves what I would guess is a large amount of web traffic.  Wouldn’t this be a perfect place to employ a hot, whizbang NoSQL database that offers greater performance?

In fact, Dave even outlines why NoSQL databases have such a great appeal:

The ACID benefits of a transactional database make it very hard to scale out a data set beyond the confines of a single server. Database clustering and multi-master replication are scary dark arts, and key-value data stores provide a much simpler approach to scale a single storage pool out across commodity boxes.

But right after that, he says Evernote can avoid the problem altogether by using a clever partitioning scheme (emphasis mine):

Fortunately, this is a problem that Evernote doesn’t currently need to solve. Even though we have nearly a billion Notes and almost 2 billion Resource files within our servers, these aren’t actually a single big data set.  They’re cleanly partitioned into 20 million separate data sets, one per user.

That is, Dave and Evernote would rather manage 20 million separate data sets than go with what he admits is a “much simpler approach to scale” (the NoSQL way), because SQL systems have a stronger transactional guarantee.
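Here is a rough sketch of what per-user partitioning can look like. The quoted passage does not describe how Evernote actually assigns users to servers, so the modulo rule, the ShardRouter class, and the schema below are assumptions made purely for illustration, again with in-memory SQLite databases standing in for separate database servers.

```python
import sqlite3


class ShardRouter:
    """Hypothetical router that keeps each user's notes on exactly one shard."""

    def __init__(self, shard_count):
        # Each in-memory SQLite database stands in for a separate server.
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for conn in self.shards:
            conn.execute(
                "CREATE TABLE notes (user_id INTEGER NOT NULL, title TEXT NOT NULL)"
            )
            conn.commit()

    def shard_for(self, user_id):
        # Assumed placement rule: user id modulo the number of shards.
        return self.shards[user_id % len(self.shards)]

    def add_note(self, user_id, title):
        conn = self.shard_for(user_id)
        with conn:  # an ordinary single-server transaction
            conn.execute(
                "INSERT INTO notes (user_id, title) VALUES (?, ?)",
                (user_id, title),
            )

    def notes_for(self, user_id):
        conn = self.shard_for(user_id)
        rows = conn.execute(
            "SELECT title FROM notes WHERE user_id = ? ORDER BY rowid",
            (user_id,),
        )
        return [row[0] for row in rows]
```

Because every query for a given user touches only that user's shard, no transaction ever has to span two servers. That is why the ACID guarantees survive without the clustering and multi-master replication the quote calls "scary dark arts".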

This is exactly what I was getting at in the previous post.  The advantages of NoSQL systems to date are wonderful, but the lack of strong transactional guarantees makes them a poor fit for enterprises that need to store mission-critical information.

Full disclosure: I’m a huge fan of Evernote, and am an Evernote Premium subscriber.

Comments

Dave
Nice post, Rob, thanks for the feedback.
One clarification ... we're using MySQL, not PostgreSQL. This is really just due to the person who happened to set up our database first, since PostgreSQL is also a great open source database which would probably have worked just as well for us.

Dave Engberg
Evernote
Posted on 2/28/12 11:42 AM.
Thanks for the catch! Fixed.

cheers!
-Rob
Posted on 2/28/12 12:20 PM in reply to Dave.
