RavenDB vs MSSQL

I just stumbled upon this question on stackoverlow. I began typing my answer and when I was finished I realized that it was pretty long and it might be a good idea to do this in a blog post. So here is what I answer when I am asked whether to choose RavenDB or MSSQL as the storage technology for your greenfield application.

Disclaimer: before accusing me of being biased towards RavenDB, please note that we’ve just started two projects where we had chosen SQL server as our database. Obviously I love RavenDB, but it’s definitely not a one-for-all thing.

Performance

RavenDB is said to be very fast and you often hear that it’s faster than sql server. That’s not true. It is not RavenDB itself that is faster as a database, instead it’s the application built on top of RavenDB that is faster most of the times. There are a few reasons for that:

  • Reduced amount of database requests
    RavenDB enables you to store whole object graphs and collections inside a single document. That’s why you generally need much fewer requests when using RavenDB compared to a relational database.
  • Less joins when querying
    Although RavenDB has some concepts to support joining between documents, a document database embraces the use of denormalization. So instead of picking all the necessary information for one query together from different documents, you probably already have everything inside one single document. RavenDB is very fast in loading single documents so that’s the reason for the performance boost here.
  • Indexes: precomputed aggregations and calculations
    Map/reduce indexes can be used to do intensive calculation in the background (and also distributed) and store the calculated results on the disk, so that querying those is cheap because all the calculations have already been done. The same goes for normal map indexes as well. RavenDB can do calculations and store the results inside indexes for fast querying. You should note however, that indexes have BASE consistency, that means, they can be stale and need to catch up latest write operation on the database. While that means that in busy systems you don’t have a guarantee that you get most up-to-date results from an index, this is also a big selling point on the other hand, because it means, that write operations are not affected by the number of indexes. They simple complete and RavenDB updates the indexes in the background without slowing down insert/update perfomance.

Simplicity

Although RavenDB is putting a lot of efforts into making your experience as easy as possible, there’s still a steep learning curve related to the mindshift in data modeling. You will need to think about your data in a different way than you probably had been doing before when designing relational databases. This is something you definitely want to think about before choosing RavenDB as the storage technology for a new application with a tight development deadline.

However, learning document modeling and working with RavenDB is a rewarding experience because you get immediate feedback and see how much easier and faster you can be when you’re finished.

Although we have EF4.1 code-first, FNH with automappings and even micro-orms like Massive or Petapoco that do a very good job in simplifiying our development efforts when working with sql server, it’s still far easier to work with RavenDB (please note, this is my personal opinion here). You just don’t have to think about mappings and normalization. Just put your objects in and you’re done.

Tooling and ecosystem

This is something you will definitely miss if you’re coming from sql server and are used to tons of third-party software that helps you work with and manage your sql servers. RavenDB has made a few big improvements in this regard recently and I expect a massive improvement with the 1.2 release, however it’s still far beyond what you get when using sql server. If you don’t feel comfortable with configuration files and are more the wizard-magic guy then you probably don’t want to use RavenDB (right now).

Another pain point in working with RavenDB (and NoSQL databases in general) is that it’s nearly impossible to get good ad-hoc reporting support. RavenDB has a bundle that let’s you replicate data out to sql server for reporting purposes but obviously that’s much more friction than just querying on the datasets in the first place.

Which to choose?

The question which database to choose obviously depend on your concrete scenario, the skills of your team, your enviroment (existing licenses), etc. but here is what I think could help you:

We choose RavenDB when…

  • we can think of our data in terms of aggregates with mostly independent chunks of data (e.g. customer, order, product, etc.)
  • we need to have good performance on aggregation and calculation queries
  • we need to have complex searching (full-text, facets, etc.)
  • we need to be able to scale
  • we need high availability at low costs

We choose SQL Server when…

  • when we need to support user generated reports and highly dynamical data analysis
  • we have to deal with mostly relation data (e.g. accounting, statistics)
  • we want to use Windows Azure
  • our customer definitely wants us to choose sql server without knowing better

I hope these arguments help you choose one or the other storage technology for your application…

But wait! You don’t have to choose one or the other database. Actually you can mix them and pick the advantages of both databases. So here is the deal: if you feel your application is complex enough or it would benefit from one or the other database, just use them both.

,

10 Responses to RavenDB vs MSSQL

  1. Stuart Clark May 9, 2012 at 11:06 pm #

    Great Summary!

  2. Carlos May 13, 2012 at 11:39 pm #

    Can’t we run RavenDB in Azure?

    • Daniel Lang May 13, 2012 at 11:48 pm #

      Oh, we can run RavenDB in Azure. You can use a worker role for that and have a persistent drive for the Data directory. You could also run it embedded in a single web role and also have the Data directory on a persistent drive. However, in my opinion both solutions suck. They miss the whole point about Azure. If I’m using Windows Azure, then I want to either elasticity in terms of auto scaling and if this doesn’t work well (always hard for databases) then I want to have an easy (!) way to add and remove instances from the database cluster. I’m not aware of a technique that lets me do this with RavenDB in Windows Azure (right now). Having none of these core advantages, you can run your RavenDB instances much cheaper on regular virtual machines or dedicated servers.

  3. john doe May 14, 2012 at 4:29 am #

    Hi,

    Thanks for a good write up.
    I am coming from SQL-heavy background and had always been wondering about NoSQL.
    I am drawn to your comment that ‘customer, order, product’ are independent and thus can be modelled in NoSQL ?

    Could you please elaborate that ?
    I mean, how do you remodel customer, order and product to fit (pigeon-hole?) NoSQL ?
    I’ve always thought those 3 would be relational:
    1 customer have many orders
    1 order have many products

    … and to add a bit more:
    many products have many suppliers

    How would you remodel this ?
    I don’t mind a JSON representation of how this can be achieved.

    Much appreciated for your time.

    • Daniel Lang May 15, 2012 at 1:00 pm #

      The example of Customer, Order and Product works just perfectly in a document database, especially in RavenDB.
      They are independent in that they can exist on their own. Of course, there are references between them, but references and relations is nothing that stops you from using RavenDB. Please refer to my other posts about RavenDB to see which options you have to work with relations in RavenDB.

      Just take a moment and think about your example. Of course, you have a relation from Order to Customer but you _do_ want to have that relation denormalized. You don’t want the address of an order to change after the customer has resettled, do you? Also, if the name or the price of a product changes after an order has been created, you don’t want to reflect these changes on the order, right? These are actually very good examples of when I’d prefer a document database.

      • Janus007 June 3, 2012 at 12:05 am #

        Daniel and John, nobody would model such a db anyway and if they do it anyway they lack some basic skills in db modelling.

        In the Invoice/ ordering-model a Customer has an Order and an Order has some Order Lines, when the Customer order/ save/ pay the data will come from the Customer and Product – db. The Product-data will be copied to the Order-system and not referenced to the Product-system :)

        John, if that should be modeled in NoSql it would look like:

        { “Order” : { “Customer” : “John Doe”,
        “OrderNumber” : “123″,
        “Products” : [ { "Name" : "Ajax",
        "Quantity" : "5",
        "SubTotal" : "100"
        },
        { "Discount" : "10%",
        "Name" : "Bar",
        "Quantity" : "15",
        "SubTotal" : "1000"
        },
        { "Name" : "Ajax",
        "Quantity" : "5",
        "SubTotal" : "100"
        }
        ],
        “TotalPrice” : “1850″
        } }

        Pay attention to the extra Discount :)

        The biggest difference between NoSql and traditional Sql is that you rarely need to join the data.

        NB: Also pay attention to the school example of creating a Customer/ Order – model, in the real world you would probably choose to denormalize Order and OrderLines :)

  4. dario-g November 7, 2012 at 11:36 am #

    So…
    If I use MSSQL like document database (means store whole aggregate root as serialized json/bson in one row) and do aggregates aka raven indexes by my self that MSSQL will be faster… much faster, right? :)

    • Daniel Lang November 7, 2012 at 11:56 am #

      Yes, sure. You would then effectively use MSSQL as a key/value store. The difference between a key/value store (like your example) and a document database like RavenDB is that the database understands the format of the documents and can do things for you (like building and maintaining indexes, partial document updates, etc.)

Trackbacks/Pingbacks

  1. RavenDB vs MSSQL: Which to Choose? | Go Peta Scale - May 10, 2012

    [...] Article: RavenDB vs MSSQL: Which to Choose?. RavenDB vs MSSQL: Which to Choose?: [...]

  2. NHibernate: ORM for .Net « Nathan - May 12, 2012

    [...] NHibernate in a Nutshell fluent-nhibernate / src / Examples.FirstProject / Program.cs Auto mapping RavenDB vs MSSQL Compartir esto:FacebookTwitterLinkedInImprimirMásStumbleUponRedditDiggCorreo electrónicoMe [...]

Leave a Reply