Nov 08 2009

Predictions on the future of NoSQL

Category: Ideas, UncategorizedAleksander Kmetec @ 12:02 pm

These days new NoSQL databases are springing up faster than URL shorteners. Even incomplete lists are likely to mention over 50 of them, most of them never heard of before. But even though they’re all being lumped together under the same name, they are so different from each other that there’s still a dilemma how to come up with a name which would describe them for what they are instead of describing them for what they’re not.

Nobody knows for sure what the future of NoSQL will be like, which way the development is most likely to head and who the winners will be, but we can still try and make some predictions. Here are mine:

Several subgroups will emerge
This is not as much a prediction as much as it is an observation of already visible patterns. At least two main groups will emerge from the NoSQL movement: networked data structure servers (key/value stores, queues, …) and databases for working with structured data.

The data structure branch will remain very diverse
Typical software in this category is rather minimalistic, both in terms of functionality and in terms of code size. Thanks to this the threshold for entry of new players is rather low; also low is the price paid by users for switching between competing implementations.
Various players will likely specialize in some technological niche and they will continue to be used mainly as means of speeding up applications and not as fundamental building blocks.

Relational databases will fight back
Some databases already have support for storing, manipulating and indexing structured data in the form of XML. I have a feeling that JSON support can’t be far away. For most users this will be enough to stay with the established players in the database field instead of choosing a strictly document oriented database.

Document oriented databases will morph into graph databases
Implementing cross document referencing will take them half way there. Pressure from the relational databases, as described above, will push them the rest of the way.

SPARQL will become the query language of choice
SPARQL is a query language for RDF1; in other words: a language for querying graphs. It supports querying multiple data sources at the same time (federated queries) and there are projects underway to make it work with Hadoop clusters.
I’m not saying that alternative methods of querying will disappear completely! I’m just trying to say that the key players most likely to be used by the average developer (the next generation graph database equivalents of MySQL and PostgreSQL and similar) will end up standardizing on SPARQL instead of inventing yet another language.

Software will gain weight
Reading about NoSQL databases gives me a feeling of deja-vu. Most of it reads almost exactly like articles about MySQL from the beginning of this decade: “We’re better than competition because we don’t have transactions/triggers/datatype checking/guaranteed consistency/fulltext search/…”. MySQL now has all of those features and NoSQL databases will follow in the same path. Most users will start hitting walls due to lack of features, not due to performance issues and when that happens having features will become more important than being lean.

The cycle will repeat itself
After a decade or so a new class of players claiming that their lack of features is their strength will emerge once again

  1. RDF is an extremely simple format wrapped in a metric buttload of mystery in misunderstanding. But more about that some other time.

Tags: , , , , ,