Unlearning decades of design principles for enterprise storage

Well, what else can I say? The title says it ALL! And for me, the process of unlearning still continues. Long-held assumptions about what storage capabilities are needed to support applications are increasingly less valid. “What happened?” you might ask. Unless you have been living under a rock, you can’t miss the fact that the world of enterprise infrastructure is undergoing a tectonic shift before our eyes. The reasons go beyond the obvious ones, such as media changes (e.g., NAND flash) and the move to the cloud (e.g., Amazon, Azure, Google).

The advent of cloud-native application architecture is arguably responsible for much of this change. While I won’t repeat the many good definitions of cloud-native applications, suffice it to say the software components shown here are a big part of it!

As you’ll notice, cloud-native applications are NOT made up entirely of so-called “stateless” services; they also include a rich set of “stateful” SQL databases and NoSQL stores, as well as message queues. A good name for the stateful components, in my mind, is the “data tier.” It is when you look at data tier architecture that you find a common design theme, such as this example from the MongoDB Architecture Guide:

This MongoDB example, with minor variations, reflects an architectural theme common to nearly every variant of the data tier. In a nutshell: shards for write scalability and replica sets for read scalability. The replica sets also serve as the nodes for HA, DR, and other storage services (e.g., backup and analytics).
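To make the shard-plus-replica-set pattern concrete, here is a minimal Python sketch. It is a toy model, not how MongoDB (or any real data tier) is implemented: the class names `ReplicaSet` and `ShardedStore`, the hash-based routing, the synchronous copy to secondaries, and the round-robin reads are all assumptions I made up for illustration.

```python
import hashlib
from itertools import cycle


class ReplicaSet:
    """One shard: a primary that takes writes, plus secondaries that copy it."""

    def __init__(self, name, secondaries=2):
        self.name = name
        self.primary = {}
        self.secondaries = [{} for _ in range(secondaries)]
        # Reads rotate across every member, so read capacity grows with replicas.
        self._readers = cycle([self.primary] + self.secondaries)

    def write(self, key, value):
        self.primary[key] = value
        # Assumption: replication is synchronous here to keep the toy simple.
        for node in self.secondaries:
            node[key] = value

    def read(self, key):
        return next(self._readers).get(key)


class ShardedStore:
    """Routes each key to exactly one shard, so writes spread across shards."""

    def __init__(self, shard_count=3):
        self.shards = [ReplicaSet(f"shard{i}") for i in range(shard_count)]

    def shard_for(self, key):
        # Hypothetical routing rule: hash the key, then take it modulo the shard count.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def write(self, key, value):
        self.shard_for(key).write(key, value)

    def read(self, key):
        return self.shard_for(key).read(key)


store = ShardedStore()
for user in ("alice", "bob", "carol", "dave"):
    store.write(user, {"name": user})
```

Notice that each key lives on exactly one shard (adding shards adds write capacity), while every member of that shard's replica set holds a full copy (adding replicas adds read capacity and gives you spare nodes for HA, DR, or backup). None of this asks the underlying storage for anything beyond holding bytes.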


A question of unlearning: what are the requirements given to the storage sub-system from the data tier?

Looking at the MongoDB figures above, it is clear that storage services like high availability, replication, compression, and encryption are provided by MongoDB itself.

Storage unlearning #1: table-stakes storage features (RAID, thin provisioning, snapshots, replication, compression/deduplication, etc. – AKA an enterprise storage array) are now provided by the data tier.

So what are the asks from the data tier for the storage sub-system? We’ll explore some of them in my next blog post!