Fast, predictable data tier for developers

As pointed out in my previous post, table-stakes enterprise storage features such as replication, compression, and de-duplication are now provided by the cloud native data tier. That was my first unlearning. Now, I’ll explore some of the asks the data tier makes of the storage layer. The one I want to cover in this post is fast and predictable I/O access. What does this mean?

In talking to quite a few customers and application developers, I have found that user attention span is a key priority when designing a user-facing app. How many customers, queries, logins, search requests, ads served, and so on must the app sustain within that span? Let’s assume an attention span of two seconds:

90th percentile -> 95th percentile -> 99th percentile -> 99.99th percentile

Basically, the buckets defined above mean that 90% of requests are handled in under two seconds, then 95% of requests, then 99%, and finally 99.99%. If app responsiveness relative to attention span equates to profitability, each bucket can represent an order-of-magnitude higher profitability.
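To make the buckets concrete, here is a minimal Python sketch that checks which percentile buckets fit inside the two-second budget. The sample latencies, the function names, and the nearest-rank percentile definition are my own assumptions for illustration, not measurements from a real app.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def buckets_within_budget(latencies_s, budget_s=2.0, pcts=(90, 95, 99, 99.99)):
    """For each percentile bucket, report whether it meets the budget."""
    return {p: percentile(latencies_s, p) <= budget_s for p in pcts}

# Hypothetical sample: 100 requests, 96 served in 0.2 s and 4 served in 3 s.
latencies = [0.2] * 96 + [3.0] * 4
result = buckets_within_budget(latencies)

for pct, ok in result.items():
    print(f"{pct}th percentile within 2 s: {ok}")
```

With this made-up sample the app meets the 90th and 95th percentile buckets but misses the 99th and 99.99th, even though only four requests out of a hundred were slow.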

So how can a storage subsystem help achieve app responsiveness? And what about under a dynamic workload?

Storage quality of service is becoming increasingly important, especially predictable latencies. Averages, and even medians (the 50th percentile), can be deceiving for these metrics!

Storage unlearning #2: quoting storage latency in terms of averages is a NON-starter.
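A small Python sketch shows how two workloads can share the same average latency and still feel completely different at the tail. The numbers are invented for illustration, and the nearest-rank helper is just one common way to define a percentile.

```python
import math
import statistics

def nearest_rank(samples, pct):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds, 100 I/Os each.
steady = [10.0] * 100               # every I/O takes 10 ms
spiky = [5.0] * 95 + [105.0] * 5    # mostly faster, but 5 I/Os stall

steady_mean = statistics.mean(steady)
spiky_mean = statistics.mean(spiky)
steady_p99 = nearest_rank(steady, 99)
spiky_p99 = nearest_rank(spiky, 99)

print(f"steady: mean={steady_mean} ms, p99={steady_p99} ms")
print(f"spiky:  mean={spiky_mean} ms, p99={spiky_p99} ms")
```

Both workloads average 10 ms, and the spiky one even has the better median (5 ms), yet its 99th percentile is 105 ms. An app whose requests each fan out into many I/Os will hit that tail far more often than the average suggests.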

Don’t get me wrong; studying average latencies can be very useful to see what is happening with the workload. As you exceed the limits of a resource such as storage, average latencies invariably increase. An increasing average, therefore, is a leading indicator of reaching a resource limit.

A nice blog from Nimble Storage describes a different aspect of I/O, namely average I/O size. The basic point is that quoting average I/O size can be deceiving: I/O size for most workloads studied by Nimble Insights is actually bi-modal. The lesson is that if an application issues one 4k I/O and one 128k I/O, the arithmetic mean works out to 66k, yet the application never issues an I/O anywhere near that size!
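The 4k and 128k example can be worked through in a few lines of Python. This is only a sketch with a made-up 50/50 mix of the two sizes, not Nimble's data, and the variable names are mine.

```python
from collections import Counter
import statistics

# Hypothetical bi-modal workload: half the I/Os are 4k, half are 128k.
io_sizes_kb = [4] * 50 + [128] * 50

mean_kb = statistics.mean(io_sizes_kb)   # arithmetic mean of the sizes
histogram = Counter(io_sizes_kb)         # how many I/Os of each size

# Distance from the mean to the closest size actually issued.
gap_kb = min(abs(size - mean_kb) for size in histogram)

print(f"mean I/O size: {mean_kb}k")
print(f"sizes actually issued: {sorted(histogram)}")
print(f"I/Os issued at the mean size: {histogram[mean_kb]}")
print(f"closest real size is {gap_kb}k away from the mean")
```

The mean comes out at 66k, but the histogram has nothing there: the closest real I/O is 62k away. Sizing or tuning a storage system for the "average" I/O would optimize for a request the application never sends.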

In later posts I’ll address the causes of non-deterministic storage performance.