How to achieve fast query speed with no DevOps maintenance
The transition from on-prem to cloud has picked up speed. Just five years ago, companies resisted moving to the cloud because of data security concerns. The more common question now is, “How can I move to the cloud as quickly and cost-efficiently as possible?” Enhanced security across cloud infrastructure and applications has eased these concerns and opened up the possibility for most companies to move their processing to the cloud.
Data analytics is an area that is ripe for a transition to the cloud. The complexity of managing data, storage, memory, and compute in a manner that is both performant and cost effective is an almost impossible hurdle for many companies. But what does it mean to move analytics into the cloud?
From IaaS to SaaS
If you’ve been in the industry even just a few years, you remember the “server room”: that freezing cold room with racks of servers running the crown jewels of the company and a team of IT folks dedicated to keeping it running 24/7. With IaaS (Infrastructure as a Service), that server room now lives in the cloud. But does IaaS help with your data analytics headache? Not really. Installing, configuring, and scaling analytics software across multiple compute servers, disks, and memory is a sizable “ops” job. That complexity has driven the explosion of SaaS data platforms such as Snowflake, Google BigQuery, and Amazon Redshift. These data platforms provide the software as a service, which means that the customer no longer needs to install, configure, or scale the software.
If one of these data platforms provides the performance that you need, it’s a clear win. While the cost is significant, these SaaS data platforms bring your operational complexity down to the “NoOps” level. Your team can focus on its core business analytics needs and avoid spending precious resources on software and infrastructure management.
From analyzing data to sub-second queries
However, business analytics does not stand still. While business insights with 24- to 72-hour latencies may have been sufficient in the past, insights from real-time data will quickly become a key differentiator when harnessing value from data. In the future, the company that monitors its market in real time and can change course in an instant will have a clear edge over competitors that generate reports from historical data.
While Snowflake, BigQuery, and Amazon Redshift (“the Big Three”) provide excellent platforms for analyzing historical data, the bleeding edge of analytics requires operational analytics platforms such as Druid, Pinot, and ClickHouse. Each of these platforms can be easily downloaded and installed. However, that’s just the tip of the iceberg. The trick is in architecting a solution that handles the size, complexity, and bursty nature of your data. Add the performance requirements of a broad range of users across multiple data sets, and suddenly, although you have an operational data platform at your disposal, you also have an analytics architecture headache that can easily drag you down.
We’ve seen the power of SaaS via the success of Snowflake, BigQuery, and Amazon Redshift. These services took their customers from HighOps to NoOps. Companies that built their own analytics platforms and remained in the HighOps world are now struggling to keep up with competitors who enjoy the simplicity and ease of these NoOps SaaS analytics platforms.
Is there any reason to think that operational analytics will be any different? As companies move into the realm of operational intelligence, it is almost certain that the same lesson will repeat itself. Those companies that choose a fully managed solution with no operational overhead will be more agile, cost effective, and successful at pushing the limits of operational analytics than their competitors who spend time and resources architecting a customized data service.
The fast path to operational intelligence without the maintenance
For companies whose sub-second query needs the Big Three can’t meet, Rill Data provides a complementary SaaS data platform built for operational data that requires real-time ingestion and sub-second queries.
Rill utilizes Apache Druid to provide fast performance at low cost. We wrap Druid in enterprise-level governance and security and provide elastically available, high-performance access to your “hot” data, the data you need to reach immediately and with sub-second latency. Rill supports auto scaling and hibernation to keep query latency sub-second while holding costs down. You load your data through a variety of standard connectors and visualize it through standard tools such as Looker, Tableau, and Superset, or, if you have special needs, via SQL queries or REST APIs.
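To make the “SQL queries or REST APIs” option a little more concrete, here is a minimal sketch of what querying a Druid-backed service over HTTP might look like. It assumes the standard Apache Druid SQL endpoint (`/druid/v2/sql`); the host, port, and the `events` table are hypothetical placeholders, and a managed service such as Rill may expose a different URL and require authentication that is not shown here.

```python
import json
import urllib.request

# Assumption: a Druid router reachable at this address. Replace with your own endpoint.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"

# Hypothetical query: events per minute over the last hour of "hot" data.
# The "events" datasource is a placeholder name.
EVENTS_PER_MINUTE = """
SELECT TIME_FLOOR(__time, 'PT1M') AS "minute", COUNT(*) AS "events"
FROM "events"
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR
GROUP BY 1
ORDER BY 1
"""


def build_sql_request(sql, url=DRUID_SQL_URL):
    """Build (but do not send) a POST request for Druid's SQL API."""
    payload = {"query": sql, "resultFormat": "object"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRUID_SQL_URL, timeout=5.0):
    """Send the query and return the rows as a list of dicts."""
    with urllib.request.urlopen(build_sql_request(sql, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `run_query(EVENTS_PER_MINUTE)` against a reachable endpoint would return one row per minute. Building the request separately from sending it keeps the sketch easy to inspect without a live cluster.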
Druid is complex. Architecting a data service on top of it that meets your business requirements takes a great deal of knowledge and a team of data architects. Is your core value the architecture of your data platform? Chances are, your resources are better spent finding data insights than managing your data platform. At Rill, our goal is to provide you with a NoOps solution for sub-second queries on terabyte-scale data. It’s secure. It autoscales. And we’ve built it using the simple and proven paradigm of SaaS as a data platform, which means no management overhead for you or your team.