The Guide to Apache Druid Architectures
Michael Driscoll
October 4, 2021

As we build out the Rill Data team, we often encounter folks who are new to Apache Druid and looking for ways to get up to speed quickly. For this reason we maintain this “Guide to Apache Druid.” It’s meant to be a balanced list of articles, customer stories, and architectural diagrams that best helped us get up to speed answering questions like:

  • Who uses Apache Druid?
  • What are they using it for?
  • How does it fit in with the other pieces of the Modern Data Stack?

This is a living document, so we’ll update this page when particularly relevant pieces come up.

The Apache Druid site

The open source Apache Druid project site is always a good place to start learning about Druid, in particular its exhaustive list of companies using Druid.

Customer Stories

It’s always best to hear directly from users exactly how they’re using Druid inside their company. While the community-maintained list of companies above is fairly exhaustive, these stories below are some of our favorites (in no particular order):

Druid Architecture

What is the internal architecture of Apache Druid and how is it different from other OLAP databases?

Reference Architecture Diagrams

OLAP databases like Apache Druid are enjoying a resurgence in popularity amidst increasingly complex data operations. But as the “Modern Data Stack” matures, it’s always fascinating to see the variety of ways companies arrange the pieces of their data stack. Each of the videos and articles below includes an architectural diagram, redrawn here to facilitate comparison and linked to the original piece.

Pinterest, 2020

Archmage, Pinterest’s Real-time Analytics Platform on Druid

Pinterest Apache Druid Architecture

Netflix, 2020

How Netflix uses Druid for Real-time Insights to Ensure a High-Quality Experience

Netflix Apache Druid Architecture

GumGum, 2020

Optimized Real-time Analytics using Spark Streaming and Apache Druid

GumGum Apache Druid Architecture

Salesforce, 2020

Delivering High-Quality Insights Interactively Using Apache Druid at Salesforce

Salesforce Apache Druid Architecture

Reddit, 2021

Scaling Reporting at Reddit

Reddit Apache Druid Architecture

eBay, 2019

Monitoring at eBay with Druid

eBay Apache Druid Architecture

Naver, 2018

Web analytics at scale with Druid

Naver Apache Druid Architecture

Lyft, 2018

Streaming SQL and Druid

Lyft Apache Druid Architecture

Airbnb, 2017

How Superset and Druid Power Real-Time Analytics at AirBnB

Batch Processing

Airbnb batch Apache Druid Architecture

Stream Processing

Airbnb stream Apache Druid Architecture

Related Technologies

Apache Superset with Maxime Beauchemin (formerly of Lyft, Airbnb, and Facebook), March 2019. Search for ‘Druid’ in the transcript to get a sense of why Maxime built Superset to run on Druid.

Did we miss something great?

We're always on the lookout for smart write-ups. Let us know if you found something that we overlooked.
