Our free Data Engineering guides bring together our thinking and advice on how to deploy modern cloud-based data and analytics stacks.

Databricks
Databricks is the leading big data analytics environment. Based on Apache Spark, Databricks allows data engineers, data scientists and analysts to collaborate on a single platform using their preferred languages and toolkits.
What Are Spark & Databricks And How Can They Benefit Your Busin…
Kafka
Kafka is the industry-standard platform for real-time exchange of streaming data. Rather than moving data between endpoints in large, infrequent batches, Kafka lets businesses exchange it continuously and operate closer to real time.
What Is Apache Kafka and What Can It Do For Your Business?
Over time, the typical business acquires mo…
DBT
DBT is an open-source tool for managing data transformations directly within data warehouses. DBT brings software engineering practices to Extract, Transform and Load (ETL) work, making it faster, more reliable and more scalable.
How DBT Helps Data Engineers Work…
Snowflake
Snowflake is the leading cloud data platform.
Apache Druid
Apache Druid is a powerful OLAP database which combines the strengths of NoSQL, time series and other database types. It aims to provide interactive, sub-second query performance on extremely large datasets.
What Is Apache Druid And What Can It Do For Your Business
There are many databases in the world, all offer…
Streaming Data
Streaming data is about processing and responding to incoming data as it arrives, in real time. Whereas most businesses run on delayed batch data, companies that implement streaming technology can benefit from improved customer experience.
Streaming Technology - Sourcing Events
In order to move towa…