In many situations, the earlier we respond to incoming data, the better. This might be a genuinely real-time scenario such as a self-driving car, a trading system or a fraud check, or a more vanilla business scenario such as a product going out of stock, which we want to inform our users about as soon as possible.
The value of data is said to decay over time. The sooner a business can respond, the sooner it can use the data to improve the customer experience, operate more efficiently or capture revenue. The longer the delay after capturing the data, the faster these opportunities fall away.
For this reason, many companies are looking to process their data much faster, if not in real time, as part of their digital transformation ambitions.
This can, however, be technically challenging with traditional approaches to data engineering and business intelligence, which are built around periodic delivery of batch data and relatively simple slice-and-dice analysis once it arrives.
The first thing companies need to do is refresh and re-engineer their data platforms to deliver data faster. This could involve something simple, like more frequent extract, transform and load (ETL) jobs from source systems, or something more complex, such as moving to a streaming architecture. This data would then commonly be ingested into storage such as a data warehouse or data lake, and made visible through reports and dashboards earlier than has historically been possible.
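The difference between the two approaches is essentially when processing happens: a batch job waits for a periodic load, while a streaming consumer reacts to each event as it arrives. A minimal sketch in plain Python (a generator stands in for a message broker feed; the event shape and the out-of-stock rule are invented for illustration):

```python
from typing import Iterator

def event_stream() -> Iterator[dict]:
    """Simulated source events; stands in for a message broker feed."""
    for sku, stock in [("A1", 5), ("B2", 0), ("C3", 12)]:
        yield {"sku": sku, "stock": stock}

def handle(event: dict, alerts: list) -> None:
    """React to each event on arrival, rather than waiting for a batch."""
    if event["stock"] == 0:
        alerts.append(f"{event['sku']} out of stock")

alerts: list = []
for event in event_stream():
    handle(event, alerts)  # per-event processing, no batch window

print(alerts)  # ['B2 out of stock']
```

In a real streaming architecture the loop would be a consumer subscribed to a broker topic, but the shape of the logic, processing each event the moment it lands, is the same.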
For many companies and business scenarios, slightly faster delivery of data into the hands of business users might be enough. If you have a few tens of thousands of rows in a relational database, putting a dashboard over the top and getting a relatively real-time view of the business is feasible. This is effectively the minimum maturity level for real-time analytics, though.
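At this maturity level, the "dashboard over the top" is just a frequently refreshed aggregate query. A minimal sketch using an in-memory SQLite database (the table and column names are illustrative, not from any real system):

```python
import sqlite3

# A small relational table of the kind a dashboard might sit over.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EU", 120.0), ("EU", 80.0), ("US", 200.0)],
)

# The slice-and-dice aggregate a dashboard would re-run on each refresh.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('EU', 200.0), ('US', 200.0)]
```

At tens of thousands of rows this query is cheap enough to re-run every few seconds, which is why this pattern works as a first step; it stops scaling once volumes or query complexity grow.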
To deliver on many business outcomes, real-time analytics becomes harder, and the associated technology becomes more complex. For instance, we need to deliver streaming ingestion, processing of data both in flight and at rest, and the ability to continually ask complex questions of it.
This adds up to a much more complex picture than simply doing faster business intelligence.
As with many techniques, the cloud makes this problem more tractable. There are a number of higher-level technologies we can use for streaming data and then processing it in flight and at rest. We can access the compute horsepower we need to continually ask complex questions, and can scale up automatically in response to changing patterns in the data. The pricing model is consumption-based rather than requiring us to provision for a worst-case scenario, making the cost profile better too.
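"Processing data in flight" typically means continuous aggregation over windows of events rather than one-off queries. A minimal sketch of a sliding-window average in plain Python (no streaming framework; window size and values are invented):

```python
from collections import deque

def sliding_window_avg(values, window=3):
    """Rolling average over a fixed-size window: the kind of in-flight
    aggregation a streaming engine evaluates continuously per event."""
    buf = deque(maxlen=window)  # oldest value drops out automatically
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out

print(sliding_window_avg([10, 20, 30, 40]))  # [10.0, 15.0, 20.0, 30.0]
```

Managed streaming services run this sort of windowed computation for you, handling the scaling and state management that make it hard to operate at volume yourself.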
We believe that intelligent real-time analytics is going to be a significant differentiator for businesses going forward. We are already seeing many businesses implement this capability in cloud environments, and we expect it to be a major theme for data teams over the coming decade.