Data Analytics Stack

When choosing a data analytics stack, we need to consider just two things: it should be simple, and it should be scalable.

An analytics system has to do three things:

  1. Load data into a central repository
  2. Transform the data into a usable format
  3. Serve the data for business users
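The three steps above can be sketched in a few lines of Python. This is only an illustration, not a recommended implementation: it uses an in-memory SQLite database as a stand-in for the central repository, and the table names (`raw_orders`, `sales_by_region`), the columns, and the sample rows are all made up for the example.

```python
import sqlite3


def load(warehouse, rows):
    """Step 1: copy raw source rows into a central table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_id INTEGER, region TEXT, amount_cents INTEGER)"
    )
    warehouse.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def transform(warehouse):
    """Step 2: reshape the raw rows into a table that is ready to query."""
    warehouse.execute("DROP TABLE IF EXISTS sales_by_region")
    warehouse.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount_cents) / 100.0 AS total_amount "
        "FROM raw_orders GROUP BY region"
    )


def serve(warehouse):
    """Step 3: return the curated table for business users."""
    return warehouse.execute(
        "SELECT region, total_amount FROM sales_by_region ORDER BY region"
    ).fetchall()


# Hypothetical sample data: (order_id, region, amount in cents)
warehouse = sqlite3.connect(":memory:")
load(warehouse, [(1, "EU", 1250), (2, "US", 4000), (3, "EU", 750)])
transform(warehouse)
report = serve(warehouse)
print(report)
```

In a real stack each step would usually be handled by a separate tool, but the division of work stays the same.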

A data warehouse is at the core of enterprise analytics.

A data warehouse is nothing more than a database that is optimized for analytical processing. We’ll dive deeper into the data warehouse in the next section, but for now, let’s stay focused on the data loading process.

A question you might ask is “why move data to an analytical database? Isn’t the data already on a database?”

Connecting analytics tools directly to the production database can cause performance issues. The database already has to handle the production workload, so adding analytics queries on top may slow it down. Moreover, analytics involves operations such as aggregation, which are inefficient to run directly on a production database. Therefore, we usually replicate the production database by taking a “snapshot” of it at fixed intervals.

Consolidating data in one central repository is important because it makes the data easier to work with.
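As a rough sketch of the snapshot idea, the example below copies a stand-in "production" database into a separate "warehouse" database using the `backup` method of Python's built-in `sqlite3` connections. Both databases are in-memory SQLite, and the `orders` table is hypothetical. A real setup would run the copy on a schedule against actual database servers.

```python
import sqlite3


def snapshot(production, warehouse):
    """Copy the full contents of the production database into the warehouse."""
    production.backup(warehouse)


# Stand-in production database with a hypothetical orders table
production = sqlite3.connect(":memory:")
production.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount INTEGER)"
)
production.executemany("INSERT INTO orders (amount) VALUES (?)", [(10,), (25,)])
production.commit()

# Take a snapshot; analytics queries now run against the copy
warehouse = sqlite3.connect(":memory:")
snapshot(production, warehouse)

# A write made after the snapshot stays invisible to the warehouse
# until the next snapshot is taken
production.execute("INSERT INTO orders (amount) VALUES (40)")
production.commit()

production_total = production.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
warehouse_total = warehouse.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(production_total, warehouse_total)
```

The trade-off is visible in the last lines: the aggregation runs without touching production, but the warehouse is only as fresh as its latest snapshot.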

Data Processing

Data in its raw format is not usable. We need to convert it into a form that end users can work with. The aim is to serve curated content to each group of users: for example, inventory and purchase order data to Supply Chain, and cash flow and receivables data to Finance. Users will find it easier to work with their data if it’s not mixed up with everyone else’s.
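One simple way to picture curated content is a separate view per department on top of a shared table. The sketch below assumes an in-memory SQLite database. The `records` table, the view names, and the sample values are invented for illustration.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.executescript(
    """
    -- Hypothetical consolidated table holding data for every department
    CREATE TABLE records (department TEXT, metric TEXT, value INTEGER);
    INSERT INTO records VALUES
        ('supply_chain', 'inventory_units', 120),
        ('supply_chain', 'open_purchase_orders', 8),
        ('finance', 'receivables', 5400),
        ('finance', 'cashflow', -300);

    -- One curated view per group of users
    CREATE VIEW supply_chain_data AS
        SELECT metric, value FROM records WHERE department = 'supply_chain';
    CREATE VIEW finance_data AS
        SELECT metric, value FROM records WHERE department = 'finance';
    """
)

# Finance queries only its own view and never sees supply chain rows
finance_rows = warehouse.execute(
    "SELECT metric, value FROM finance_data ORDER BY metric"
).fetchall()
supply_chain_rows = warehouse.execute(
    "SELECT metric, value FROM supply_chain_data ORDER BY metric"
).fetchall()
print(finance_rows)
print(supply_chain_rows)
```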

Data Reporting

This is where a reporting/visualization tool helps. Here, we can do one of two things: build a dashboard based on the end users’ requirements and present it to them as a finished product, or give them cleaned data so they can visualize it however they want. The choice depends on how comfortable the end users are working with data. After all, they are ultimately its owners.

Analytics Stack