Data Warehouses
- Authors
- Name
- Arunabh Bora
- @arunabh223
What is a Data Warehouse?
A data warehouse is an analytics database that stores and processes data. It combines data from multiple sources and provides a centralized place where end users can query that data.
What makes a data warehouse so special?
Data warehouses specialize in analytical workloads. A single query may scan many rows and columns, so queries tend to be large and can take a lot of time and compute power.
Transactional workloads have many simple queries, whereas analytical workloads have few heavy queries.
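To make the contrast concrete, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite is a row-oriented transactional database, not a data warehouse, and the `orders` table and its values are invented purely for illustration. The point is only the shape of the two queries.

```python
import sqlite3

# Hypothetical in-memory table, used only to show the two query shapes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (id, customer, amount) VALUES (?, ?, ?)",
    [(1, "alice", 20.0), (2, "bob", 35.0), (3, "alice", 15.0), (4, "carol", 50.0)],
)

# Transactional style: a simple query that touches one row by its key.
order = conn.execute(
    "SELECT customer, amount FROM orders WHERE id = ?", (3,)
).fetchone()

# Analytical style: a heavier query that scans every row and aggregates.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(order)
print(totals)
```

The first query reads a single row, while the second has to read the whole table to compute its answer. On a table with billions of rows, that difference in access pattern is what separates the two kinds of workload.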
Because of this, the underlying architecture of data warehouses differs from that of transactional databases in a few key ways:
- Columnar storage engine: Instead of storing data row by row on disk, analytical databases store the values of each column together, so a query reads only the columns it needs.
- Compression of columnar data: Data within each column is compressed for smaller storage and faster retrieval.
- Parallelization of query executions: Modern analytical databases often run on clusters of many machines, sometimes thousands. Each analytical query can thus be split into multiple smaller queries that are executed in parallel across those machines (a divide-and-conquer strategy).
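The three ideas above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not how any particular warehouse is implemented: the sample `rows` are made up, run-length encoding stands in for the many compression schemes real engines use, and a thread pool stands in for a cluster of machines. The helper names `to_columns`, `run_length_encode`, and `parallel_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby

# Made-up sample data in row-oriented form.
rows = [
    {"country": "IN", "amount": 10},
    {"country": "IN", "amount": 20},
    {"country": "IN", "amount": 5},
    {"country": "US", "amount": 40},
    {"country": "US", "amount": 25},
]


def to_columns(rows):
    """Columnar storage: keep one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Compression: collapse runs of repeated values into (value, count)."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def parallel_sum(values, workers=2):
    """Parallel execution: sum chunks separately, then combine the results."""
    chunk = max(1, -(-len(values) // workers))  # ceiling division
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    # Threads are a stand-in for the separate machines of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, parts))


columns = to_columns(rows)
encoded_country = run_length_encode(columns["country"])
total_amount = parallel_sum(columns["amount"])

print(columns)
print(encoded_country)
print(total_amount)
```

Summing `amount` only touches the `amount` list and never reads `country`, which is the main benefit of the columnar layout. Values within one column also tend to repeat, which is why per-column compression works well.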