Data Warehouse Concepts Definition & Types.

What Data warehouse Concept is? Whether, it is a process or a system or a simply a repository with data collected from multiple sources placed into one large database? Often people confuse the subject ; with so-many interpretations available out there. However , Bill Inmon’s definition is more precise and covers the aspects of true data warehouse.

Background Reading : What Is Database?

What you will learn today

  • What Data Warehouse Is-The Definition.
  • History & Evolution of Data Warehouse ( DWH).
  • Why Data Warehouse Is Needed.
  • Types Of Data Warehouse.
  • Architecture of Data Warehouse.

Data Warehouse Definition

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process

Bill Inmon

Other famous definition came from Kimball

A data warehouse is a copy of transaction data specifically structured for query and analysis.

Data Warehouse Kimball

Both these leaders of the industry has given us a definition of a true data warehouse . Kimball covered the functional side of the DWH and Inmon covered the functional and technical side of the issue. They both have given wonderful theories on how to develop data warehouse . Go through this short article Inmon Vs Kimball to grasp about ideas .

What is a Data Warehouse

A deeper discussion on Inmon’s Definition is essential . Therefore , see below four points to grasp what data warehouse really is :

  • Subject-Oriented : Subject means the topic under analysis perspective. Finance , HR , Sales , Attendance , Leaves etc etc are all subjects that can interest Analyst to perform analysis.
  • Integrated : Multiple data sources , departments , date formats , currency conversions , time conversions all such issues are converted according to the base rule for each of them.
  • Time-Variant : Changes in data over time. Decades old data can reside in DWH and pulled for analysis if required . For an organisation working that long , they can pull the data from 10 years ago and perform analysis as well as performing sales performance from yesterday. Powerful enough! eh?
  • Non-Volatile : DWH should not change the soul of the data. It should be traceable to it’s source if necessary. Once the data loads into DWH , the only change should be the next load. Therefore, no data input from any user from outside. OLTP data will be input for the DWH.

History & Evolution Of Data Warehouse

Refer to the below info-graphics for a short introduction on Evolution Of Data Warehouse.

What Data Warehouse Is
Evolution of DWH

Do you ever look at the things and think how come humans came to conclusion that they need in this specific way ? If you don’t you should . Because , this will lead you to cause-reason analysis which is the core of every data oriented career

Why Data Warehouse Is Needed?

With arrivals of computer and their inclusion in industry , people started putting in numbers and storing data. Early on , storing data was the biggest goal. Relational DBMS was a solution to many issues and allowed abundant data to be stored in short time . In 1990s data storage was doubling every 18 months. Databases grew larger and larger. At this point , management was usually happy in seeing weekly , monthly , quarterly reports.

As the data grew and became available over the years in different forms and different departments . Simple reporting just couldn’t cut it. Information crisis appeared and lack of strategic information was hurting or stopping growth of global organisations .

DHW resolved these issues and paved the way for Data Analytics , Data Engineering and Data Science .

4 Reasons DWH is a Needed In Every Organisation:

  • Data Integrity & Security : Data is in one place and all reports have underlying from this data repository. This ensures the data has integrity and is secure.
  • Quick Reporting : You don’t need to compile data every time for every query. Rather than spending major time on preparing data for analysis . Data Analysts can spend more time on analysing it.
  • Historical Data : Forget about weekly , monthly or quarterly reports. DWH can provide you data from hour to hour or years over years. Compare year over year growth , month by month or any time depending on your requirement. See the trend lines spanning over any time period.
  • Multiple Data Sources into One : Data from Finance , HR , operations department , Sales People all integrated in one place . You can link all this data and find relationship between them. Data from different types of databases or file formats can be stored also.

All these points have world of their own. Hence , We will discuss all of this later in different articles.


Types of Data Warehouse

At an abstract level , we have three types of Data Warehouse :

  • Enterprise Data Warehouse ( DWH )
  • Operational Data Store
  • Data Marts

Which DWH to use , how to create it. Selection of tools , techniques and methods all depends on resources , requirements you have . We will discuss each and see what can you do and what could be the limitation in them.

Enterprise Data Warehouse

This type of DWH has data for the whole enterprise as in centralise in one place for decision support systems and decision makers. This means that most of the data will be in one place. Therefore , complexity for developers and maintenance will increase and size of DWH can increase dramatically.

Update Frequency/ Granularity : Depending upon requirements and granularity of DWH

Operational Data Store

This is also known as ODS and is my least favourite . Reason ODS updates nearly at real-time pace. This has a scope of enterprise but it comes with issues of its own. Computing resources and the cost of such a system can increase with the increase in data.

Update Frequency/ Granularity : Near Real time.

Data Mart 

Data Mart is a subset of DWH. Subset can base on department , locations , business unit or function

This type of DWH is easily the best kind . Data Mart can allow analysis according to the roles of users. For instance , Managers in Asia can see data only related to Asia offices because the subset is on locations.

Data Marts can behave independently and as one whatever the requirement is . This gives the flexibility and data security so that only relevant people can see the relevant data.

Update Frequency/ Granularity : Depending upon requirements and granularity of DWH


Conclusion

This article explains what truly DWH is and the types. This is all background reading and theory that clears the need of DWH. Plenty of lines can be written on each topic . Because , there has been huge debates on these topics. Therefore, we will only discuss them shortly. If interested , you can study them in detail.

Next we will discuss Architecture of DWH, OLTP Vs OLAP .

Join 2,256 other subscribers

Subscribe Us !

Leave a Reply