We are living in the age of a data revolution, and more corporations are realizing that to lead, or in some cases to survive, they need to harness their data wealth effectively. The data warehouse, as the integrated enterprise repository of data, plays an increasingly important role in this situation. There are two prominent architecture styles practiced today to build a data warehouse: the Inmon architecture and the Kimball architecture. This paper compares and contrasts the pros and cons of each architecture style and recommends which style to pursue based on certain factors.
|Published (Last):||17 June 2005|
What is the best methodology to use when creating a data warehouse? Once you decide to build a data warehouse, the next step is deciding between a normalized versus dimensional approach for the storage of data in the data warehouse. A key advantage of a dimensional approach is that the data warehouse is easier for the user to understand and to use. Also, the retrieval of data from the data warehouse tends to operate very quickly.
On the other hand, if you are used to working with a normalized approach, it can take a while to fully understand the dimensional approach and to become efficient in building one. The normalized structure divides data into entities, which creates several tables in a relational database.
When applied in large enterprises, the result is dozens of tables that are linked together by a web of joins. Each of the created entities is converted into a separate physical table when the database is implemented. The main advantage of this approach is that it is straightforward to add information into the database.
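The normalized structure described above can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module and an in-memory database; the table names (`customer`, `product`, `orders`, `order_line`) and the sample rows are assumptions made up for illustration, not taken from any particular warehouse.

```python
import sqlite3


def build_normalized_db() -> sqlite3.Connection:
    """Create a tiny normalized schema: one physical table per entity."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            order_date  TEXT NOT NULL
        );
        CREATE TABLE order_line (
            order_id   INTEGER NOT NULL REFERENCES orders(order_id),
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            quantity   INTEGER NOT NULL,
            unit_price REAL NOT NULL,
            PRIMARY KEY (order_id, product_id)
        );
        """
    )
    # Hypothetical sample data.
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?)", [(10, "Widget"), (11, "Gadget")]
    )
    conn.execute("INSERT INTO orders VALUES (100, 1, '2024-01-15')")
    conn.executemany(
        "INSERT INTO order_line VALUES (?, ?, ?, ?)",
        [(100, 10, 2, 5.0), (100, 11, 1, 20.0)],
    )
    return conn


def revenue_by_customer(conn: sqlite3.Connection) -> list:
    """Answering even a simple question needs a chain of joins."""
    return conn.execute(
        """
        SELECT c.name, SUM(ol.quantity * ol.unit_price)
        FROM order_line AS ol
        JOIN orders   AS o ON o.order_id = ol.order_id
        JOIN customer AS c ON c.customer_id = o.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    db = build_normalized_db()
    # Adding information touches exactly one table.
    db.execute("INSERT INTO product VALUES (12, 'Gizmo')")
    print(revenue_by_customer(db))
```

Adding a new product is a single insert into one table, which is the advantage noted above; the cost is that the reporting query already needs two joins in a four-table schema.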
The final step in building a data warehouse is deciding between a top-down and a bottom-up design methodology. In the bottom-up approach, data marts are created first and are eventually integrated together to create a data warehouse using a bus architecture, which consists of conformed dimensions shared between all the data marts. So the data warehouse ends up being segmented into a number of logically self-contained and consistent data marts, rather than a big and complex centralized model.
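To make the idea of conformed dimensions concrete, here is a minimal sketch in plain Python. The two marts (sales and inventory), the shared `DIM_PRODUCT` dimension, and all sample values are hypothetical; the point is only that both marts use the same product keys and attributes, so their results can be lined up side by side.

```python
from collections import defaultdict

# Conformed dimension: one definition of "product" shared by every mart.
# All names and values here are made-up sample data.
DIM_PRODUCT = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}

# Fact rows of two separate data marts, both keyed by the same product_key.
SALES_FACTS = [
    {"product_key": 1, "units_sold": 5},
    {"product_key": 2, "units_sold": 3},
    {"product_key": 3, "units_sold": 7},
]
INVENTORY_FACTS = [
    {"product_key": 1, "units_on_hand": 40},
    {"product_key": 2, "units_on_hand": 10},
    {"product_key": 3, "units_on_hand": 25},
]


def total_by_category(facts: list, measure: str) -> dict:
    """Sum one measure per product category using the shared dimension."""
    totals = defaultdict(int)
    for row in facts:
        category = DIM_PRODUCT[row["product_key"]]["category"]
        totals[category] += row[measure]
    return dict(totals)


def drill_across() -> dict:
    """Combine results from both marts on the conformed attribute."""
    sold = total_by_category(SALES_FACTS, "units_sold")
    on_hand = total_by_category(INVENTORY_FACTS, "units_on_hand")
    return {
        category: {
            "units_sold": sold.get(category, 0),
            "units_on_hand": on_hand.get(category, 0),
        }
        for category in sorted(set(sold) | set(on_hand))
    }


if __name__ == "__main__":
    print(drill_across())
```

If each mart kept its own, differently keyed product list, the final merge would need a mapping step between the two; sharing one dimension is what keeps the marts consistent with each other.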
In the top-down approach, dimensional data marts containing data needed for specific business processes or specific departments are created from the enterprise data warehouse only after the complete data warehouse has been created. The work is long-term and construction will take a long time, but the return is expected to be a long-lasting and reliable data architecture.
The Kimball approach is popular because business users can see some results quickly, at the risk that you may create duplicate data or may have to redo part of a design because there was no master plan. With Inmon there is a master plan and usually you will not have to redo anything, but it could be a while before you see any benefits, and the up-front cost is significant. Another risk is that by the time you start generating results, the business source data or the priorities have changed, and you may have to redo some work anyway.
James, you seem to be conflating architecture with methodology. Agile, iterative approaches are surely very popular with BI projects these days, and both Inmon and Kimball architectures are often implemented using an agile approach. With a normalized warehouse it is typically easier to add new data sources and evolve the warehouse model, because it is less tightly coupled to any one set of reporting requirements and because there are fewer moving parts (transformation layers) on the upstream side of the warehouse.
The downstream side, between warehouse and marts, is where decision-support business logic goes, and that is simplified too because it only has to consume data already validated and integrated into the data warehouse.
I agree with the advantage D points out. Having integrated the data into the normalized data warehouse also leads to much more consistency across the various data marts in terms of their data models and vocabulary. This is certainly the approach I prefer. Also, a small correction regarding terminology. Inmon offers no methodology for data marts. If you use Kimball's atomic data mart methodology with Inmon's CIF, you end up with two full copies of source transactions. Inmon is subject-oriented, meaning all business processes for each subject (for example, client) need to be modelled before the EDW can be a single version of the truth.
This takes a LONG time.

About James Serra: James is a big data and data warehousing solution architect at Microsoft.
Data Warehouse Concepts: Kimball vs. Inmon Approach
When it comes to designing a data warehouse for your business, the two most commonly discussed methods are the approaches introduced by Bill Inmon and Ralph Kimball. Debates on which one is better and more effective have lasted for years, but a clear-cut answer has never been reached: both philosophies have their own advantages and differentiating factors, and enterprises continue to use both. Inmon defines a data warehouse as a centralised repository for the entire enterprise. Dimensional data marts are created only after the complete data warehouse has been created. Thus, the data warehouse is at the centre of the corporate information factory (CIF), which provides a logical framework for delivering business intelligence.
Ralph Kimball Data Warehouse Architecture
When it comes to data warehouse design, two of the most widely discussed approaches are the Inmon method and the Kimball method. For years, people have debated which one is better and more effective for businesses. Initiated by Ralph Kimball, this data warehouse concept follows a bottom-up approach to data warehouse architecture design, in which data marts are formed first based on the business requirements. The primary data sources are then evaluated, and an Extract, Transform and Load (ETL) tool is used to fetch data in different formats from several sources and load it into a staging area. This model partitions data into fact tables, which hold numeric transactional data, and dimension tables, which hold the reference information that supports the facts. The star schema is the fundamental element of dimensional modeling, in which a fact table is surrounded by several dimensions. Several star schemas can be constructed within a dimensional model to fulfill various reporting needs.
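As a rough illustration of a star schema, the sketch below builds one fact table surrounded by three dimension tables using Python's standard `sqlite3` module. The table and column names (`fact_sales`, `dim_date`, `dim_product`, `dim_store`) and all rows are assumptions chosen for the example, not a prescribed design.

```python
import sqlite3


def build_star_schema() -> sqlite3.Connection:
    """Create one fact table whose keys point at three dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_key  INTEGER PRIMARY KEY,
            full_date TEXT NOT NULL,
            month     TEXT NOT NULL
        );
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL
        );
        CREATE TABLE dim_store (
            store_key  INTEGER PRIMARY KEY,
            store_name TEXT NOT NULL
        );
        CREATE TABLE fact_sales (
            date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
            product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
            store_key    INTEGER NOT NULL REFERENCES dim_store(store_key),
            sales_amount REAL NOT NULL
        );
        """
    )
    # Hypothetical sample data.
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240115, "2024-01-15", "2024-01"), (20240203, "2024-02-03", "2024-02")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")]
    )
    conn.executemany(
        "INSERT INTO dim_store VALUES (?, ?)", [(1, "Downtown"), (2, "Airport")]
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [
            (20240115, 1, 1, 10.0),
            (20240115, 2, 1, 20.0),
            (20240203, 1, 2, 15.0),
            (20240203, 1, 1, 5.0),
        ],
    )
    return conn


def sales_by_store(conn: sqlite3.Connection) -> list:
    """One join from the fact table to a single dimension."""
    return conn.execute(
        """
        SELECT s.store_name, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_key = f.store_key
        GROUP BY s.store_name
        ORDER BY s.store_name
        """
    ).fetchall()


def sales_by_month(conn: sqlite3.Connection) -> list:
    """The same fact table, sliced by a different dimension."""
    return conn.execute(
        """
        SELECT d.month, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY d.month
        ORDER BY d.month
        """
    ).fetchall()


if __name__ == "__main__":
    db = build_star_schema()
    print(sales_by_store(db))
    print(sales_by_month(db))
```

Each report joins the fact table directly to the dimension it needs, which is one reason dimensional models tend to be easy for users to query.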
Summary: in this article, we discuss the differences between the Kimball and Inmon approaches to data warehouse architecture. Both architectures have an enterprise focus that supports information analysis across the organization. This approach makes it possible to address business requirements not only within a subject area but also across subject areas. Bill Inmon recommends building the data warehouse following the top-down approach.
Dimensional modeling (DM) is part of the Business Dimensional Lifecycle methodology developed by Ralph Kimball, which includes a set of methods, techniques and concepts for use in data warehouse design. Dimensional modeling always uses the concepts of facts (measures) and dimensions (context). Facts are typically, but not always, numeric values that can be aggregated, and dimensions are groups of hierarchies and descriptors that define the facts. For example, sales amount is a fact; timestamp, product, register, store, etc. are elements of dimensions.
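The distinction between facts and dimensions can be sketched in a few lines of plain Python. The rows below reuse the example from the text (sales amount as the fact; timestamp, product, register and store as dimension elements), but the values and the `roll_up` helper are hypothetical and only meant to show that a fact is what gets aggregated while dimensions decide how it is grouped.

```python
from collections import defaultdict

# Each row carries one fact (sales_amount) plus its dimensional context.
# All values are made-up sample data.
SALES = [
    {"timestamp": "2024-01-15T09:30", "product": "Widget", "register": 1,
     "store": "Downtown", "sales_amount": 10.0},
    {"timestamp": "2024-01-15T10:05", "product": "Gadget", "register": 2,
     "store": "Downtown", "sales_amount": 20.0},
    {"timestamp": "2024-01-16T11:45", "product": "Widget", "register": 1,
     "store": "Airport", "sales_amount": 15.0},
]


def roll_up(rows: list, dimensions: list, measure: str = "sales_amount") -> dict:
    """Aggregate the measure for every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[name] for name in dimensions)
        totals[key] += row[measure]
    return dict(totals)


if __name__ == "__main__":
    print(roll_up(SALES, ["store"]))
    print(roll_up(SALES, ["product"]))
    print(roll_up(SALES, ["store", "product"]))
```

The same fact rows answer different questions depending only on which dimensions are passed in; nothing about the measure itself changes.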