Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Q 4. Regardless of the general acceptance of dimensionalmodeling, some mispercep

ID: 3610679 • Letter: Q

Question

Q 4. Regardless of the general acceptance of dimensionalmodeling, some misperceptions continue to be disseminated in theindustry. Explain following misperceptionbriefly:              Marks 15 (3 for each)

I. Dimensional models and data marts are for summarydata only

II. Dimensional models and data marts are departmental,not enterprise, solutions.

III. Dimensional models and data marts are notscalable.

IV. Dimensional models and data marts are onlyappropriate when there is a

predictable usage pattern.

V. Dimensional models and data marts can’t beintegrated and therefore lead to

stovepipe solutions.

Explanation / Answer

Dimensional models and data marts are for summary dataonly: This first myth is the root cause of manyill-designed dimensional models. Our data marts also will includecommonly requested summarized data in dimensional schemas. Thissummary data should complement the granular detail solely toprovide improved performance for common queries, but not attempt toserve as a replacement for the details. A related corollary to thisfirst myth is that only a limited amount of historical data shouldbe stored in dimensional structures. There is nothing about adimensional model that prohibits the storage of substantialhistory. The amount of history available in data marts must bedriven by the business’s requirements.

Dimensional models and data marts are departmental, notenterprise, solutions: Rather than drawing boundariesbased on organizational departments, we maintain that data martsshould be organized around business processes, such as orders,invoices, and service calls. Supporters of the normalized datawarehouse approach sometimes draw spiderweb diagrams with multipleextracts from the same source feeding into multiple data marts. Theillustration supposedly depicts the perils of proceeding without anormalized data warehouse to feed the data marts. These supporterscaution about increased costs and potential inconsistencies aschanges in the source system of record would need to be rippled toeach mart’s ETL process. This argument falls apart because noone advocates multiple extracts from the same source. The spiderwebdiagrams fail to appreciate that the data marts areprocess-centric, not department-centric, and that the data isextracted once from the operational source and presented in asingle place.

Dimensional models and data marts are notscalable: Modern fact tables have many billions of rows inthem. The dimensional models within our data marts are extremelyscalable. Relational DBMS vendors have embraced data warehousingand incorporated numerous capabilities into their products tooptimize the scalability and performance of dimensionalmodels.

Dimensional models and data marts are only appropriatewhen there is a predictable usage pattern: The dimensionalstructures in our data marts are extremely flexible and adaptive tochange. The secret to query flexibility is building the fact tablesat the most granular level. In our opinion, the source of myth 4 isthe designer struggling with fact tables that have been prematurelyaggregated based on the designer’s unfortunate belief in myth1 regarding summary data. The correct starting point for yourdimensional models is to express data at the lowest detail possiblefor maximum flexibility and extensibility.