
Q No.1 What is the role of Hierarchy Schema and Hierarchy Domain for the represe


Question

Q No.1

What is the role of Hierarchy Schema and Hierarchy Domain for the representation of OLAP dimension?

Q No.2

What are the two approaches used to handle structural heterogeneity in OLAP to adapt or transform the dimensions in order to obtain homogeneous data?

Explanation / Answer

ans 2) One of the most important approaches to the integration of data sources is based on a data warehouse architecture. In this architecture, data coming from multiple external data sources (EDSs) are extracted, filtered, merged, and stored in a central repository, called a data warehouse (DW). Data are also enriched by historical and summary information. From a technological point of view, a data warehouse is a huge database, ranging from several hundred GB to several dozen TB. Thanks to this architecture, users operate on a local, homogeneous, and centralized data repository, which reduces access time to data. Moreover, a data warehouse is independent of EDSs that may be temporarily unavailable. However, a data warehouse has to be kept up to date with respect to the content of the EDSs by being periodically refreshed.
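The extract, filter, merge, and store cycle described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `refresh`, `latest_total`, `eds_a`, `eds_b` and the `id`/`amount` fields are hypothetical and do not come from any particular warehouse product.

```python
from datetime import date

def refresh(warehouse, sources, load_date):
    """One periodic refresh: extract rows from every external data source,
    filter out incomplete rows, merge them by key, and store each version
    so that historical values are kept."""
    for source_name, rows in sources.items():            # extract
        for row in rows:
            if row.get("id") is None or row.get("amount") is None:
                continue                                 # filter incomplete rows
            versions = warehouse.setdefault(row["id"], [])   # merge by key
            versions.append({                            # store with history
                "amount": row["amount"],
                "source": source_name,
                "loaded": load_date,
            })
    return warehouse

def latest_total(warehouse):
    """Summary information: total of the most recent amount per key."""
    return sum(versions[-1]["amount"] for versions in warehouse.values())

# Hypothetical external data sources (EDSs).
sources = {
    "eds_a": [{"id": 1, "amount": 10}, {"id": 2, "amount": None}],
    "eds_b": [{"id": 3, "amount": 5}],
}
dw = refresh({}, sources, date(2024, 1, 1))
total_first = latest_total(dw)

# Second refresh: eds_a changed, eds_b is temporarily unavailable (no rows).
sources = {"eds_a": [{"id": 1, "amount": 12}], "eds_b": []}
dw = refresh(dw, sources, date(2024, 2, 1))
total_second = latest_total(dw)
```

In this sketch the row with `id` 3 stays queryable after `eds_b` drops out, which mirrors the point that the warehouse is independent of temporarily unavailable sources.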



In XML, it is possible to represent semantically similar information in multiple different
ways within one document and between documents. This leads to data heterogeneity.
XML documents suffer from three types of heterogeneity. In semantic heterogeneity,
semantically similar information is represented by different names or dissimilar information
by the same names. In syntactic content heterogeneity, semantically the same content
is expressed in different languages (French, English) or units of measurement ($, €, ¥; °F,
°C). Finally, in structural heterogeneity the same or similar data is organized in structurally
different ways, e.g., in different levels of hierarchy. In addition to those, a specific
piece of information can be represented in XML documents as a name of an element, as a
name of an attribute, or as their values. These types of heterogeneity are independent of
each other and all combinations among them may appear simultaneously. The heterogeneity
between information sources must be harmonized before meaningful data cubes can
be constructed based on XML documents. In this paper, we focus on construction of data
cubes in structurally heterogeneous XML environments. Thus, we do not consider multidimensional
analysis or OLAP per se.
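As a small illustration of structural heterogeneity, the sketch below stores the same fact in two differently structured XML documents and harmonizes both into one flat record. The documents, the `find_values` and `harmonize` helpers, and the convention that a bare element may carry its data in a `value` attribute are all assumptions made for this example, not something taken from the quoted text.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents: the same data as child elements (A) and as
# attributes at a deeper hierarchy level (B).
DOC_A = "<project><name>Alpha</name><budget>100</budget></project>"
DOC_B = '<project name="Alpha"><finance><budget value="100"/></finance></project>'

def find_values(root, key):
    """Collect values stored under `key` at any depth, whether `key` is an
    attribute name, an element with text, or an element with a `value`
    attribute (the last case is an assumed convention)."""
    values = []
    for elem in root.iter():
        if key in elem.attrib:
            values.append(elem.attrib[key])
        if elem.tag == key:
            if elem.text and elem.text.strip():
                values.append(elem.text.strip())
            elif "value" in elem.attrib:
                values.append(elem.attrib["value"])
    return values

def harmonize(xml_text):
    """Map one document to a flat record without the caller having to spell
    out the navigation path inside the XML structure."""
    root = ET.fromstring(xml_text)
    return {
        "name": find_values(root, "name")[0],
        "budget": int(find_values(root, "budget")[0]),
    }

row_a = harmonize(DOC_A)
row_b = harmonize(DOC_B)
```

Both calls yield the same record, so rows from either document could feed the same data cube. A real integration layer would need far more care with ambiguous names and repeated elements.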



3. Structural heterogeneity: the system should
3.1 not require its user to master the structural diversity in XML structures in detail or
to know which kinds of components (elements or attributes) are used to represent
the information;
3.2 not require its user to specify explicitly the navigation in XML structures;
3.3 relieve its user from writing complex structural data integration specifications;
and
3.4 therefore, execute automatically structural data integration on the basis of compact,
declarative and high-level specifications.

ans 1) Hierarchy Schema and Instance

Intuitively, a data hierarchy is a tree with each
node being a tuple over a set of attributes. A
dimension hierarchy is based on a hierarchical
attribute, also referred to as the analysis criterion,
propagated to all levels of the tree. It is
possible to impose different hierarchies within
the same dimension by defining multiple criteria;
for instance, projects can be analyzed
along the hierarchy of geographic locations or
along that of supervising institutions.
Definition 2.1.1. A hierarchical domain is a non-empty
set $V_H$ with the only defined predicates
$=$ (identity), $<$ (child/parent relationship), and
$\ll$ (transitive closure, or descendant/ancestor
relationship) such that the graph $G_<$ over the
nodes $e_i$ of $V_H$ is a tree. Attribute $A$ of $V_H$ is called
a hierarchical attribute.
A hierarchy $H$ is non-strict whenever
$\exists\, e_1, e_2, e_3 \in V_H : e_1 < e_2 \wedge e_1 < e_3 \wedge e_2 \neq e_3$, or, informally,
if any node is allowed to have more
than one parent.
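The child/parent predicate, its transitive closure, and the non-strictness test can be sketched as below. The representation (a dictionary from each node to its set of parents), the function names, and the geographic node names are assumptions chosen for illustration only.

```python
# Hypothetical hierarchical domain: each node maps to the set of its parents.
PARENTS = {
    "Konstanz": {"Germany"},
    "Berlin": {"Germany"},
    "Germany": {"Europe"},
    "Europe": set(),          # root: no parent
}

def is_child(parents, e1, e2):
    """Child/parent predicate: e2 is a direct parent of e1."""
    return e2 in parents.get(e1, set())

def is_descendant(parents, e1, e2):
    """Descendant/ancestor predicate (transitive closure of child/parent):
    e2 is reachable from e1 by one or more parent steps."""
    stack = list(parents.get(e1, ()))
    seen = set()
    while stack:
        node = stack.pop()
        if node == e2:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return False

def is_non_strict(parents):
    """Non-strict: some node has two distinct parents."""
    return any(len(p) > 1 for p in parents.values())

# A variant in which one node gets a second (hypothetical) parent.
NON_STRICT = {**PARENTS, "Konstanz": {"Germany", "Lake Region"}, "Lake Region": {"Europe"}}
```

With these definitions a node is never its own descendant unless the data contains a cycle, which matches the requirement that the graph of the child/parent relation be a tree in the strict case.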
Definition 2.1.2. A hierarchy schema $H$ is a
four-tuple $(C, \sqsubseteq_H, \top_H, \bot_H)$, where $C = \{C_j,\ j = 1, \dots, k\}$
is a set of category types, $\sqsubseteq_H$ is a partial
order on $C$, $\top_H$ is a distinguished root category
and $\bot_H$ is the bottom level of the ordering.
Predicates $=$, $\sqsubseteq$, and $\sqsubseteq^*$ specify identity,
child/parent, and descendant/ancestor relationship,
respectively, between the category types
in $C$. The only possible relation of a category
with itself is identity (i.e., the category
itself does not belong to the subset of its descendant
categories). $C_j$ is said to be a category
type in $H$, denoted $C_j \in H$, if $C_j \in C$. Thereby,
the hierarchy schema defines a skeleton of the
associated data tree, for which the following
conditions hold: