Question

Data Warehousing related questions.

List and explain briefly (in one sentence) the four basic Cloud Computing Deployment Models that we discussed in this class.

List five known security issues or threats associated with Cloud Computing.

What is PaaS? Give one example also.

What is SaaS? Give one example also.

What is IaaS? Give one example also.

What is DRaaS? ( and also give one example of a DRaaS vendor)

Name five benefits of Cloud Computing from an economic perspective.

Define Cloud Computing (referencing the NIST Definition is acceptable).

List five enabling technologies that have given rise to Cloud Computing.

List five characteristics of Cloud Computing.

List three Critical Success Factors related to the evaluation of a Data Warehouse Project.

List some ways that you can resolve the issues that arise during the ETL Processes of a Data Warehouse Project.

List three issues that may occur during the ETL Processes of a Data Warehouse Project.

List one activity that occurs in the Construction Phase of a Data Warehouse Project.

List one activity that occurs in the Design Phase of a Data Warehouse Project.

List one activity that occurs in the Planning Phase of a Data Warehouse Project.

List one activity that occurs in the Requirements Definition Phase of a Data Warehouse Project.

List one activity that occurs in the Maintenance Phase of a Data Warehouse Project.

List one activity that occurs in the Deployment Phase of a Data Warehouse Project.

In which Data Warehouse project phase do the ETL processes occur?

What is a Logical Data Map and what is its purpose during the ETL Planning Phase of a Data Warehouse Project.

Why would you drop indexes during the Load Operation of the ETL Phase of a Data Warehouse Project?

Why would you disable foreign key constraint (referential integrity) before loading data during the Load Operation of an ETL Project Phase?

What is the Staging Area that is used during the ETL phases of Data Warehouse Project?

Name one type of Source Data System that is used to extract data during the ETL phases of a Data Warehouse Project.

During a Data Warehouse Project, which tables should be loaded first? Dimension Tables or Fact Tables?

If you extract data to a file on a UNIX system, what is the bash command to sort the data?

The _____    ______ loads the data into the data warehouse. Depending on the requirements of the organization, this process ranges widely. Some data warehouses merely overwrite old information with new data.

The _____    ________ applies a series of rules or functions to the extracted data to derive the data to be loaded. Some data sources will require very little manipulation of data.

The first part of an ETL process is to __________ the data from the source systems. Most data warehousing projects consolidate data from different source systems.

When participating in the ETL efforts associated with Data Warehouse System development, a typical step is to develop the Source System Tracking Report. What is a Source System Tracking Report, and list the steps involved in the creation of a Source System Tracking Report.

When participating in the ETL efforts associated with Data Warehouse System development, a typical step is to develop the Logical Data Map. What is a Logical Data Map, and list the steps involved in the creation of a Logical Data Map.

Explanation / Answer

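Cloud Computing Deployment Models:

1. PUBLIC CLOUD: The infrastructure is owned by a cloud provider and open to the general public. Features: reasonable level of security, easy to implement, low operational cost.

2. PRIVATE CLOUD: The infrastructure is operated for a single organization. Features: high level of security, reliability and control; users get both computational resources and network access.

3. HYBRID CLOUD: A combination of two or more distinct clouds (private, community or public) that are bound together. Features: helps manage the vulnerabilities of critical data and applications, which can stay on the private side.

4. COMMUNITY CLOUD: The infrastructure is shared by several organizations with common concerns (this is the fourth deployment model in the NIST definition).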

Security Issues and Threats:

Data Breach: A breach may result from a targeted attack, application vulnerabilities, or poor security practices.

Insecure Interfaces and APIs: Provisioning, management and monitoring are done through these interfaces, so the security and availability of cloud services depend on the security of the APIs. They should be designed to protect against malicious attempts.

System Vulnerabilities: Vulnerabilities within the components of the operating system (OS) put the services and all data at risk.

Account Hijacking: Attackers who obtain a user's credentials can take over the account and misuse the user's information.

Data Loss: Data can be lost for reasons other than malicious attacks. Accidental deletion by the cloud service provider or a physical catastrophe such as an earthquake can lead to loss of data.

PaaS: Platform as a Service (PaaS) is aimed at developers, as it allows them to build applications and services over the internet.

It is hosted in the cloud, and users can access it through a web browser. The highlight of PaaS is that it can support the entire web app development cycle, from building and testing to final deployment, management and updating. Businesses can requisition resources for scaling as their demand grows, so they don’t have to invest in hardware anymore.

Example: Windows Azure (now Microsoft Azure).

SaaS: Software as a Service (SaaS) is perhaps the most commonly used cloud service model.

The vendors themselves manage the service, including applications, data, middleware, runtime, servers, storage, virtualization, networking and even the operating systems.

Example: Google Apps

IaaS:

Infrastructure as a Service, or Cloud Infrastructure, is a self-service model for accessing, managing and monitoring remote datacenter infrastructure for the following functions: compute, storage and networking.

The cloud service provider delivers the infrastructure while the clients decide on the operating system of their choice. They spin up and scale virtual machines of their choice.

Example: Google Compute Engine.

DRaaS: Disaster recovery as a service (DRaaS) is a cloud computing and backup service model that uses cloud resources to protect applications and data from disruption caused by disaster. It gives an organization a total system backup that allows for business continuity in the event of system failure.

Economic Benefits Of Cloud Computing:

1: Elimination of CAPEX and improved agility: Migrating to the cloud largely replaces major up-front capital expenditure (CAPEX) with predictable, manageable operating expenditure (OPEX). This lowers the risk associated with strategic IT projects and keeps the business agile by allowing more experimentation and entrepreneurship.

2: Scale as needed: As your applications grow, you can add storage, RAM and CPU capacity as needed.

3: Resiliency and redundancy: One of the benefits of a private cloud deployment is automatic failover between hardware platforms, together with disaster recovery services that bring up your server set in a separate data center should your primary data center experience an outage.

4: Lower maintenance costs, driven by two factors: less hardware and outsourced, shared IT staff. Because cloud computing uses fewer physical resources, there is less hardware to power and maintain. With an outsourced cloud, you don’t need to keep server, storage, network, and virtualization experts on staff full time.

5: Lower costs: Cloud computing pools computing resources and distributes them to applications as needed, which improves the efficiency and utilization of the entire shared infrastructure.
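
To make the CAPEX versus OPEX point in the first benefit more concrete, here is a minimal Python sketch that compares cumulative spend for buying hardware up front against renting capacity monthly. All figures and function names are hypothetical and chosen only for illustration; they are not real prices or vendor quotes.

```python
def cumulative_on_premises_cost(months, upfront_capex, monthly_maintenance):
    """Total spend after `months` when hardware is bought up front (CAPEX)."""
    return upfront_capex + monthly_maintenance * months


def cumulative_cloud_cost(months, monthly_fee):
    """Total spend after `months` when capacity is rented (OPEX only)."""
    return monthly_fee * months


def break_even_month(upfront_capex, monthly_maintenance, monthly_fee, horizon=120):
    """First month in which cumulative cloud spend exceeds on-premises spend.

    Returns None if that never happens within `horizon` months.
    """
    for month in range(1, horizon + 1):
        cloud = cumulative_cloud_cost(month, monthly_fee)
        on_prem = cumulative_on_premises_cost(month, upfront_capex, monthly_maintenance)
        if cloud > on_prem:
            return month
    return None


# Hypothetical figures, for illustration only.
UPFRONT_CAPEX = 60_000        # servers, storage, networking bought on day one
MONTHLY_MAINTENANCE = 500     # power, cooling, support contracts
MONTHLY_CLOUD_FEE = 2_000     # pay-as-you-go bill for equivalent capacity

if __name__ == "__main__":
    for months in (6, 12, 36):
        cloud = cumulative_cloud_cost(months, MONTHLY_CLOUD_FEE)
        on_prem = cumulative_on_premises_cost(months, UPFRONT_CAPEX, MONTHLY_MAINTENANCE)
        print(f"after {months:>2} months: cloud={cloud}, on-premises={on_prem}")
    print("break-even month:",
          break_even_month(UPFRONT_CAPEX, MONTHLY_MAINTENANCE, MONTHLY_CLOUD_FEE))
```

Under these assumed numbers, a project cancelled after six months has cost far less in the cloud than on purchased hardware, which is the lower-risk experimentation the first benefit describes. The sketch is about early exposure, not a claim about which option is cheaper over the full life of a system.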

Cloud Computing Definition: Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model is composed of five essential characteristics, three service models, and four deployment models.

Enabling Technologies that have given rise to Cloud Computing:

1: Service Oriented Architecture (SOA) and Web Services: SOA provides a general model for service interaction. It was devised to allow systems to interact with each other via services, even if the underlying implementations are very different. For example, the SOA interfaces of a system built on Microsoft .NET expose the same interface mechanisms and semantics as those of a J2EE implementation.

2: Web Services Architecture and Main Actors: In Web Services there are three main actors: a Service Consumer (seeking to use a service), a Service Registry (which helps the Service Consumer find a suitable service), and a Service Provider (which offers the service the Service Consumer wants).

3: SOAP and Web Services: Web Services can be implemented with a variety of service description languages and a variety of communications protocols and conventions. The Web Services standards call for specific implementations of these things.

Web Services Description Language (WSDL) provides a platform-independent description of a service's functionality. SOAP (Simple Object Access Protocol) is an XML protocol for Web Services. A SOAP message contains a message header and a message body.

4: Representational State Transfer (REST): REST is an alternative to SOAP in Web Services. It is an architectural style rather than a protocol: it describes a set of architectural principles by which data can be transmitted over a standardized interface (such as HTTP). REST does not add an additional messaging layer and focuses on design rules for creating stateless services.

5: Cloud Properties Enabled by Virtualization: Virtualization is the catalytic element that enables many of the breakthroughs attributed to Cloud Computing. Cloud properties enabled by virtualization include scalability (virtual machine systems scale up automatically), availability (fault tolerance for hardware and software), manageability (automatic physical-to-virtual system transformation), and performance (dynamic load balancing at the virtual machine level).
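
The difference between the SOAP and REST items above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `GetCustomer` operation, the `http://example.com/customers` namespace and the `https://example.com/api` base URL are hypothetical, and nothing is sent over the network. The SOAP side wraps the call in an XML envelope with a header and a body; the REST side expresses the same lookup as a plain HTTP request with no extra messaging layer.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_envelope(operation, params, service_ns):
    """Return a SOAP 1.1 envelope (as a string) with an empty Header and a Body.

    `operation`, `params` and `service_ns` describe a hypothetical service call.
    """
    ET.register_namespace("soap", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Header")       # message header
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")  # message body
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def build_rest_request(base_url, resource, resource_id):
    """Return an unsent HTTP GET request addressing the resource by URL.

    The request carries everything the server needs, so no session state
    has to be kept between calls (a stateless service).
    """
    return Request(f"{base_url}/{resource}/{resource_id}", method="GET")


if __name__ == "__main__":
    print(build_soap_envelope("GetCustomer", {"CustomerId": 42},
                              "http://example.com/customers"))
    request = build_rest_request("https://example.com/api", "customers", 42)
    print(request.get_method(), request.full_url)
```

In a real SOAP service the envelope would be posted to an endpoint described by the service's WSDL; that step is omitted here because it depends on the specific service.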

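Characteristics of Cloud Computing (the five essential characteristics in the NIST definition above): on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service.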

Critical Success Factors related to the evaluation of a Data Warehouse Project:

1: Sponsorship and Involvement

2: Enterprise Architecture

3: Data Warehouse Technology