
Question

Answer the following questions. Upload/submit your written answers in MS-Word as an attachment.

1. Define each of the following terms:

a) file organization
b) sequential file organization
c) hashed file organization
d) denormalization
e) composite key
f) secondary key
g) data type

2. Referring to Figure 6-11 below, write SQL data definition commands for each of the following:



a) How would you add an attribute, Class, to the Student table?
b) How would you remove the Registration table?
c) How would you change the FacultyName field from 25 characters to 40 characters?
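As a sketch of the kind of DDL these three parts call for, the following runs parts (a) and (b) against an in-memory SQLite stand-in for the Figure 6-11 schema. The column types, and the exact columns of Student and Registration, are assumptions; only the table and field names come from the question.

```python
import sqlite3

# In-memory stand-in for the Figure 6-11 schema; column lists and types are assumptions.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, StudentName VARCHAR(25))")
cur.execute("CREATE TABLE Registration (StudentID INTEGER, CourseID VARCHAR(8))")

# a) Add an attribute, Class, to the Student table.
cur.execute("ALTER TABLE Student ADD COLUMN Class VARCHAR(10)")

# b) Remove the Registration table.
cur.execute("DROP TABLE Registration")

# c) Widening FacultyName from 25 to 40 characters is DBMS-specific, e.g.:
#      ALTER TABLE Faculty ALTER COLUMN FacultyName VARCHAR(40)   -- SQL Server
#      ALTER TABLE Faculty MODIFY FacultyName VARCHAR2(40)        -- Oracle
#    (SQLite cannot change a column's declared type in place, so it is not run here.)

cols = [row[1] for row in cur.execute("PRAGMA table_info(Student)")]
print(cols)  # → ['StudentID', 'StudentName', 'Class']
```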

3. Referring to Figure 6-11 below, write SQL commands for the following:

Figure 6-11

a) Create two different forms of the INSERT command to add a student with a student ID of 65798 and last name Lopez to the Student table.
b) Now write a command that will remove Lopez from the Student table.
c) Create an SQL command that will modify the name of course ISM 4212 from Database to Introduction to Relational Databases.
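A sketch of the DML these three parts describe, run against minimal in-memory SQLite stand-ins for the Student and Course tables (the column names and types here are assumptions; the values 65798, Lopez, ISM 4212, and the course names come from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Minimal stand-ins; real Figure 6-11 tables would have more columns.
cur.execute("CREATE TABLE Student (StudentID INTEGER, StudentName VARCHAR(25))")
cur.execute("CREATE TABLE Course (CourseID VARCHAR(8), CourseName VARCHAR(40))")
cur.execute("INSERT INTO Course VALUES ('ISM 4212', 'Database')")

# a) Two forms of INSERT: with and without an explicit column list.
cur.execute("INSERT INTO Student (StudentID, StudentName) VALUES (65798, 'Lopez')")
cur.execute("INSERT INTO Student VALUES (65798, 'Lopez')")

# b) Remove Lopez from the Student table.
cur.execute("DELETE FROM Student WHERE StudentName = 'Lopez'")

# c) Rename course ISM 4212 from Database to Introduction to Relational Databases.
cur.execute("UPDATE Course SET CourseName = 'Introduction to Relational Databases' "
            "WHERE CourseID = 'ISM 4212'")

print(cur.execute("SELECT CourseName FROM Course WHERE CourseID = 'ISM 4212'").fetchone()[0])
# → Introduction to Relational Databases
```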

4. Define each of the following terms:

a) dynamic SQL
b) correlated subquery
c) embedded SQL
d) procedure
e) join

Explanation / Answer

File organization defines how file records are mapped onto disk blocks. There are four common ways to organize file records:

Heap File Organization

When a file is created using Heap File Organization, the Operating System allocates memory area to that file without any further accounting details. File records can be placed anywhere in that memory area. It is the responsibility of the software to manage the records. Heap File does not support any ordering, sequencing, or indexing on its own.

Sequential File Organization

Every file record contains a data field (attribute) to uniquely identify that record. In sequential file organization, records are placed in the file in some sequential order based on the unique key field or search key. Practically, it is not possible to store all the records sequentially in physical form.

Hash File Organization

Hash File Organization uses Hash function computation on some fields of the records. The output of the hash function determines the location of disk block where the records are to be placed.

Clustered File Organization

Clustered file organization is not considered good for large databases. In this mechanism, related records from one or more relations are kept in the same disk block, that is, the ordering of records is not based on primary key or search key.

File Operations

Operations on database files can be broadly classified into two categories:

Update operations change the data values by insertion, deletion, or update. Retrieval operations, on the other hand, do not alter the data but retrieve them after optional conditional filtering. In both types of operations, selection plays a significant role. Other than creation and deletion of a file, there could be several operations, which can be done on files.

Sequential file organization is one of the simplest methods: records are stored one after another, ordered by a key field. This can be achieved in two ways: as a pile file, where records are appended in arrival order, or as a sorted file, where records are kept in order of the key.

Each record (R1, R2, R3, and so on) contains all the attributes of one row; a student record, for instance, holds the student's id, name, address, course, DOB, and so on.
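The sorted-file flavor of sequential organization can be sketched in a few lines of Python; the record keys and payloads here are purely illustrative:

```python
import bisect

# A toy sorted file: records kept in ascending key order, so a lookup can
# use binary search instead of scanning every record.
records = []   # list of (key, payload) tuples, always sorted by key
keys = []      # parallel list of keys for bisect

def insert(key, payload):
    """Insert a record at its sorted position; later records shift right."""
    pos = bisect.bisect_left(keys, key)
    keys.insert(pos, key)
    records.insert(pos, (key, payload))

def lookup(key):
    """Binary-search the ordered records: O(log n) instead of O(n)."""
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return records[pos][1]
    return None

insert(42, {"name": "Lopez"})
insert(7, {"name": "Chen"})
insert(99, {"name": "Okafor"})
print([k for k, _ in records])  # → [7, 42, 99]
print(lookup(42))               # → {'name': 'Lopez'}
```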

Hashed File Organization

In this method of file organization, a hash function is used to calculate the address of the block in which to store a record. The hash function can be any simple or complex mathematical function, and it is applied to one or more columns/attributes, either key or non-key, to obtain the block address. Each record is therefore stored at a computed location, irrespective of the order in which records arrive, which is why this method is also known as direct or random file organization. If the hash function is computed on a key column, that column is called the hash key; if it is computed on a non-key column, that column is called a hash column.

When a record has to be retrieved, the address is generated from the hash key column and the whole record is fetched directly from that address; there is no need to traverse the whole file. Similarly, when a new record has to be inserted, the address is generated from the hash key and the record is inserted directly. The same holds for update and delete: there is no searching through the entire file and no sorting of files. Each record sits at its computed location in storage.

These types of file organization are useful in online transaction systems, where retrieval and insertion/update must be fast.
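A minimal Python sketch of the idea, with an in-memory list of buckets standing in for disk blocks (the bucket count and sample record are assumptions):

```python
# Toy hashed file: the hash of the key column, mod the number of blocks,
# picks the block. No scan or sort is ever needed.
NUM_BLOCKS = 8
blocks = [[] for _ in range(NUM_BLOCKS)]

def block_address(hash_key):
    """The hash function output determines the block address."""
    return hash(hash_key) % NUM_BLOCKS

def insert(record, hash_key):
    blocks[block_address(hash_key)].append((hash_key, record))

def retrieve(hash_key):
    # Only the one computed block is examined, never the whole file.
    for key, record in blocks[block_address(hash_key)]:
        if key == hash_key:
            return record
    return None

insert({"id": 65798, "name": "Lopez"}, 65798)
print(retrieve(65798))  # → {'id': 65798, 'name': 'Lopez'}
```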

D) Denormalization

Now that we've invested a valuable chunk of our day in learning about normalization, it's time to introduce the concept of denormalization, which is exactly what it sounds like: decreasing a schema's level of normalization. As we've learned, normalization eliminates the data redundancy within a table, which greatly reduces the risk that data may become inconsistent. Why would one wish to reverse this process?

Normalization usually comes at a cost: speed of retrieval. Before normalization, if we wanted to know a donor's name, the dates of the donations, and the name of the project, it was all right there in one record for us to pick. After normalization, we have to go traversing through three or four tables for the same information. In most cases, the extra work is worth it, considering the benefits of data consistency and reduced storage usage. However, in a few rare cases, the speed of data retrieval is the factor that trumps all others. In large databases with complex schemas, one might sometimes require data from twelve or more tables in a single query, and the application may need to perform this type of query hundreds of times per minute. In such situations, a fully normalized database may be unacceptably slow.

Denormalization should not be done early, however. It is a last desperate resort that one should turn to only after exhausting all other options (like query optimization, improved indexing, and database system tuning, all of which will be discussed later in the book). Normally, follow the simple rule:

When in doubt, normalize.

One alternative to denormalizing the base tables (the tables that make up a database) is to create a separate reporting table so that the base tables are left unaffected. For example, suppose that in our previous example, we very frequently need to retrieve a donor's name, donation ID, and the date of the donation. The query often proves to be too slow in providing results. This may not seem realistic, given that it only involves two tables and any modern RDBMS would handle this with break-neck speed, but just use your imagination. We might be tempted to add the donor's name back into our Donation table:

DonationID  DonorID  Donor           Date       Amount  ProjectID
1           1        Marco Pinelli   13Dec2002  $200    1
2           2        Victor Gomez    15Dec2002  $100    2
3           2        Victor Gomez    15Dec2002  $100    3
4           2        Victor Gomez    15Dec2002  $50     4
5           3        Seung Yong Lee  16Dec2002  $150    4

This is a heart-breaking departure from everything we've worked so hard to achieve. See the redundancy? See the wasted space?

With a separate reporting table, our three base tables of Donation, Donor, and Project remain beautifully normalized. The schema as a whole is not fully normalized, because the reporting table itself is redundant, but at least all of the redundancy is concentrated and isolated in one table, whose sole job is to provide quick access to data that comes from multiple sources. Thus in the Donor table we have:

DonorID  Name            Country
1        Marco Pinelli   Italy
2        Victor Gomez    United States
3        Seung Yong Lee  South Korea

The Project table is as follows:

ProjectID  Name                Description
1          Solar Scholars      Powering schools with solar panels
2          Pear Creek cleanup  Cleaning up litter and pollutants from Pear Creek
3          Danube Land Trust   Purchasing and preserving land in the Danube watershed
4          Forest Asia         Planting trees in Asia

The Donation table is as follows:

DonationID  DonorID  Date       Amount  ProjectID
1           1        13Dec2002  $200    1
2           2        15Dec2002  $100    2
3           2        15Dec2002  $100    3
4           2        15Dec2002  $50     4
5           3        16Dec2002  $150    4

The purpose of the three tables shown above is to store the organization's data in a way that is consistent and reliable. They are the core tables on which the organization's applications will be based. In contrast, the Report_DonorName_Date table below is designed solely to facilitate a specific report without involving the base tables. We have not abided strictly by normalization with this table, in order to provide a single location where the most frequently requested data can be accessed quickly.

The Report_DonorName_Date table is as follows:

DonationID  DonorName       Date
1           Marco Pinelli   13Dec2002
2           Victor Gomez    15Dec2002
3           Victor Gomez    15Dec2002
4           Victor Gomez    15Dec2002
5           Seung Yong Lee  16Dec2002

A reporting table is usually used as a data cache: a place to store amalgamated or semi-amalgamated data for fast access, which reduces demand on the main location where the data are stored in a raw state. Depending on the business requirements of the application, it might even be possible to fill the Report_DonorName_Date table only periodically, say overnight when the system is least busy. Others among us are not as lucky, and have to ensure that the data in the reporting table is no older than ten minutes, or even ten seconds. Even then, the reporting table offers a performance advantage: it is better to query the base tables once every ten seconds than hundreds or thousands of times per minute. Triggers, which will be discussed in Chapter 7, can be useful in keeping a reporting table up to date.
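A minimal sketch of a trigger keeping the reporting table current, run here against in-memory SQLite stand-ins for the three tables (the trigger name and the TEXT/INTEGER types are assumptions; trigger syntax varies by DBMS):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE Donor (DonorID INTEGER PRIMARY KEY, Name TEXT, Country TEXT);
CREATE TABLE Donation (DonationID INTEGER PRIMARY KEY, DonorID INTEGER,
                       Date TEXT, Amount INTEGER, ProjectID INTEGER);
CREATE TABLE Report_DonorName_Date (DonationID INTEGER, DonorName TEXT, Date TEXT);

-- Refresh the reporting table on every insert into the base table.
CREATE TRIGGER donation_to_report AFTER INSERT ON Donation
BEGIN
    INSERT INTO Report_DonorName_Date
    SELECT NEW.DonationID,
           (SELECT Name FROM Donor WHERE DonorID = NEW.DonorID),
           NEW.Date;
END;
""")
cur.execute("INSERT INTO Donor VALUES (1, 'Marco Pinelli', 'Italy')")
cur.execute("INSERT INTO Donation VALUES (1, 1, '13Dec2002', 200, 1)")
print(cur.execute("SELECT * FROM Report_DonorName_Date").fetchall())
# → [(1, 'Marco Pinelli', '13Dec2002')]
```

The application only ever inserts into Donation; the trigger does the cache maintenance.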

Denormalization is not pretty, but it is often helpful. If you absolutely must do it, then do it; but make sure you feel guilty about it, just like the professionals. A simple rule for beginners to database design is: never denormalize.

E) Composite Key

A composite key consists of more than one attribute used together to uniquely identify an entity occurrence. It differs from a compound key in that one or more of the attributes that make up the key are not simple keys in their own right.

For example, suppose you have a database holding your CD collection. One of the entities, Track, holds details of the tracks on a CD; it has a composite key of (CD name, track number).

CD name in the Track entity is a simple key linking to the CD entity, but track number is not a simple key in its own right.
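The CD-collection example can be sketched as a table definition, run here against in-memory SQLite (the album and track titles are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Track is identified by the pair (CDName, TrackNumber) together;
# neither column alone is unique.
cur.execute("""
CREATE TABLE Track (
    CDName      TEXT,
    TrackNumber INTEGER,
    Title       TEXT,
    PRIMARY KEY (CDName, TrackNumber)
)""")
cur.execute("INSERT INTO Track VALUES ('Abbey Road', 1, 'Come Together')")
cur.execute("INSERT INTO Track VALUES ('Abbey Road', 2, 'Something')")
# Same track number on a different CD is fine: only the pair must be unique.
cur.execute("INSERT INTO Track VALUES ('Revolver', 1, 'Taxman')")

try:
    cur.execute("INSERT INTO Track VALUES ('Abbey Road', 1, 'Duplicate')")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)  # the composite key forbids repeating the pair
```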

F) Secondary Key or Alternative Key

A table may have more than one choice for the primary key. Collectively these are known as candidate keys, as discussed earlier. One is selected as the primary key; those not selected are known as secondary keys or alternative keys.

For example, in the table showing candidate keys above, we identified two candidate keys: studentId and firstName + lastName. The studentId would be the most appropriate primary key, leaving the other candidate key as a secondary or alternative key. Note that for firstName + lastName to be a candidate key at all, we must assume we will never have two people with the same first and last name combination. As that assumption is unlikely to hold, firstName + lastName is a suspect candidate key, since it would restrict the data you could enter: it would seem a shame to not allow a John Smith onto a course just because there was already another John Smith.

G) Data Type

A data type is a type of data. Of course, that is a rather circular definition, and not very helpful. A better definition: a data type is a data storage format that can contain a specific type or range of values.

When computer programs store data in variables, each variable must be assigned a specific data type. Some common data types include integers, floating point numbers, characters, strings, and arrays. They may also be more specific types, such as dates, timestamps, boolean values, and varchar (variable character) formats.

Some programming languages require the programmer to define the data type of a variable before assigning it a value. Other languages can automatically assign a variable's data type when the initial data is entered into the variable. For example, if the variable "var1" is created with the value "1.25," the variable would be created as a floating point data type. If the variable is set to "Hello world!," the variable would be assigned a string data type. Most programming languages allow each variable to store a single data type. Therefore, if the variable's data type has already been set to an integer, assigning string data to the variable may cause the data to be converted to an integer format.
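Python is an example of the second kind of language described above: the data type is inferred from the value assigned. A small sketch (variable names are illustrative):

```python
# Python infers a variable's data type from the value assigned to it.
var1 = 1.25
print(type(var1).__name__)  # → float

var1 = "Hello world!"       # rebinding to a string changes the type
print(type(var1).__name__)  # → str

# In languages where the type is fixed up front, putting string data into
# an integer slot requires an explicit conversion rather than silent coercion:
count = int("42")
print(count)                # → 42
```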

Data types are also used by database applications. The fields within a database often require a specific type of data to be input. For example, a company's record for an employee may use a string data type for the employee's first and last name. The employee's date of hire would be stored in a date format, while his or her salary may be stored as an integer. By keeping the data types uniform across multiple records, database applications can easily search, sort, and compare fields in different records.

