Suppose a data file for Student (Name, SSN, Address, Phone, Email…). Given that

ID: 3916654 • Letter: S

Question

Suppose a data file for Student (Name, SSN, Address, Phone, Email…). Given that the data file’s record size is 400 bytes, disk block size is 1024 bytes, and total number of records are 3000000. For index entry, the field size is 9 bytes, and the pointer size is 8 bytes. Answer the following questions:

a. What will be the average linear search cost using data file?

b. What will be binary search cost using data file?

c. What will be binary search cost using index file (dense index)? Discuss why using indexes is almost always better.

Explanation / Answer

In our data we have block size is 1024Bytes and each record size is

400Bytes. we cant store 3 records in a same block because 400*3 = 1200Bytes

Now we strategies to store records in a block.

Strategies for storing a record file into blocks:

1.Spanned strategy: It allows partial part of record to be stored in a block.

here we store 1024/400 records in a block

i.e. 2 full records +partial 3rd record

Advantage : Space used efficiently..

Disadvantage: we need to search 2 blocks for 1 record sometimes.

It is suitable for variable length records.

2.UnSpanned Strategy:

No record can be stored in more than one block.

Here space was wasted.But number of block access reduced.

IT is Suitable for fixed length records.

Given Data:

Record Size =400Bytes

Block size =1024Bytes

Number of records is 3000000

In index

Key size = 9Bytes

Block pointer size = 8Bytes

1.Linear Search:

In Linear search we dont use index file so we linearly search on Data file.

Time complexity is depends on number of blocks in a data file.

Fixed record size so go for unspanned .

no of blocks = (Total Records )/(number ofrecords in a block)

= 30,00,000/2

= 15,00,000 Blocks

So we need to search 1.5 million blocks in linear search.

2.Binary Search:

To apply the binary search Reacords are must be in Sorting order.Otherwise

This strategy can't work.Assume it is in sorting Order Then Time complexity

is follows like this.

Here also we dont use the index file.We directly apply binary search on

data file.

Number of blocks need to search = Log2(number of blocks )

= Log2(15,00,000)

= 20.5165 = 21 block accesses.

3.Binary Search using Index file(dense):

In Index file we put Key value and Block pointer for each record.

Dense means for every key value one entry in index file.

So record size will be 9+8 =17Bytes

number of record entries in each block = 1024/17 = 60.23 = 60 record entrie per block

number of blocks in index file = 30,00,000/60 = 50000 blocks

We use binary search on this index file So number of block accesses is

= Log 2(50000) + 1 Here 1 for data file access.

= 16+1 = 17 block accesses.

Why Indexes is alway best:

Without index file we need to apply search on Data file here number of records

Block is very less . If we want to search on more records then we need to access

more blocks . It is costly.

If we have index file .Index file contain only key value and direct address of

That record address.So here number of records per block increase it reduces the number of block accesses.

Index file is already in sorted based on key values.

i.e. Indexes is always best.

Navigate

Suppose a data broker correctly identifies that your grandmother is addicted to

Suppose a dealer is receiving the floating rate and paying the fixed rate in a 2

Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.

Suppose a data file for Student (Name, SSN, Address, Phone, Email…). Given that

Question

Explanation / Answer

Related Questions

Navigate