Search engines often index their collections of documents so that they can easil

ID: 3888608 • Letter: S

Question

Search engines often index their collections of documents so that they can easily return an ordered set of documents that contain a given query word, w. Such a data structure is known as an inverted file. In order to construct an inverted file, we might start with a set of n triples of the form, (w, d, r), where w is a word, d is an identifier for a document that contains the word w, and r is a rank score for the popularity of the document d. Often, the next step is to sort these triples, ordered first by w, then by r, and then by d. Assuming each of the values, w, d, and r, are represented as integers in the range from 1 to 4n, describe a linear-time algorithm for sorting a given set of n such triples.

Explanation / Answer

The algorithm used for this is called the radix sort.

Since, we already know the range of the elements, we can use this knowledge to make a linear time sorting algorithm specifically for this case.

ALGORITHM:

Step 1: Create an array of size same as the known range.

Step 2: Go through the values in w and insert them into the array created in step 1.

Example: wi is inserted into array[i]

Finally just read out the array from index 0 to index 4n leaving the gaps. This requires a total of n basic operations. So, time complexity is linear.

Step 3. Repeat step 2 with d and r respectively.

Finally we end up with sorted triplet.

Total complexity for each step is linear. So for 3 steps, it is still linear.

If you have any query, you can in the comment section.

Navigate

Search ecampus Login Web AMT U Haul. Point of Sale Log W Amerco Intranet Login b

Search fonums Discussion Posting: E- Learning Online courses are great - or are

Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.

Search engines often index their collections of documents so that they can easil

Question

Explanation / Answer

Related Questions

Navigate