Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Q#2: There is no doubt that the most important runtime component of a scalable R

ID: 3881189 • Letter: Q

Question

Q#2: There is no doubt that the most important runtime component of a scalable RDBMS is a totally automatic query optimizer (no manual user intervention should be needed to optimize a query) – no one has the time to manually optimize all SQL statements of interest.

This question involves the Oracle “query optimizer” (aka qo).

Consider again schema SH.

A ) Browse the sales and customers tables and use a simple join to determine how many customers have between 8 and 50 sales transaction records (each record is a sales table row). Use a query such as this:

/* Determine range of customer targets for question (Q2) solution */

select count(*) , c.cust_last_name,c.cust_first_name,s.cust_id

from sh.sales s, sh.customers c

where                                                           < == You figure out the proper join condition that goes here

group by c.cust_last_name,c.cust_first_name,s.cust_id

having count(*) between 8 and 50

order by c.cust_last_name;

== > There are 142 such customers in SH.

B ) Randomly choose one of the 142 customers as your target data for parts B ) and C ).

Develop and then execute a query that retrieves the 4-tuples

customer first name, customer last name, cust_id, prod_id

Result 4-tuple rows must be sorted by prod_id.

Example partial solution NOT IN THE SOLUTION CUSTOMERS SPACE (74 rows) you must use:

CUST_FIRST_NAME        CUST_LAST_NAME   CUST_ID       PROD_ID

-----------------   --------------- -------      -------

        :                      :               :

Tony                 Damato         3096         19

Tony                 Damato         3096         31

Tony                 Damato         3096         40

Tony                 Damato         3096         40

Tony                 Damato         3096         40

Tony                 Damato         3096         46

       :                   :             :

74 rows selected.

Notice that PROD_ID can have duplicates (40 above). It means the customer has >1 sales row for the same item (PROD_ID)

C ) Using the SAME query you coded for part B ), generate the Oracle QEP (query execution plan). This displays the physical steps (aka plan) used by Oracle 11g to solve the query. We do NOT want to (re)-execute the query, so use the following syntax to generate the QEP:

set autotrace traceonly explain

     Your exact part B ) query, including the terminating “;” goes here

set autotrace off

The resulting QEP will look something like this (with Rows, Bytes, Cost and Time probably slightly different):

Execution Plan

----------------------------------------------------------

Plan hash value: 2294891630

-------------------------------------------------------------------------------------------------------------------------------------

| Id | Operation                  | Name      | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ |IN-OUT| PQ Distrib |

-------------------------------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT           |           |    10 |   290 |   165   (2)| 00:00:02 |       |       |        |      |            |

|   1 | PX COORDINATOR            |           |       |       |            |          |       |       |        |      |            |

|   2 |   PX SEND QC (ORDER)       | :TQ10002 |    10 |   290 |   165   (2)| 00:00:02 |       |      | Q1,02 | P->S | QC (ORDER) |

|   3 |    SORT ORDER BY           |           |    10 |   290 |   165   (2)| 00:00:02 |       |       | Q1,02 | PCWP |            |

|   4 |     PX RECEIVE             |           |    10 |   290 |   164   (2)| 00:00:02 |       |       | Q1,02 | PCWP |            |

|   5 |      PX SEND RANGE         | :TQ10001 |    10 |   290 |   164   (2)| 00:00:02 |       |       | Q1,01 | P->P | RANGE      |

|* 6 |       HASH JOIN            |           |    10 |   290 |   164   (2)| 00:00:02 |       |       | Q1,01 | PCWP |            |

|   7 |        PX RECEIVE          |           |     1 |    20 |    28   (0)| 00:00:01 |       |       | Q1,01 | PCWP |            |

|   8 |         PX SEND BROADCAST | :TQ10000 |     1 |    20 |    28   (0)| 00:00:01 |       |       | Q1,00 | P->P | BROADCAST |

|   9 |          PX BLOCK ITERATOR |           |     1 |    20 |    28   (0)| 00:00:01 |       |       | Q1,00 | PCWC |            |

|* 10 |           TABLE ACCESS FULL| CUSTOMERS |    1 |    20 |    28   (0)| 00:00:01 |       |       | Q1,00 | PCWP |            |

| 11 |        PX BLOCK ITERATOR   |           |   918K| 8075K|   135   (1)| 00:00:02 |     1 |    28 | Q1,01 | PCWC |            |

|* 12 |         TABLE ACCESS FULL | SALES     |   918K| 8075K|   135   (1)| 00:00:02 |     1 |    28 | Q1,01 | PCWP |            |

-------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   6 - access("S"."CUST_ID"="C"."CUST_ID")

10 - filter("C"."CUST_FIRST_NAME"='Tony' AND "C"."CUST_LAST_NAME" LIKE 'Dam%')

12 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"S"."CUST_ID"))

Explanation / Answer

Answer.)

A. Query

select count(*) , c.cust_last_name,c.cust_first_name,s.cust_id

from sh.sales s, sh.customers c

where

s.cust_id=c.cust_id

group by c.cust_last_name,c.cust_first_name,s.cust_id

having count(*) between 8 and 50

order by c.cust_last_name;

B. Query

select c.cust_first_name,c.cust_last_name,c.cust_id, s.prod_id

from sh.sales s, sh.customers c

where

s.cust_id=c.cust_id

and c.cust_id=3096         

order by s.product_id;