


Question

As usual, please turn in your do-files, as well as your type-written output.

Part 1: Merging Data

This empirical exercise uses two data sets. One data set, "School_Building Data.dta," contains data on the number of schools that were built in a district in 1974 and the number of students that were enrolled in that district in 1971. The second data set, "Student Outcome_Data.dta," contains data on the educational outcomes of adults that were observed in 1995. As a first step for the analysis, use the merge command to combine the data sets at the district level. (Hint: If your merge is successful, the _merge variable created by the merge process should only take the value 3.)

Part 2: Preparing Data for Analysis

Following Duflo (2001), calculate the age of each individual in 1974 (when the school construction took place). Create an indicator variable called post that is 1 if an individual was 2-6 when the schools were built and 0 otherwise. Create a second variable called intensity that is the number of schools built in a district scaled by (number of students/100,000).

Part 3: Run the 1st Differences-in-Differences Regression

Following Duflo (2001), for all individuals who were aged 2-6 in 1974 or aged 12-24, run the

Explanation / Answer

In order to understand match-merging, you must understand three key concepts:

BY variable

is a variable named in a BY statement.

BY value

is the value of a BY variable.

BY group

is the set of all observations with the same value for the BY variable (if there is only one BY variable). If you use more than one variable in a BY statement, then a BY group is the set of observations with a unique combination of values for those variables. In discussions of match-merging, BY groups commonly span more than one data set.
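To make the three terms concrete, here is a minimal Python sketch (not SAS) that collects BY groups across two data sets. The records and the `by_groups` helper are hypothetical and exist only for illustration; they are not the data sets discussed in this answer.

```python
from collections import defaultdict

# Hypothetical records for illustration only.
company = [
    {"Name": "Player A", "Age": 30},
    {"Name": "Player B", "Age": 41},
]
finance = [
    {"Name": "Player A", "Salary": 100},
    {"Name": "Player C", "Salary": 120},
]


def by_groups(by_vars, **data_sets):
    """Map each BY value to the observations that carry it.

    by_vars   -- list of BY variable names
    data_sets -- named lists of dict observations
    A BY value is stored as a tuple with one entry per BY variable.
    """
    groups = defaultdict(list)
    for set_name, rows in data_sets.items():
        for row in rows:
            by_value = tuple(row[var] for var in by_vars)
            # Remember which data set each observation came from.
            groups[by_value].append((set_name, row))
    return dict(groups)


groups = by_groups(["Name"], company=company, finance=finance)
for by_value, members in sorted(groups.items()):
    print(by_value, [set_name for set_name, _ in members])
```

In this sketch the BY group for "Player A" spans both data sets, while the groups for "Player B" and "Player C" each come from only one.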

For example, the director of a small repertory theater company, the Little Theater, maintains company records in two SAS data sets, COMPANY and FINANCE.

The following program creates, sorts, and displays COMPANY and FINANCE:

The following output displays the data sets. Notice that the FINANCE data set does not contain an observation for Michael Morrison.

The COMPANY and FINANCE Data Sets

To avoid having to maintain two separate data sets, the director wants to merge the records for each player from both data sets into a new data set that contains all the variables. The variable that is common to both data sets is Name. Therefore, Name is the appropriate BY variable.

The data sets are already sorted by Name, so no further sorting is required. The following program merges them by Name:

The following output displays the merged data set:
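As a rough illustration of what a match-merge by Name does, here is a minimal sketch in Python rather than SAS. The records, the Age and Salary variables, and the `match_merge` helper are all hypothetical, not the COMPANY and FINANCE data, and the sketch assumes each name appears at most once in each data set.

```python
def match_merge(left, right, by):
    """Combine two lists of dict observations on the BY variable `by`.

    Assumes at most one observation per BY value in each input.
    Every BY value found in either input yields one merged observation;
    variables with no matching observation are left as None (missing).
    """
    # Collect every variable name, preserving first-seen order.
    all_vars = []
    for rows in (left, right):
        for row in rows:
            for var in row:
                if var not in all_vars:
                    all_vars.append(var)

    left_index = {row[by]: row for row in left}
    right_index = {row[by]: row for row in right}

    merged = []
    for key in sorted(set(left_index) | set(right_index)):
        obs = dict.fromkeys(all_vars)        # start with all values missing
        obs.update(left_index.get(key, {}))  # fill from the first data set
        obs.update(right_index.get(key, {})) # then from the second
        merged.append(obs)
    return merged


# Hypothetical records: "Player B" has no observation in `finance`.
company = [
    {"Name": "Player A", "Age": 30},
    {"Name": "Player B", "Age": 41},
]
finance = [
    {"Name": "Player A", "Salary": 100},
]

merged = match_merge(company, finance, by="Name")
for obs in merged:
    print(obs)
```

The player with no finance record still appears in the merged result, with Salary missing, which mirrors how an observation that is absent from one data set is handled in a match-merge.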

Match-Merging

Input SAS Data Set for Examples

The Little Theater has a third data set, REPERTORY, that tracks the casting assignments in each of the season's plays. REPERTORY contains these variables:

Play

is the name of one of the plays in the repertory.

Role

is the name of a character in Play.

The third variable is the employee ID number of the player playing Role.

The following program creates and displays REPERTORY:

The following output displays the REPERTORY data set:

