Question
Assignment Background

One commonly hears reference to "the one percent," referring to the people whose income is in the top 1% of incomes. What is the data behind that number, and where do others fall? Using the National Average Wage Index (AWI), an index used by the Social Security Administration to gauge an individual's earnings for the purpose of calculating their retirement benefit, we can answer such questions.

In this project, you will process AWI data. Example data for 2014 and 2015 is provided in the files year2014.txt and year2015.txt (2015 is the most recent year of complete data; the 2016 data isn't available until October). The data is a table with the first row as the title and the second row defining the data fields; the remaining rows are data. Note that the 2014 data is nicely formatted in columns, but the 2015 data is not. The URL for the data is: htts:/www.ssa sovicgi 201

Here is the second line of data from the file, followed by descriptions of the data. Notice that some data are ints and some are floats:

5,000.00 - 9,999.99 13,848,841 36,423,281 23.02549 102,586,913,092.61 7,407.62

Column 0 is the bottom of this income range.
Column 1 is the hyphen separating the bottom of the range from the top.
Column 2 is the top of this income range.
Column 3 is the number of individuals in the income range.
Column 4 is the cumulative number of individuals in this income range and all lower ranges.
Column 5 is the Column 4 value represented as a cumulative percentage of all individuals.
Column 6 is the combined income of all the individuals in this range of income.
Column 7 is the average income of individuals in this range of income.

Explanation / Answer
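Before reading the full program, it may help to see the column layout applied to a single row. The following is only a minimal sketch: the helper name `parse_row` and the dictionary keys are made up for illustration, and the sample string assumes the row quoted in the question is whitespace-separated with the hyphen as its own column.

```python
def parse_row(row):
    """Parse one whitespace-separated data row into a dict (hypothetical helper)."""
    fields = row.split()
    if len(fields) != 8:
        raise ValueError('expected 8 columns, got ' + str(len(fields)))

    def num(text):
        # Strip thousands separators before converting
        return float(text.replace(',', ''))

    # fields[1] is the hyphen between bottom and top, so it is skipped
    return {
        'bottom': num(fields[0]),
        'top': num(fields[2]),
        'count': int(fields[3].replace(',', '')),
        'cumulative_count': int(fields[4].replace(',', '')),
        'cumulative_percent': num(fields[5]),
        'combined_income': num(fields[6]),
        'average_income': num(fields[7]),
    }


# Sample row in the layout described in the question (assumed spacing)
sample = '5,000.00 - 9,999.99 13,848,841 36,423,281 23.02549 102,586,913,092.61 7,407.62'
row = parse_row(sample)
print(row['bottom'], row['top'], row['count'])
# Sanity check: combined income divided by count should be close to the average
print(row['combined_income'] / row['count'], row['average_income'])
```

The final row of the real files ("... and over") has a different layout and would need special handling, which the read_file function in the answer takes care of in its own way.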
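import pylab  # Plotting interface that ships with matplotlib

MIN_YEAR = 1990
MAX_YEAR = 2015  # The question names 2015 as the most recent complete year
PLOT_RANGE = 40  # Number of income ranges to plot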
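
def open_file():
    """Prompt for a year until its data file opens; return [file pointer, year]."""
    prompt = 'Enter a year where ' + str(MIN_YEAR) + ' <= year <= ' + str(MAX_YEAR) + ': '
    while True:  # Loop instead of recursing, so a retry still returns a value
        try:
            year = int(input(prompt))
        except ValueError:  # Input was not an integer
            print('Error in year. Please try again. ')
            continue
        if year not in range(MIN_YEAR, MAX_YEAR + 1):
            print('Error in year. Please try again. ')
            continue
        filename = 'year' + str(year) + '.txt'
        try:
            return [open(filename, 'r'), year]  # Return file pointer and year
        except OSError:  # File is missing or unreadable
            print('Error in file name: ' + filename + '. Please try again. ')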
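
def read_file():
    """Read the chosen year's file; return [list of numeric rows, year]."""
    [fp, year] = open_file()  # Open file
    data_list = []  # Initialize data structure
    with fp:  # Close the file once it has been read
        lines = fp.readlines()[2:]  # Skip the title and column-header rows
    for line in lines:  # Loop through each data line of the file
        line = line.split()  # Split on whitespace
        if not line:  # Skip blank lines
            continue
        line.pop(1)  # Remove hyphen for range ('and' in the last row)
        data_list.append(line)  # Add to data structure
    # Row layout is now: bottom, top, count, cumulative count,
    # cumulative percent, combined income, average income
    data_list[-1][1] = '-1'  # Placeholder so 'over' survives the conversion
    for line in data_list:  # Convert each entry to numbers
        for i in range(7):
            line[i] = line[i].replace(',', '')
            if i in [2, 3]:  # Counts of individuals are ints
                line[i] = int(line[i])
            else:
                line[i] = float(line[i])
    data_list[-1][1] = float('Inf')  # Replace 'over' by infinity
    return [data_list, year]  # Return data structure and year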

def get_range(data_list, percent):
    for line in data_list:
        if line[4] >= percent:  # Check cumulative percentage is >= percent
            return ([line[0], line[1]], line[4], line[6])  # Return tuple

def get_percent(data_list, income):
    for line in data_list:
        # Compare directly: int(float('Inf')) would raise OverflowError on the last row
        if line[0] <= income <= line[1]:  # Check if income is in range
            return ([line[0], line[1]], line[4], line[6])  # Return tuple

def find_average(data_list):
    num = sum(line[5] for line in data_list)  # Sum all entries for total salary
    denum = data_list[-1][3]  # Get total number of individuals
    average = int(num / denum)
    average = '{:,}'.format(average)  # Format with ','
    return average
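
def find_median(data_list):
    # Estimate: average income of the range whose cumulative percent is nearest 50
    bot = top = data_list[0]  # Defaults in case the first row already reaches 50%
    for line in data_list:
        if line[4] < 50:  # Find values (closest below and above 50)
            bot = line
        else:
            top = line
            break
    if abs(bot[4] - 50) < abs(top[4] - 50):  # Check which one is closest to 50
        return '{:,}'.format(bot[6])
    else:
        return '{:,}'.format(top[6])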

def do_plot(x_vals, y_vals, year):
    pylab.xlabel('Income')
    pylab.ylabel('Cumulative Percent')
    title = 'Cumulative Percent for Income in ' + str(year)
    pylab.title(title)
    pylab.plot(x_vals, y_vals)
    pylab.show()

# Main program starts here
def main():
    [data_list, year] = read_file()
    print(' For the year ' + str(year) + ':')
    avg = find_average(data_list)  # Get average
    med = find_median(data_list)  # Get median
    print('The average income was $' + str(avg))
    print('The median income was $' + str(med) + ' ')
    x_vals = [line[0] for line in data_list][:PLOT_RANGE]  # Fetch x values
    y_vals = [line[4] for line in data_list][:PLOT_RANGE]  # Fetch y values
    do_plot(x_vals, y_vals, year)
    menu = 'Enter a choice to get (r)ange, (p)ercent, or nothing to stop: '
    cmd = input(menu)
    while cmd != '':
        if cmd == 'r':
            pct = input('Enter a percentage: ')
            result = get_range(data_list, float(pct))
            if result is None:  # No range reaches this percentage (pct > 100)
                print('Error in percentage. Please try again. ')
            else:
                inc = result[0][1] + 0.01  # Just above the top of the range
                print(str(pct) + '% of incomes are below $' + '{:,.2f}'.format(inc) + '. ')
        elif cmd == 'p':
            inc = input('Enter an income: ')
            result = get_percent(data_list, int(inc))
            if result is None:  # Income falls outside every range
                print('Error in income. Please try again. ')
            else:
                print('An income of $' + str(inc) + ' is in the top ' + str(result[1]) + '% of incomes. ')
        else:
            print('Invalid command. Try again. ')
        cmd = input(menu)


main()  # Runs program
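
The range and percent lookups can also be tried without a data file or any prompts. The sketch below is an illustration only: the table values are made up (they are not AWI data), and the function names `range_for_percent`, `percent_for_income`, and `average_income` are hypothetical stand-ins for the lookups above. The row layout is assumed to match what read_file returns.

```python
INF = float('inf')

# Hypothetical rows, NOT real AWI data. Assumed layout:
# [bottom, top, count, cumulative count, cumulative percent,
#  combined income, average income]
TOY_DATA = [
    [0.01, 4999.99, 2, 2, 20.0, 5000.0, 2500.0],
    [5000.00, 9999.99, 3, 5, 50.0, 21000.0, 7000.0],
    [10000.00, INF, 5, 10, 100.0, 100000.0, 20000.0],
]


def range_for_percent(rows, percent):
    """Return the first row whose cumulative percent reaches percent, else None."""
    for row in rows:
        if row[4] >= percent:
            return row
    return None


def percent_for_income(rows, income):
    """Return the cumulative percent of the range containing income, else None."""
    for row in rows:
        if row[0] <= income <= row[1]:  # Works for the open-ended last row too
            return row[4]
    return None


def average_income(rows):
    """Total combined income divided by the total number of individuals."""
    return sum(row[5] for row in rows) / rows[-1][3]


print(range_for_percent(TOY_DATA, 40)[:2])   # [5000.0, 9999.99]
print(percent_for_income(TOY_DATA, 7500))    # 50.0
print(percent_for_income(TOY_DATA, 1e9))     # 100.0
print(range_for_percent(TOY_DATA, 101))      # None
print(average_income(TOY_DATA))              # 12600.0
```

Comparing against the range bounds directly keeps the open-ended last row (top of infinity) from needing special treatment, and returning None for an out-of-range request lets the caller decide how to report it.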