Using Python / Pandas, in a new Jupyter Notebook please do the following
Question
Using Python / Pandas, in a new Jupyter Notebook please do the following:
1.) Create a print statement to output the number of data rows in the csv file. Format output as "File has %d rows".
2.) In a new cell, display a dataframe containing all series, sorted by "Average Household Size" (high-to-low or low-to-high is fine).
3.) In a new cell, display a dataframe containing only the 'Zipcode' and 'Median Age' series, sorted by 'Median Age' low-to-high. (Hint: ascending = True parameter...)
4.) In a new cell, display the above data frame but filter out all 'Median Age' values that are less than 1. (We don't want the '0' cases)
5.) Finally, in a new cell, show the Zip Code, Total Population, and Total Households for the top ten zip codes ranked by Total Population.
CSV file: https://files.fm/u/v6y96f9g
Explanation / Answer
# import the csv module
import csv
# name of the CSV file
nameoffile = "ela.csv"
# initialize the field names and rows
fields = []
rows = []
# read the file
with open(nameoffile, 'r', newline='') as csvfile:
    # create the reader object
    reader = csv.reader(csvfile)
    # extract the field names from the first row
    fields = next(reader)
    # extract the data rows one by one
    for row in reader:
        rows.append(row)
# print the number of data rows (the header row is not counted)
print("File has %d rows" % len(rows))
# print the field names
print('Field names are: ' + ', '.join(fields))
# print the first 5 rows
print('First 5 rows are:')
for row in rows[:5]:
    # go through each column of the row
    for col in row:
        print("%10s" % col, end=' ')
    print()
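Since the question asks for pandas, the row count can also be obtained by loading the file into a DataFrame. The following is a minimal sketch, not a definitive solution. The column names are assumed from the wording of the question, and the inline sample rows are invented placeholders standing in for the real CSV, which is not reproduced here.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV; the values below are invented
# for illustration, and the column names are assumed from the question.
sample_csv = io.StringIO(
    "Zipcode,Median Age,Total Population,Total Households,Average Household Size\n"
    "90001,26.6,5000,1200,4.2\n"
    "90002,25.5,4000,1000,4.0\n"
    "90003,0.0,0,0,0.0\n"
)

# In the notebook, pass the path of the downloaded file instead of sample_csv.
df = pd.read_csv(sample_csv)

# len(df) counts data rows only; the header line is not included.
row_count = len(df)
message = "File has %d rows" % row_count
print(message)
```

In the notebook itself, replacing sample_csv with the filename used above ("ela.csv") should be enough, provided the file sits next to the notebook.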
# import the pandas library, aliased as pd
import pandas as pd
# datfr is an empty dataframe
datfr = pd.DataFrame()
print(datfr)
# build a small sample DataFrame with three columns
du = {'householdsize': [4, 5, 6],
      'Age': [45, 35, 89],
      'zipcode': [20000, 788890, 899090]}
df = pd.DataFrame(du, columns=['householdsize', 'Age', 'zipcode'])
print("Our dataframe is:")
print(df)
# to sort by age ascending, then by household size descending
# (done before the column is deleted below)
result = df.sort_values(['Age', 'householdsize'], ascending=[True, False])
print(result)
# using the del statement to remove a column
print("Deleting the first column using del:")
del df['householdsize']
print(df)
# to filter out the rows where the age is 0 (keep ages of 1 or more)
df_filtered = df[df['Age'] >= 1]
print(df_filtered)
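Steps 2 through 5 of the question can be sketched directly against the CSV's own columns. This is a hedged example under stated assumptions: the column names ('Zipcode', 'Median Age', 'Total Population', 'Total Households', 'Average Household Size') are taken from the wording of the question and may differ slightly in the actual file, and the sample rows are invented placeholders rather than real data.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV; values are invented for illustration
# and column names are assumed from the question text.
sample_csv = io.StringIO(
    "Zipcode,Median Age,Total Population,Total Households,Average Household Size\n"
    "90001,26.6,5000,1200,4.2\n"
    "90002,25.5,4000,1000,4.0\n"
    "90003,0.0,0,0,0.0\n"
    "90004,34.8,6000,2500,2.4\n"
)
df = pd.read_csv(sample_csv)

# Step 2: all columns, sorted by average household size (high-to-low here).
by_household_size = df.sort_values("Average Household Size", ascending=False)
print(by_household_size)

# Step 3: only 'Zipcode' and 'Median Age', sorted by median age low-to-high.
age_by_zip = df[["Zipcode", "Median Age"]].sort_values("Median Age", ascending=True)
print(age_by_zip)

# Step 4: the same frame, keeping only rows whose median age is at least 1.
age_by_zip_nonzero = age_by_zip[age_by_zip["Median Age"] >= 1]
print(age_by_zip_nonzero)

# Step 5: the ten most populous zip codes, restricted to three columns.
top_ten = df.nlargest(10, "Total Population")[
    ["Zipcode", "Total Population", "Total Households"]
]
print(top_ten)
```

In a notebook, each step would go in its own cell, and ending the cell with the frame's name (instead of print) renders it as a table. Note that sort_values returns a new DataFrame, so the original df is left unchanged between steps.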