What's wrong with this Python code?
Question
What's wrong with this Python code?
# This program removes names and email addresses occurring in a given input file and saves it in an output file.
import re
def deidentify():
    infilename = input("Give the input file name: ")
    outfilename = input("Give the output file name: ")
    infile = open(infilename,"r")
    outfile = open(outfilename,"w")
    for line in infile:
        detail=line.split(",")
        name=detail[1];
        email=detail[2];
        # replace names
        nameRE = "(Ms.|Mr.|Dr.|Prof.) [A-Z][a-z](.|+) [A-Z][a-z](|+) [A-Z][a-z](|+) [A-Z][a-z]"
        # improve this regular expression
        deidentified_name = re.sub(nameRE,"**name**",text)
        print(deidentified_name,file=outfile)
        # replace email addresses
        emialRE = "Email id:"
        deidentified_email = re.sub(emailRE,"**email**",mail)
        print(deidentified_name,file=outfile)
    outfile.close()
    infile.close()
deidentify()
Explanation / Answer
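As far as I can tell, the main problem is with your regular expression. In nameRE, the groups (.|+) and (|+) put a + where there is nothing to repeat, so the pattern is not a valid regular expression, and the dots in (Ms.|Mr.|Dr.|Prof.) are unescaped, so they match any character rather than a literal period. The code also passes the undefined variables text and mail to re.sub, defines emialRE but then uses emailRE, and prints deidentified_name a second time where deidentified_email was intended.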
use this
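One possible corrected version is sketched below. It is only a sketch under stated assumptions: the list of titles, the "one to four capitalized words" shape of a name, and the simple email pattern are all guesses about the input data, and the helper name deidentify_text is hypothetical. Unlike the code in the question, deidentify takes the file names as parameters instead of calling input().

```python
import re

# Assumed name shape: a title, then one to four capitalized words.
NAME_RE = re.compile(r"\b(?:Ms|Mrs|Mr|Dr|Prof)\.? [A-Z][a-z]+(?: [A-Z][a-z]+){0,3}")

# Deliberately simple email pattern (an assumption, not a full address grammar).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def deidentify_text(text):
    """Return text with names and email addresses replaced by placeholders."""
    text = NAME_RE.sub("**name**", text)
    return EMAIL_RE.sub("**email**", text)


def deidentify(infilename, outfilename):
    """Copy infilename to outfilename line by line, masking names and emails."""
    with open(infilename, "r") as infile, open(outfilename, "w") as outfile:
        for line in infile:
            # Each line keeps its own newline, so write() is enough here.
            outfile.write(deidentify_text(line))
```

Running the substitution on the whole line, rather than on detail[1] and detail[2] separately, avoids the undefined text and mail variables and writes each line only once.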
If you use objects or dictionaries, this situation can be handled more easily than with re.sub.
When I do this kind of operation in Java, I usually prefer objects, which handle it easily.
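The dictionary idea could look roughly like the following in Python. This is a minimal sketch, assuming every line has the comma-separated layout id, name, email (inferred from detail[1] and detail[2] in the question); FIELDNAMES, PLACEHOLDERS and deidentify_rows are hypothetical names introduced for illustration.

```python
import csv
import io

# Hypothetical column layout, inferred from detail[1] and detail[2] in the question.
FIELDNAMES = ["id", "name", "email"]
PLACEHOLDERS = {"name": "**name**", "email": "**email**"}


def deidentify_rows(infile, outfile):
    """Read comma-separated rows from infile and write them to outfile with name and email masked."""
    reader = csv.DictReader(infile, fieldnames=FIELDNAMES)
    writer = csv.DictWriter(
        outfile, fieldnames=FIELDNAMES, extrasaction="ignore", lineterminator="\n"
    )
    for row in reader:
        # Overwrite the sensitive fields by key; no pattern matching is needed.
        row.update(PLACEHOLDERS)
        writer.writerow(row)


# Usage with in-memory text standing in for the input and output files.
sample = io.StringIO("1,Dr. John Smith,john@example.com\n2,Ms. Jane Doe,jane@example.com\n")
result = io.StringIO()
deidentify_rows(sample, result)
print(result.getvalue(), end="")
```

This only works when the sensitive values sit in known columns. Names or addresses embedded in free text would still need a pattern-based pass. With real files, the csv module documentation recommends opening them with newline="".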