An e-mail filter is planned to separate valid e-mails from spam. The word free o
Question
An e-mail filter is planned to separate valid e-mails from spam. The word "free" occurs in p% of the spam messages and only n% of the valid messages. Also, m% of the messages are spam. (a) What is the probability that a randomly chosen message contains the word "free"? (b) What is the probability that, of three incoming valid messages, one contains the word "free"? (c) If the e-mail filter categorised all e-mails containing the word "free" as spam, what is the probability that a spam message is filtered correctly? (d) Given that an e-mail is categorised as spam, what is the probability that it is a false alarm?
Explanation / Answer
Let F = the message contains the word "free", S = the message is spam, V = the message is valid.
The word "free" occurs in p% of the spam messages, therefore P(F|S) = 0.01p.
The word "free" occurs in n% of the valid messages, therefore P(F|V) = 0.01n.
m% of the messages are spam, therefore P(S) = 0.01m.
Every message is either spam or valid, therefore P(V) = 1 - P(S) = 1 - 0.01m.
a) By the law of total probability, the probability that a randomly chosen message contains the word "free" is
P(F) = P(F|S) P(S) + P(F|V) P(V) = (0.01p)*(0.01m) + (0.01n)*(1 - 0.01m)
P(F) = 0.0001pm + 0.01n - 0.0001nm
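As a quick numeric check of the total-probability formula, here is a minimal Python sketch. The function name `prob_free` and the sample values p = 60, n = 4, m = 20 are assumptions chosen for illustration; the question leaves p, n and m symbolic.

```python
def prob_free(p, n, m):
    """P(F) by the law of total probability; p, n, m are percentages (0 to 100)."""
    p_f_given_s = 0.01 * p   # P(F|S)
    p_f_given_v = 0.01 * n   # P(F|V)
    p_s = 0.01 * m           # P(S)
    p_v = 1 - p_s            # P(V)
    return p_f_given_s * p_s + p_f_given_v * p_v


def prob_free_expanded(p, n, m):
    """The same quantity written in the expanded form 0.0001pm + 0.01n - 0.0001nm."""
    return 0.0001 * p * m + 0.01 * n - 0.0001 * n * m


# Hypothetical values, not taken from the question.
print(prob_free(60, 4, 20))           # about 0.152
print(prob_free_expanded(60, 4, 20))  # same value up to rounding
```

Both forms should agree up to floating-point rounding for any percentages between 0 and 100.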
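b) There are 3 incoming valid messages, and each contains the word "free" with probability P(F|V) = 0.01n.
Assuming the messages are independent, the number of them that contain "free" is binomial with 3 trials and success probability 0.01n.
The probability that exactly one of the three valid messages contains the word "free" is
P(exactly 1 of 3 contains F | all V) = C(3,1)*(0.01n)*(1 - 0.01n)^2 = 0.03n*(1 - 0.01n)^2
If "one of them" is read as "at least one", the probability is instead 1 - (1 - 0.01n)^3.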
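Treating the three valid messages as independent trials, each containing "free" with probability 0.01n, the count of messages with "free" is binomial. The sketch below evaluates that binomial probability; the function names and the sample value n = 4 are assumptions for illustration only.

```python
from math import comb


def prob_exactly_k_free(n, k=1, trials=3):
    """Binomial probability that exactly k of `trials` valid messages contain "free".

    n is the percentage of valid messages containing "free" (0 to 100).
    Assumes the messages are independent.
    """
    q = 0.01 * n  # P(F|V)
    return comb(trials, k) * q**k * (1 - q) ** (trials - k)


def prob_at_least_one_free(n, trials=3):
    """Complement of "none of the valid messages contains 'free'"."""
    return 1 - (1 - 0.01 * n) ** trials


# Hypothetical value n = 4, not taken from the question.
print(prob_exactly_k_free(4))     # about 0.110592
print(prob_at_least_one_free(4))  # about 0.115264
```

Unlike the product 3 * 0.01n on its own, the binomial expression always stays between 0 and 1, which is a useful sanity check.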
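c) The filter flags a message as spam exactly when it contains the word "free". By Bayes' theorem, the probability that a flagged message really is spam is
P(S|F) = (P(F|S)*P(S))/P(F) = ((0.01p)*(0.01m))/(0.0001pm + 0.01n - 0.0001nm)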
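The Bayes' theorem step for P(S|F) can be sketched numerically as follows. The function name `prob_spam_given_free` and the sample values p = 60, n = 4, m = 20 are assumptions for illustration; the sketch also assumes P(F) > 0, since otherwise the conditional probability is undefined.

```python
def prob_spam_given_free(p, n, m):
    """P(S|F) by Bayes' theorem; p, n, m are percentages (0 to 100)."""
    joint_spam = (0.01 * p) * (0.01 * m)       # P(F|S) * P(S)
    joint_valid = (0.01 * n) * (1 - 0.01 * m)  # P(F|V) * P(V)
    p_f = joint_spam + joint_valid             # P(F), law of total probability
    if p_f == 0:
        raise ValueError("P(F) is zero, so P(S|F) is undefined")
    return joint_spam / p_f


# Hypothetical values, not taken from the question.
print(prob_spam_given_free(60, 4, 20))  # about 0.789
```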
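d) A false alarm is a valid message that the filter categorises as spam, that is, a valid message containing the word "free". Given that an e-mail is categorised as spam, the probability that it is a false alarm is
P(V|F) = (P(F|V)*P(V))/P(F)
P(V|F) = ((0.01n)*(1 - 0.01m))/(0.0001pm + 0.01n - 0.0001nm) = 1 - P(S|F)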
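Reading a false alarm as a valid message that the filter flags because it contains "free", the quantity of interest is P(V|F), the complement of P(S|F). The sketch below computes it with exact fractions so the result is free of rounding; the function name and the integer sample values p = 60, n = 4, m = 20 are assumptions for illustration.

```python
from fractions import Fraction


def prob_false_alarm(p, n, m):
    """P(V|F): share of flagged messages that are actually valid.

    p, n, m are integer percentages (0 to 100). Assumes P(F) > 0.
    """
    joint_spam = Fraction(p, 100) * Fraction(m, 100)         # P(F|S) * P(S)
    joint_valid = Fraction(n, 100) * (1 - Fraction(m, 100))  # P(F|V) * P(V)
    return joint_valid / (joint_spam + joint_valid)


# Hypothetical values, not taken from the question.
false_alarm = prob_false_alarm(60, 4, 20)
print(false_alarm)      # → 4/19
print(1 - false_alarm)  # → 15/19, which is P(S|F)
```

With these sample values roughly 21% of flagged messages would be false alarms, even though only 4% of valid messages contain the word.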