Here is a Python dictionary of the relative frequency of letters in English text

ID: 3884856 • Letter: H

Question

Here is a Python dictionary of the relative frequency of letters in English text:

{ "A": .08167, "B": .01492, "C": .02782, "D": .04253, "E": .12702, "F": .02228, "G": .02015, "H": .06094, "I": .06996, "J": .00153, "K": .00772, "L": .04025, "M": .02406, "N": .06749, "O": .07507, "P": .01929, "Q": .00095, "R": .05987, "S": .06327, "T": .09056, "U": .02758, "V": .00978, "W": .02360, "X": .00150, "Y": .01974, "Z": .00074 }

Here is some plaintext:

ethicslawanduniversitypolicieswarningtodefendasystemyouneedtobeabletot
hinklikeanattackerandthatincludesunderstandingtechniquesthatcanbeusedt
ocompromisesecurityhoweverusingthosetechniquesintherealworldmayviolate
thelawortheuniversitysrulesanditmaybeunethicalundersomecircumstancesev
enprobingforweaknessesmayresultinseverepenaltiesuptoandincludingexpuls
ioncivilfinesandjailtimeourpolicyineecsisthatyoumustrespecttheprivacya
ndpropertyrightsofothersatalltimesorelseyouwillfailthecourseactinglawf
ullyandethicallyisyourresponsibilitycarefullyreadthecomputerfraudandab
useactcfaaafederalstatutethatbroadlycriminalizescomputerintrusionthisi
soneofseverallawsthatgovernhackingunderstandwhatthelawprohibitsifindou
btwecanreferyoutoanattorneypleasereviewitsspoliciesonresponsibleuseoft
echnologyresourcesandcaenspolicydocumentsforguidelinesconcerningproper

The population variance of a finite population X of size N and mean \mu is given by

$$\mathrm{Var}(X) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2.$$

(a) What is the population variance of the relative letter frequencies in English text?
(b) What is the population variance of the relative letter frequencies in the given plaintext?
(c) For each of the following keys (yz, xyz, wxyz, vwxyz, uvwxyz), encrypt the plaintext with a Vigenere cipher and the given key, then calculate and report the population variance of the relative letter frequencies in the resulting ciphertext. Describe and briefly explain the trend in this sequence of variances.
(d) Viewing a Vigenere key of length k as a collection of k independent Caesar ciphers, calculate the mean of the frequency variances of the ciphertext for each one. (E.g., for key yz, calculate the frequency variance of the even numbered ciphertext characters and the frequency variance of the odd numbered ciphertext characters. Then take their mean.) Report the result for each key in part (c). Is the mean variance like those observed in part (b)? Part (c)? Briefly explain.
(e) Consider the ciphertext that was produced with key uvwxyz. In part (d), you calculated the mean of six variances for this key. Revisit that ciphertext, and calculate the mean of the frequency variances that arise if you had assumed that the key had length 2, 3, 4, and 5. Does this suggest a variant to the Kasiski attack? (Don't say no!) Briefly explain.

Explanation / Answer

To compute the population variance, we first find the mean (mu) of the values, which is their sum divided by the number of values. The variance is then the average of the squared differences from the mean.
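As a small worked illustration of this definition, here is a minimal sketch. The eight sample values are made up for the example, and `population_variance` is a hypothetical helper name that does not appear in the program further down.

```python
from statistics import pvariance

def population_variance(values):
    """Average of the squared differences from the mean (divide by N, not N - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

# Made-up sample: the mean is 40 / 8 = 5, and the squared
# deviations (9, 1, 1, 1, 0, 0, 4, 16) sum to 32.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(sample))   # 32 / 8 = 4.0
print(pvariance(sample))             # the standard library gives the same value
```

Note the division by N rather than N - 1: the question asks for the population variance, not the sample variance.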

A. For the relative frequencies given in the question, the result is as follows (rounded to seven decimal places)

Mean is = 0.0384727
Population Size = 26
Population Variance = 0.0010406

B. The count of each letter in the given plaintext (840 letters in total) is
{'A': 64, 'B': 11, 'C': 38, 'D': 28, 'E': 102, 'F': 15, 'G': 12, 'H': 27, 'I': 68, 'J': 1, 'K': 5, 'L': 44, 'M': 15, 'N': 62, 'O': 53, 'P': 21, 'Q': 2, 'R': 53, 'S': 62, 'T': 69, 'U': 39, 'V': 11, 'W': 13, 'X': 1, 'Y': 23, 'Z': 1}

Dividing each count by 840 gives the relative frequencies, for which (rounded to seven decimal places)

Mean is = 0.0384615
Population Size = 26
Population Variance = 0.0010070

(Problems C and D) For these parts, first encrypt the plaintext with each key; reference Vigenere implementations are easy to find online, or you can write your own. Then pass the resulting ciphertext to the program below, exactly as was done for problems A and B. If anything is unclear, please mention it in a comment and I will update the answer.
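For part (c), the encrypt-then-measure step might look like the following minimal sketch. It assumes the text and keys contain only lowercase letters a to z and that key letter a means a shift of 0. The names `vigenere_encrypt` and `frequency_variance` are hypothetical helpers, and the short `plaintext` is only a stand-in for the full text from the question.

```python
from collections import Counter
from string import ascii_lowercase

def vigenere_encrypt(plaintext, key):
    """Shift each letter by the matching key letter (a = 0, ..., z = 25), repeating the key."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ord(key[i % len(key)]) - ord("a")
        out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
    return "".join(out)

def frequency_variance(text):
    """Population variance of the 26 relative letter frequencies of `text`."""
    counts = Counter(text)
    freqs = [counts[c] / len(text) for c in ascii_lowercase]
    mean = sum(freqs) / 26
    return sum((f - mean) ** 2 for f in freqs) / 26

# Short stand-in text; substitute the full plaintext from the question.
plaintext = "ethicslawanduniversitypolicies"
for key in ["yz", "xyz", "wxyz", "vwxyz", "uvwxyz"]:
    ciphertext = vigenere_encrypt(plaintext, key)
    print(key, frequency_variance(ciphertext))
```

A single-letter key is a plain Caesar shift. It only relabels the letters, so the variance of the frequencies is unchanged. Longer keys mix several shifted alphabets together, which is what flattens the ciphertext distribution.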

Below is the python program to compute the variance of the given population data
----------------------------------------------------------------------------------------------------------------

from math import sqrt

def findMean(populationData, totalSize):
    # Arithmetic mean of the dictionary's values; None for an empty population
    if totalSize <= 0:
        return None
    return sum(populationData.values()) / totalSize


def calculateVariance(populationData):
    totalSize = len(populationData)
    if totalSize == 0:
        print("Population is empty")
        return
    mean = findMean(populationData, totalSize)

    # Variance = 1/N * summation( square(xi-mean) ), where i = 1 to N
    variance = 0.0
    for key in populationData:
        variance += (populationData[key] - mean) ** 2

    variance /= totalSize
    standardDeviation = sqrt(variance)

    print("Mean is =", mean)
    print("Population Size =", totalSize)
    print("Population Variance =", variance)
    print("Standard Deviation =", standardDeviation)


# To find out the frequency (count) of each letter in the given text
def findFrequencyFromText(sampleText):
    alphabets = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    sampleText = sampleText.upper()

    # initialize each letter's count to zero
    populationData = {character: 0 for character in alphabets}

    # count occurrences of each letter, ignoring any other character
    for character in sampleText:
        if character in populationData:
            populationData[character] += 1

    return populationData


# Problem-1
print("Problem - 1")
populationData = {"A": 0.08167, "B": 0.01492, "C": 0.02782, "D": 0.04253, "E": 0.12702, "F": 0.02228,
                  "G": 0.02015, "H": 0.06094, "I": 0.06996, "J": 0.00153, "K": 0.00772, "L": 0.04025,
                  "M": 0.02406, "N": 0.06749, "O": 0.07507, "P": 0.01929, "Q": 0.00095, "R": 0.05987,
                  "S": 0.06327, "T": 0.09056, "U": 0.02758, "V": 0.00978, "W": 0.02360, "X": 0.00150,
                  "Y": 0.01974, "Z": 0.00074}
calculateVariance(populationData)


# Problem-2
print("Problem - 2")
sampleText = (
    "ethicslawanduniversitypolicieswarningtodefendasystemyouneedtobeabletot"
    "hinklikeanattackerandthatincludesunderstandingtechniquesthatcanbeusedt"
    "ocompromisesecurityhoweverusingthosetechniquesintherealworldmayviolate"
    "thelawortheuniversitysrulesanditmaybeunethicalundersomecircumstancesev"
    "enprobingforweaknessesmayresultinseverepenaltiesuptoandincludingexpuls"
    "ioncivilfinesandjailtimeourpolicyineecsisthatyoumustrespecttheprivacya"
    "ndpropertyrightsofothersatalltimesorelseyouwillfailthecourseactinglawf"
    "ullyandethicallyisyourresponsibilitycarefullyreadthecomputerfraudandab"
    "useactcfaaafederalstatutethatbroadlycriminalizescomputerintrusionthisi"
    "soneofseverallawsthatgovernhackingunderstandwhatthelawprohibitsifindou"
    "btwecanreferyoutoanattorneypleasereviewitsspoliciesonresponsibleuseoft"
    "echnologyresourcesandcaenspolicydocumentsforguidelinesconcerningproper"
)
populationData = findFrequencyFromText(sampleText)
print("Count of each letter in the given text is", populationData)

# The question asks about relative frequencies, so divide each count by the text length
totalLetters = sum(populationData.values())
relativeFrequency = {letter: count / totalLetters for letter, count in populationData.items()}
calculateVariance(relativeFrequency)
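Parts (d) and (e) of the question ask for the mean of the per-slice frequency variances under an assumed key length. The following is a minimal sketch of that calculation, not part of the original answer. `frequency_variance` and `mean_slice_variance` are hypothetical helper names, and the toy text "ab" repeated is a stand-in chosen only because its period of 2 is easy to see; substitute the real ciphertext.

```python
from collections import Counter
from string import ascii_lowercase

def frequency_variance(text):
    """Population variance of the 26 relative letter frequencies of `text`."""
    counts = Counter(text)
    freqs = [counts[c] / len(text) for c in ascii_lowercase]
    mean = sum(freqs) / 26
    return sum((f - mean) ** 2 for f in freqs) / 26

def mean_slice_variance(ciphertext, assumed_key_length):
    """Split the text into interleaved slices (one per key position) and average their variances."""
    slices = [ciphertext[i::assumed_key_length] for i in range(assumed_key_length)]
    return sum(frequency_variance(s) for s in slices) / assumed_key_length

# Toy stand-in with an obvious period of 2; replace with the uvwxyz ciphertext.
ciphertext = "ab" * 30
for k in range(1, 7):
    # Even values of k line up with the period, so every slice is a single
    # repeated letter and the mean variance is at its highest.
    print(k, mean_slice_variance(ciphertext, k))
```

On a real Vigenere ciphertext, one would expect the mean slice variance to rise toward the plaintext's variance when the assumed length matches the key length or a multiple of it. That is the sense in which part (e) hints at a key-length test to use alongside the Kasiski attack.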
