Create an algorithm that compares timings for functions and determines whether o
ID: 3746986 • Letter: C
Question
Create an algorithm that compares timings for functions and determines whether or not a string is an anagram:
INSTRUCTIONS (Please read through all of this)
-The program should have anagram Solution1, 2, 3, and 4 represent these four strategies as included; checking letters off from a first string that occur within the second string; sorting and then comparing the two strings; comparing the first string with each possible permutation of the letters in the second string; and counting the frequency of each letter in each string.
-Write sentences and paragraphs in your .txt or .docx discussion file, your discussion should be no longer than one page long.
Also included in the instructions:
If you're going to help, PLEASEEE explain what your code is doing with #comments in the lines please!!!
https://pastebin.com/Sk2Jdyfw
import random #stringGenerator.py
import string
import time
from itertools import permutations
def anagramSolution1(s1,s2):
alist = list(s2)
pos1 = 0
stillOK = True
while pos1 < len(s1) and stillOK:
pos2 = 0
found = False
while pos2 < len(alist) and not found:
if s1[pos1] == alist[pos2]:
found = True
else:
pos2 = pos2 + 1
if found:
alist[pos2] = None
else:
stillOK = False
pos1 = pos1 + 1
return stillOK
def anagramSolution2(s1,s2):
alist1 = list(s1)
alist2 = list(s2)
alist1.sort()
alist2.sort()
pos = 0
matches = True
while pos < len(s1) and matches:
if alist1[pos]==alist2[pos]:
pos = pos + 1
else:
matches = False
return matches
#anagramSolutions3 is not implemented in Miller notes
def anagramSolution3(s1,s2):
perms = [''.join(p) for p in permutations(s2)]
found=False
for s in perms:
if s1 == s:
found = True
break
return found
def anagramSolution4(s1,s2):
c1 = [0]*26
c2 = [0]*26
for i in range(len(s1)):
pos = ord(s1[i])-ord('a')
c1[pos] = c1[pos] + 1
for i in range(len(s2)):
pos = ord(s2[i])-ord('a')
c2[pos] = c2[pos] + 1
j = 0
stillOK = True
while j<26 and stillOK:
if c1[j]==c2[j]:
j = j + 1
else:
stillOK = False
return stillOK
def buildMyString(n):
s=""
for i in range(n):
aletter=random.choice(string.letters)
s=s+aletter
return(s)
#-- the root (main) program starts here
string.letters = 'abcdefghijklmopqrstuvwxyz'
n=11
s1=buildMyString(n)
if random.uniform(0.0,1.0)<.5:
r=random.randrange(0,n-1)
aletter = random.choice(string.letters)
s2=s1[0:r]+aletter+s1[r+1:n]
else:
s2=s1
print("s1 is: ",s1)
print("s2 is: ",s2)
print(" ...timing anagramSolution1 with n = ",n)
start=time.time()
print(anagramSolution1(s1,s2))
print(time.time()-start)
print(" ...timing anagramSolution2 with n = ",n)
start=time.time()
print(anagramSolution2(s1,s2))
print(time.time()-start)
print(" ...timing anagramSolution3 with n = ",n)
start=time.time()
print(anagramSolution3(s1,s2))
print(time.time()-start)
print(" ...timing anagramSolution4 with n = ",n)
start=time.time()
print(anagramSolution4(s1,s2))
print(time.time()-start)
Compare timings for function calls that determine whether strings are anagrams for various choices of string length n (say n 10, 100, 1000, etc) with anagramSolutions 1, 2, and 4 until "one gets into size trouble." Do this by commenting out the call to anagramSolution3. Execute multiple times for various n, particularly when n is a value that seems to be the size limit of n Subsequently, determine timings for various choices of n (say n 5, 10, 11, 12,...) in anagramSolution3 until one "gets into trouble." Again, execute multiple times for various n, particularly when n is a value that seems to be the size limit of n. Obviously, as the size of the strings, s1 and s2, gets larger than, say, 30 or 40, you will not want the program to print the strings.Explanation / Answer
The code that you have provided is already complete so, I am not able to understand what should be my new code and thus, what should I comment. So, I am trying to put comments in this code only that you have provided. And I am making the '.txt' discussion file for you.
COMMENTED CODE:
import random #stringGenerator.py
import string
import time
from itertools import permutations
## anagram solution 1:
# checks letters off from s1 that occur within s2
def anagramSolution1(s1,s2):
# make a list of all characters of s2
alist = list(s2)
# pos1 indicates which character of s1 is being accessed
pos1 = 0
# stillOK becomes false when no matching character was found in s2
stillOK = True
# loop through all characters of s1
# while we do not find a character that is not present in other string
# if we find such character, then there is no need to check further
while pos1 < len(s1) and stillOK:
# pos2 indicates which character of s2 is being accessed
pos2 = 0
# found becomes true is we find a similar character in s2
found = False
# check this char of s1 with each char of s2
while pos2 < len(alist) and not found:
# if found a similar character, set found to true
# else point to the next character of s2
if s1[pos1] == alist[pos2]:
found = True
else:
pos2 = pos2 + 1
# if a matching character is found, remove it from the list
# so that it does not get matched again
# else if no matching character was found, set stillOK to false
if found:
alist[pos2] = None
else:
stillOK = False
pos1 = pos1 + 1
return stillOK
## anagram solution 2:
# sort and then compare the two strings
def anagramSolution2(s1,s2):
# store the strings in lists and then sort them
alist1 = list(s1)
alist2 = list(s2)
alist1.sort()
alist2.sort()
# match each character of s1 to its corresponding character in s2
pos = 0
matches = True
while pos < len(s1) and matches:
# if characters match, check next character
# else, set matches to False
if alist1[pos]==alist2[pos]:
pos = pos + 1
else:
matches = False
return matches
#anagramSolutions3 is not implemented in Miller notes
## anagram solution 3:
# compare the first string with each possible permutation
# of the letters in the second string
def anagramSolution3(s1,s2):
# make a list of all permutaions of s2
perms = [''.join(p) for p in permutations(s2)]
found=False
# if any permutation is similar, set found to True
for s in perms:
if s1 == s:
found = True
break
return found
## anagram solution 4:
# count the frequency of each letter in each string
def anagramSolution4(s1,s2):
# make two lists of 26 elements
# because there are 26 letters in english alphabets
c1 = [0]*26
c2 = [0]*26
# for each character in s1
for i in range(len(s1)):
# find out the pos where it corresponds to in the list
pos = ord(s1[i])-ord('a')
# increase the count of pos
c1[pos] = c1[pos] + 1
# similar steps of s2
for i in range(len(s2)):
pos = ord(s2[i])-ord('a')
c2[pos] = c2[pos] + 1
# check both the lists
# if all the indices have same count, then s1 and s2 are anagrams
j = 0
stillOK = True
while j<26 and stillOK:
if c1[j]==c2[j]:
j = j + 1
else:
stillOK = False
return stillOK
def buildMyString(n):
s=""
for i in range(n):
aletter=random.choice(string.letters)
s=s+aletter
return(s)
#-- the root (main) program starts here
string.letters = 'abcdefghijklmopqrstuvwxyz'
n=11
# creates random string s1
s1=buildMyString(n)
# creates random string s2
# 50% of the times s1 is exactly same as s2
# 50% of the times s2 has one character that is different from s1
if random.uniform(0.0,1.0)<.5:
r=random.randrange(0,n-1)
aletter = random.choice(string.letters)
s2=s1[0:r]+aletter+s1[r+1:n]
else:
s2=s1
print("s1 is: ",s1)
print("s2 is: ",s2)
print(" ...timing anagramSolution1 with n = ",n)
start=time.time()
print(anagramSolution1(s1,s2))
print(time.time()-start)
print(" ...timing anagramSolution2 with n = ",n)
start=time.time()
print(anagramSolution2(s1,s2))
print(time.time()-start)
print(" ...timing anagramSolution3 with n = ",n)
start=time.time()
print(anagramSolution3(s1,s2))
print(time.time()-start)
print(" ...timing anagramSolution4 with n = ",n)
start=time.time()
print(anagramSolution4(s1,s2))
print(time.time()-start)
DISCUSSION.TXT
If we comment the anagramSolution3 and run anagramSolutions 1, 2 and 4, these are the
n = 10: All algorithms take 0.0 seconds
n = 100: All algorithms tae around 0.001 to 0.005 seconds
n = 1000: anagramSolution 1 takes 0.2 to 0.3 seconds. While anagramSolution 3 and 4 still take around 0.001 to 0.005 seconds
n = 5000: anagramSolution now takes more than 5-6 seconds while rest two algorithms take around 0.01 seconds.
n = 10000: anagramSolution now takes around half a minute but other two still take around 0.015 seconds.
If we comment all other algorithms except anagramSolution3, the program runs in around 3 seconds for n = 10 but we get into trouble for n = 11. The system starts lagging and we get no answer for more than 1 minute.
This happens because anagramSolution1 is an algorithm of O(n2) complexity. While anagramSolution2 and anagramSolution4 have a complexity of O(nlogn). This is the reason why their time does not increase with increase in n.
For anagramSolution3, the complexity is O(n!) which increases very rapidly with the increase in the value of n. Thus, this algorithm leads us to trouble very quickly.
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.