Skip to content

Bumblebee - Rachana P #22

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
133 changes: 129 additions & 4 deletions adagrams/game.py
Original file line number Diff line number Diff line change
@@ -1,13 +1,138 @@
from random import randint

letter_dist = [{"A":9}, {"B":2}, {"C": 2}, {"D": 4}, {"E": 12}, {"F": 2},
{"G": 3}, {"H": 2}, {"I" : 9}, {"J": 1}, {"K": 1}, {"L": 4},
{"M": 2}, {"N": 6}, {"O": 8}, {"P": 2}, {"Q": 1}, {"R": 6},
{"S": 4}, {"T": 6}, {"U": 4}, {"V": 2}, {"W": 2}, {"X": 1},
{"Y": 2}, {"Z": 1}]

score_chart = {
("A", "E", "I", "O", "U", "L", "N", "R", "S", "T"): 1,
("D", "G") :2,
("B", "C", "M", "P"): 3,
("F", "H", "V", "W", "Y"): 4,
("K"): 5,
("J", "X"): 8,
("Q", "Z"): 10
}


def draw_letters():
pass
#use dictionary generated_dict to keep track of the frequency of each letter
generated_dict = {}
#use randint to access a random dictionary in list letter_dist
count = 0
while count < 10:
char_max_dict = letter_dist[randint(0, len(letter_dist)-1)]
for char, max_allowed in char_max_dict.items():
Comment on lines +26 to +27

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The for loop and the use of items here is a direct result of storing the letter and count information in a dict. Even though we know there's only one pair in each dict, we don't have a ready way of getting that specific pair. If we instead stored tuples with the letter and count information as described above, these lines could become

        char, max_allowed = letter_dist[randint(0, len(letter_dist)-1)]

if char not in generated_dict:
generated_dict[char] = 1
count += 1
else:
if max_allowed > generated_dict[char]:
generated_dict[char] += 1
count += 1
generated_hand = [k for k, v in generated_dict.items() for _ in range(v)]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice use of a comprehension to expand the letters and counts out to a list containing the appropriate number of copies of each.

Consider using more descriptive variables in the comprehension, such as

    generated_hand = [letter for letter, count in generated_dict.items() for _ in range(count)]

which would help describe a little bit more what's going on. Further, consider moving that to a helper function and calling it here, since this is a little difficult to read as a comprehension.

# elsewhere
def expand_letter_counts_to_hand(leter_count_dict):
    return [letter for letter, count in leter_count_dict.items() for _ in range(count)]

    # then here
    generated_hand = expand_letter_counts_to_hand(generated_dict)

This tells us what's going on without the need to decipher the comprehension's intent. but if we need the details, we can go look in the function.


return generated_hand



def uses_available_letters(word, letter_bank):
pass
word = word.upper()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're pretty sure the tiles will always be upper case (since we are the ones who provide for the original tile drawing and the tiles are all uppercase), but the word input comes from a player and could be in lowercase. There's an argument to be made that perhaps there should be another step which "normalizes" the user input to a collection of valid tiles, but the project provided tests with lowercase words, so we need to handle this case. Uppercasing the word ensures that the letters within it should match our tile letters (assuming they are valid letters to begin with).

Note that due to the short expected length of the guessed word (it should be no larger than the hand size), uppercasing the entire word in one go should pose no problems. However, for arbitrary data, we might choose to uppercase the individual letters as we are iterating over them. When we discuss Big O, we'll see that uppercasing a string creates a copy of the entire string all at once, meaning that the amount of memory our program uses is proportional to the length of the word. If we uppercase a single letter, then there is only one letter's-worth of a copy in use at a time, meaning our memory uses doesn't depend on the length of the word.

Now here, we need to use additional memory for frequency maps anyway, so this is a superfluous concern, but these are the sorts of things we'll be wanting to think about and understand why we should or shouldn't worry about them, going forward.

word_list = []

for char in word:
word_list.append(char)
Comment on lines +45 to +46

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to repackage a string into a list, we can do so as

     word_list = list(word)

since list() iterates over its argument. Iterating over a string gives us its letters, one by one.

But also notice that the only reason we need to convert our word to a list is to conform to the type-hinting of the frequency map helper. We could adjust the hints to accept either a list of letters, or a string. While that might seem odd, the key realization is that really, whether working with a list of letters or a string, we are working with a sequence of letters. We can iterate and index into a string just as we can a list of letters, so having a function that can handle either is a good fit, at least in python. And this would remove the need for converting the word into a list entirely.


word_list_dict = create_letter_freq_dict(word_list)
bank_dict = create_letter_freq_dict(letter_bank)

for letter, freq in word_list_dict.items():
if letter not in bank_dict.keys() or freq > bank_dict[letter]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in for a dict checks the keys by default, s typically we don't explicitly list .keys() when checking whether something is in (or not in) a dictionary.

return False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍As soon as we find one invalid letter, we know the whole word is invalid and we can return false!


return True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍If we get all the way through the word, meaning we found no invalid letters, then the whole word is valid!



def score_word(word):
pass
word = word.upper()
total_score = 0
if len(word) > 6:
total_score += 8
Comment on lines +61 to +62

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we could introduce CONSTANTs to make it more self-documenting what the purpose of these values is.

    # earlier
    BONUS_MIN_LENGTH = 7
    LENGTH_BONUS_POINTS = 8

    if len(word) >= BONUS_MIN_LENGTH:
        total_score += LENGTH_BONUS_POINTS

for letter in word:
for group, points in score_chart.items():
if letter in group:
total_score += points
Comment on lines +64 to +66

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 Notice that even though the letter score data is in a dictionary, due to how the data was placed in it, we need to iterate over the dictionary contents rather than performing a key lookup. Here, we could have placed the score content in a list of tuples and this code would be more or less unchanged, which indicates we're not using a dictionary to its strengths (key lookup).

The data value we're trying to find the score for is a single letter, but the keys in the dictionary are tuples of letters, meaning we can't know which key to use for our letter, which is why we end up needing to iterate through the whole dictionary, looking for the key which contains our letter.

Instead, if we're going to use a dictionary to hold our score data (which is a great idea), we should store the data in a way that makes use of the strengths of a dictionary. The way we store the data might not so closely resemble the way data is presented in a specification document (our project description), since that's written for a human audience. So if the item we're trying to locate in the score data is the score for a single letter, can we set up our dictionary with a different key-value relationship that better makes use of its strengths?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dictionary where the key-value pairs are the letter and respective store would've let us directly access score data


return total_score


def get_highest_word_score(word_list):
pass
score_tracker = {}
for given_word in word_list:
score_for_item = score_word(given_word)
score_tracker[given_word] = score_for_item

winning_score = -1
winning_word = ""
ties = []

# get highest value or add values to ties list
for word_item, points in score_tracker.items():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍Great idea to have this logic focus on looking for scores and leaving the tiebreaking until later.

There are still some tricky points that we need to consider here, especially around building up (and potentially clearing) our list of ties as we continue to find new scores in the data.

We might take this a step further and have one initial loop that just finds the best score, then a second loop that just finds all the words having that specific score (rather than having a changing idea of what the best score is as we iterate). Then finally apply our tiebreaking logic if necessary.

It might feel like an approach that does everything in fewer loops will be more efficient, but our intuition here can be deceiving. We'll begin to see with Big O analysis that code that looks like it has to do more may be equivalent to (or even more efficient than) an all-in-one approach!


if points > winning_score:
winning_score = points
winning_word = word_item
Comment on lines +84 to +86

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 When we find a word with a new best score, this will still leave the ties list intact. If a better score comes along, then that means anything that we previous thought was a tie is no longer a tie. How can we stop tracking those words that were previously considered to be ties?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can clear out the ties list


if len(ties) > 0:
ties.clear()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Yep. We needed to call the method. This resets the ties that we're tracking to only reflect the new highest score.

Notice that Python didn't give a squeek about us referencing a method without calling it. It was perfectly content to let us retrieve the method reference, and then basically throw it away without using it. So we need to be really careful of this in general. Adding additional tests (such as for the scenario I pointed out in my previous comment) that depend on the function being properly called can help us notice this.


elif points == winning_score:
# add the earlier value to ties list first to preserve word order from word_list in ties list
ties.append(winning_word)
ties.append(word_item)


#use tiebreaker func if there are ties
if len(ties) > 0:
winning_word = tiebreaker(ties)
# winning_score = score_tracker[winning_word]

return (winning_word, winning_score)


def tiebreaker(tied_list):
# shortest_len_word = None
shortest_word = tied_list[0]

for tied_words in tied_list:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer a loop control variable with a singular name, like tied_word (or even just word since all the words in this function are implicitly tied). Using a plural makes it sound like tied_words is holding a collection of words, which becomes confusing when reading the code below.

# shortest_word = None

#even if there are multiples w length of 10, the first in the list will be the winner
if len(tied_words) == 10:
winner = tied_words
break
Comment on lines +113 to +115

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 As soon as we find a 10-letter word (since these were inserted in the scoring order) we know that we've found the best word in the tie.

else:
if len(tied_words) < len(shortest_word):
shortest_word = tied_words
winner = shortest_word
# break

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to remove logic over commenting it out. Commented logic left in code has a tendency to become uncommented as later devs come along and may get confused about whether the code was intentionally commented.

if len(tied_words) == len(shortest_word):
if tied_list.index(tied_words) < tied_list.index(shortest_word):
winner = tied_words
else:
winner = shortest_word
return winner



def create_letter_freq_dict(letter_bank):
letter_count = {}
#store the letters in given list as a dictionary w frequency
for elem in letter_bank:
if elem in letter_count.keys():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in for a dict checks the keys by default, s typically we don't explicitly list .keys() when checking whether something is in a dictionary.

letter_count[elem] += 1
else:
letter_count[elem] = 1
return letter_count
32 changes: 32 additions & 0 deletions tests/test_wave_04.py
Original file line number Diff line number Diff line change
Expand Up @@ -118,3 +118,35 @@ def test_get_highest_word_does_not_return_early_after_first_tiebreaker():
# Assert
assert best_word[0] == "JQ"
assert best_word[1] == 18


def test_check_many_ties_returns_shortest_unsorted():
# Arrange
words = ["AAAA", "DD", "F"]

# Act
best_word = get_highest_word_score(words)

# Assert
assert best_word[0] == "F"
assert best_word[1] == 4

def test_check_many_ties_returns_shortest_unsorted_again():
# Arrange
words = ["BCD", "DDDD", "J", "FF"]

# Act
best_word = get_highest_word_score(words)

# Assert
assert best_word[0] == "J"
assert best_word[1] == 8

def test_check_case_best_word_not_in_ties():
words = ["AAAAAAAAAA", "BBBBBB", "BBBBBBB", "GGGGGGG"]
# Act
best_word = get_highest_word_score(words)

# Assert
assert best_word[0] == "BBBBBBB"
assert best_word[1] == 29