Skip to content

Bumblebee - Rachana P #22

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Conversation

rachanapatel
Copy link

No description provided.

Copy link

@anselrognlie anselrognlie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've got a lot of good ideas here! Please review my comments, and let me know if you have any questions.

adagrams/game.py Outdated
@@ -1,13 +1,127 @@
from random import randint

def draw_letters():
pass
letter_dist: list[dict] = [{"A":9}, {"B":2}, {"C": 2}, {"D": 4}, {"E": 12}, {"F": 2},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious what prompted you to include type hints. Have you run across them in your own studies? We will not be covering type hints in the curriculum aside from a brief tangent in Unit 2, so while you may feel free to practice using them in individual work, please do not use type hints in any group work.

But in general, I'd recommend setting type hints aside for the time being. The curriculum work was not created with type hinting in mind, and so you may find yourself fighting with the type hinting to get things to work properly more than working on the actual project materials.

For Python in particular, type hints are somewhat problematic. They can help our editors provide better suggestions, but Python essentially ignores the type hints when the code is actually running, so it doesn't natively provide any additional safety. To get the most out of type hints, we need to run external programs, such as mypy to confirm that we are following the type hints properly. To make things more confusing, different type checking utilities can report different issues with our type hints.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a variable that shouldn't change, especially one that's global, prefer to use CONSTANT_CASE.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you're using a list as the main structure so that you are able to access a random entry by its index, since we are only picking random numbers. Each entry is serving to associate a letter (tile) with its available count, and there is only one letter-count pair in each position.

As a result, consider using a nested tuple or list rather than a nested dict. If we don't know the keys in a dictionary, it can be cumbersome to retrieve the values from it, but if we use a pair, then we would know that the letter is position 0 and the count is in position 1. Using dictionaries when there is only a single pair in each is less well-suited to this data arrangement.

LETTER_COUNTS = [("A", 9), ("B", 2}, ("C", 2), ..., ("Z", 1)]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, came across type hinting online and thought it could be useful for a reader but I see what you mean

adagrams/game.py Outdated
generated_dict = {}
#use randint to access a random dictionary in list letter_dist
while len(generated_dict.keys()) < 10:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 This will loop until 10 disctinct letters have been chosen, no matter how many times each letter was chosen, resulting in a hand of 10 distinct letters rather than 10 letters, some of which may be duplicated.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it's unnecessary to do len of the keys collection. We will get the same result as getting the len of the entire directory.

Though the key issue here is that using the length of the dictionary will result in the tile distribution issue mentioned above, so I'm just mentioning this as a general point, not to imply that changing how the length of the dictionary is being determined is the issue at hand.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the while loop to rely on a counter that increases when a valid letter is selected and added to the dictionary, rather than the length of the dictionary

Comment on lines +25 to +26
for char, max_allowed in char_max_dict.items():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The for loop and the use of items here is a direct result of storing the letter and count information in a dict. Even though we know there's only one pair in each dict, we don't have a ready way of getting that specific pair. If we instead stored tuples with the letter and count information as described above, these lines could become

        char, max_allowed = letter_dist[randint(0, len(letter_dist)-1)]

adagrams/game.py Outdated
Comment on lines 27 to 32
random_letter = char
generated_dict[random_letter] = 1
else:
if max_allowed > generated_dict[random_letter]:
generated_dict[random_letter] += 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a dictionary with a value that can track the used tile count is a great idea! But what if we also kept a count of the total tiles so that we knew when we had picked 10 total tiles, as well as how many of each tile had been chosen? Could we use that information to stop looping once we had 10 total tiles?

adagrams/game.py Outdated
generated_dict[random_letter] += 1

return generated_dict.keys()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written, returning the key will end up returning 10 unique tiles (preventing us from getting, say, 3 Es, which should be possible). Could we process the dictionary count information to produce a list of the picked tiiles with the proper number of each that were picked?

adagrams/game.py Outdated
Comment on lines 106 to 109
shortest_len_word = tied_words
winner = shortest_len_word
break

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 This condition is close to what we need, but it could lead to the selection of the incorrect word. The shortest word could appear anywhere in the list, and there could be any number of word lengths from 9 (10 would land in the previous condition) on down. This means that we might need to update the shortest word variable multiple times as we move through the list, but this logic will allow the shortest word to be updated only once before it assumes it has found the winner.

For example, given the "words" "AAAA", "DD", and "F" (all are score 4), the code would find "DD" to be the winner rather than "F" which is shorter.

adagrams/game.py Outdated
if len(ties) > 0:
winning_word = tiebreaker(ties)
winning_score = score_tracker[winning_word]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 If we're in a tiebreaking condition, winning_score should already have been set during the initial tie detection, meaning we shouldn't need to refresh the value here. If this was found to be necessary, then there could be a disconnect between determining the case of a lone winning word versus a tie.

Is it possible that the tiebreaking code code be running even when a singular winner comes along after the tie has been detected , meaning we no longer need to break the tie?

Comment on lines +80 to +82
winning_score = points
winning_word = word_item

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 When we find a word with a new best score, this will still leave the ties list intact. If a better score comes along, then that means anything that we previous thought was a tie is no longer a tie. How can we stop tracking those words that were previously considered to be ties?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can clear out the ties list

adagrams/game.py Outdated

def get_highest_word_score(word_list: list) -> tuple[str, int]:
score_tracker: dict[str, int] = {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code does use the score_tracker to look up a score using the word as a key later on, but I don't think that should be necessary. Instead, we have a data structure that is primarily being used to iterate through some pairs, so this could be a list of nested tuples or lists rather than a dictionary. Prefer to use dictionaries when key-value access is the primary use of the data structure.


# get highest value or add values to ties list
for word_item, points in score_tracker.items():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍Great idea to have this logic focus on looking for scores and leaving the tiebreaking until later.

There are still some tricky points that we need to consider here, especially around building up (and potentially clearing) our list of ties as we continue to find new scores in the data.

We might take this a step further and have one initial loop that just finds the best score, then a second loop that just finds all the words having that specific score (rather than having a changing idea of what the best score is as we iterate). Then finally apply our tiebreaking logic if necessary.

It might feel like an approach that does everything in fewer loops will be more efficient, but our intuition here can be deceiving. We'll begin to see with Big O analysis that code that looks like it has to do more may be equivalent to (or even more efficient than) an all-in-one approach!

Copy link

@anselrognlie anselrognlie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These updates address the possible tiles in draw_letters, but there is still a small issue with get_highest_word_score that can lead to picking the wrong word under very particular conditions. Please take another look at get_highest_word_score.

if max_allowed > generated_dict[char]:
generated_dict[char] += 1
count += 1
generated_hand = [k for k, v in generated_dict.items() for _ in range(v)]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice use of a comprehension to expand the letters and counts out to a list containing the appropriate number of copies of each.

Consider using more descriptive variables in the comprehension, such as

    generated_hand = [letter for letter, count in generated_dict.items() for _ in range(count)]

which would help describe a little bit more what's going on. Further, consider moving that to a helper function and calling it here, since this is a little difficult to read as a comprehension.

# elsewhere
def expand_letter_counts_to_hand(leter_count_dict):
    return [letter for letter, count in leter_count_dict.items() for _ in range(count)]

    # then here
    generated_hand = expand_letter_counts_to_hand(generated_dict)

This tells us what's going on without the need to decipher the comprehension's intent. but if we need the details, we can go look in the function.

adagrams/game.py Outdated
Comment on lines 86 to 87
if len(ties) > 0:
ties.clear

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 As written, this code has no effect on the tallying of the ties when a higher score word is found which still results in possible downstream problems. For instance, if I pass

["AAAAAAAAAA", "BBBBBB", "BBBBBBB", "GGGGGGG"]

for the words, I should get "BBBBBBB" as the winner (score 29), but I get "AAAAAAAAAA" (score 18) as the winner.

The fix for this is small and on this line, but I also encourage you to try tracing through how this particular set of inputs results in a failure.

if len(tied_words) < len(shortest_word):
shortest_word = tied_words
winner = shortest_word
# break

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to remove logic over commenting it out. Commented logic left in code has a tendency to become uncommented as later devs come along and may get confused about whether the code was intentionally commented.

Copy link

@anselrognlie anselrognlie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating get_highest_word_score.

if len(ties) > 0:
ties.clear
ties.clear()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Yep. We needed to call the method. This resets the ties that we're tracking to only reflect the new highest score.

Notice that Python didn't give a squeek about us referencing a method without calling it. It was perfectly content to let us retrieve the method reference, and then basically throw it away without using it. So we need to be really careful of this in general. Adding additional tests (such as for the scenario I pointed out in my previous comment) that depend on the function being properly called can help us notice this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants