Skip to content

Conversation

gpetho
Copy link

@gpetho gpetho commented Jul 30, 2024

There is a bug in the score() method's code: A batch_size attribute can be specified for the BERTScorer object, but this setting is ignored by the score() method currently. In fact this attribute of the BERTScorer object has no effect at all.

The proposed fix is that if a batch_size argument is not specified explicitly for the score() method, then the value of the BERTScorer object's batch_size attribute is used for calculating the embeddings.

Another possibility would be to remove the batch_size attribute and, along with it, the batch_size parameter of the BERTScorer class and its initializer method respectively, just like it was apparently done for verbose and return_hash earlier (I fixed the docstrings and removed these outdated parameters), but this would modify the interface of the class and thus break backward compatibility, so I'd rather not do that. I believe it's better to fix the bug instead.

Note: This bug is no joke. I just lost some 3 or 4 hours of time because of it, trying to understand why BERTScore keeps running out of GPU RAM despite my having set batch size to 1, before it occurred to me that I should check if there's maybe some bug in the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant