The course overall is amazing, thanks for making it. I particularly liked the emphasis on constantly visualizing everything.
I did, however, find one bug that messes up the interpretation of what is happening with the model quite a bit.
The bug
In the video notebook, you aggregate accuracy as follows:
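(A minimal self-contained sketch of the pattern, not the notebook's exact code; the toy data, the linear model and the hyperparameters below are made-up stand-ins, the aggregation at the end is the part in question.)

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(42)
X_all = torch.randn(225, 10)                      # 225 samples -> 7 batches of 32 + 1 leftover
y_all = torch.randint(0, 3, (225,))
dataloader = DataLoader(TensorDataset(X_all, y_all), batch_size=32)

model = nn.Linear(10, 3)
loss_fn = nn.CrossEntropyLoss()                   # reduction="mean" by default
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

train_loss, train_acc = 0.0, 0.0
for X, y in dataloader:
    y_pred = model(X)
    loss = loss_fn(y_pred, y)                     # mean over however many samples this batch has

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    train_loss += loss.item()
    train_acc += (y_pred.argmax(dim=1) == y).sum().item() / len(y)

# Plain average over batches: the 1-image batch counts as much as a 32-image one
train_loss /= len(dataloader)
train_acc /= len(dataloader)
```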
The problem with this approach is that the last batch is smaller than other batches, yet, with this calculation, weighs just as much in the total accuracy.
Specifically:

- For the training dataset, there are 7 batches of 32 images and one batch with just 1 image. That one image contributes 32x more to the score than any other image.
- For the testing dataset, there are 3 batches: 2 of 32 images and one with 11. Those 11 images contribute about 3x what they should.
The same considerations apply to the loss, since the default reduction for CrossEntropyLoss() in PyTorch, according to the documentation, is mean.
In addition to the obvious - "the numbers are wrong" - this also makes loss and accuracy curves look much more all-over-the-place than they should.
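To see the loss part concretely, here is a quick standalone check of the default reduction behaviour (the tensors are random stand-ins, not real data):

```python
import torch
from torch import nn

loss_fn = nn.CrossEntropyLoss()                 # reduction="mean" by default

big_logits = torch.randn(32, 3)                 # a full batch of 32 samples
big_targets = torch.randint(0, 3, (32,))
tiny_logits = torch.randn(1, 3)                 # the leftover batch of 1 sample
tiny_targets = torch.randint(0, 3, (1,))

big_loss = loss_fn(big_logits, big_targets)     # mean over 32 samples
tiny_loss = loss_fn(tiny_logits, tiny_targets)  # "mean" over a single sample

# Summing per-batch means and dividing by the batch count gives the single
# leftover sample the same influence as all 32 samples of the full batch.
naive_epoch_loss = (big_loss + tiny_loss) / 2
```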
Suggested fix: weighted sum, where each batch contribution is multiplied by len(y) / batch_size.
Then in the end, when averaging, account for the fact that the last batch had a lower weight:
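(A sketch of the whole adjusted step, wrapped in a hypothetical train_step-style function and reusing the toy setup from the sketch above; BATCH_SIZE stands for the DataLoader's batch_size, 32 here.)

```python
BATCH_SIZE = 32  # the DataLoader's batch_size

def train_step(model, dataloader, loss_fn, optimizer):
    train_loss, train_acc = 0.0, 0.0
    weight = 1.0  # weight of the current batch; only the last one can differ from 1

    for X, y in dataloader:
        y_pred = model(X)
        loss = loss_fn(y_pred, y)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Weight each batch by how many samples it actually contains
        weight = len(y) / BATCH_SIZE
        train_loss += loss.item() * weight
        train_acc += ((y_pred.argmax(dim=1) == y).sum().item() / len(y)) * weight

    # The full batches each contributed weight 1 and the last one contributed `weight`,
    # so the effective number of batches is len(dataloader) - 1 + weight
    dl_len = len(dataloader) - 1.0 + weight
    # Adjust metrics to get average loss and average accuracy per batch
    return train_loss / dl_len, train_acc / dl_len

train_loss, train_acc = train_step(model, dataloader, loss_fn, optimizer)
```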
Numbers "before" vs "after" demo:
Curve for 50 epochs before:
Curve for 50 epochs after:
I'd argue that this materially changes the interpretation of what is happening to the model here.
I made sure to use a reproducible setup, on CPU, and triple-checked that the runs really are identical (that is, if I train the model twice and use the same aggregation for those numbers, the curves come out exactly the same), so the before/after difference is down to the aggregation alone.
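For reference, the kind of reproducibility setup I mean is just seeding everything before each run (a sketch; the exact seed value is arbitrary):

```python
import torch

torch.manual_seed(42)  # seed weight init and any other torch RNG usage
# Everything stays on CPU, which avoids nondeterministic GPU kernels.
# If the DataLoader shuffles, give it a seeded generator as well:
# DataLoader(dataset, batch_size=32, shuffle=True,
#            generator=torch.Generator().manual_seed(42))
```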