I just started using PyTorch and have been going through a lot of tutorials/documentation recently.
I think I’ve found an error in a few places where you calculate stats like accuracy and loss.
To calculate most stats, you divide a running metric by the size of the dataset. There are three common ways I've found online to get that size, and in my experimentation so far, each of them has yielded incorrect results (see the sketch after this list):
- len(data_loader) - this returns the number of batches, not the number of samples, so with a batch size greater than 1 it's not what we're looking for
- len(data_loader.dataset) - this works in some cases, but once you use a sampler to build a DataLoader over a smaller subset of your data it breaks down, since it still returns the full dataset's size rather than the size of the subset actually being used
- len(data_loader) * batch_size - this gives a slightly-off number, because PyTorch shrinks the final batch when the dataset size doesn't divide evenly by the batch size (i.e. the mod isn't 0 and drop_last is False)
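To make the difference concrete, here's a minimal sketch; the dataset size (100), batch size (32), and subset indices (the first 60) are made up for illustration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, SubsetRandomSampler

# 100 samples total, but train on a 60-sample subset via a sampler
dataset = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))
sampler = SubsetRandomSampler(range(60))
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

print(len(loader))          # 2   -- number of batches (ceil(60 / 32)), not a sample count
print(len(loader.dataset))  # 100 -- full dataset size, ignores the sampler's subset
print(len(loader) * 32)     # 64  -- overshoots, since the last batch only has 28 samples
print(len(loader.sampler))  # 60  -- the number of samples actually iterated over
```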
What I've found, though, is that len(data_loader.sampler) returns precisely what you're looking for: the number of samples the loader actually iterates over (i.e. the length of the training/validation split by itself). Just wanted to point this out, as I spent a huge amount of time figuring out why my loss/accuracy values were always incredibly small.
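Here's a rough sketch of how I compute epoch stats now; `model`, `criterion`, and `device` are placeholders, not code from the tutorials:

```python
import torch

def evaluate(model, loader, criterion, device="cpu"):
    model.eval()
    running_loss, running_correct = 0.0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            # Weight the mean batch loss by the batch size so a smaller
            # final batch is accounted for correctly
            running_loss += loss.item() * inputs.size(0)
            running_correct += (outputs.argmax(dim=1) == labels).sum().item()
    n = len(loader.sampler)  # samples actually drawn, even with a subset sampler
    return running_loss / n, running_correct / n
```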
I’m willing to show my code if needed.
https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html - first case
https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html - second case