I created my first CNN using Microsoft’s Dogs vs Cats dataset. Both times I’ve run it, training stopped because of a corrupt image. I read the ImageFolder docs and see that is_valid_file takes a function, but I have no idea how to write a function that checks whether a file is valid. Can someone give me code that does that? Thanks.
Once you’ve created the Dataset, you could iterate it and use the index (and thus file path) to either fix the file (re-download, recreate) or remove it from the folder.
Also, what error message are you getting?
I have a question: why, in torchvision 0.2.1, do I get the error “__init__() got an unexpected keyword argument ‘is_valid_file’” when I supply an is_valid_file function? I can see from the docs that the argument exists.
torchvision 0.2.1 was released on April 24, 2018 based on the release notes, while is_valid_file was added on April 25, 2019 in this PR (so a year later).
Where did you find the docs which mention it in 0.2.1?
My bad. I had 0.2.1 on my system all this time and thought it already included that argument. I just reinstalled pytorch and torchvision, and there is no problem anymore.
Thanks for your time.
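For anyone hitting the same “unexpected keyword argument” error, one way to check whether the installed version supports an argument is to inspect the callable's signature instead of relying on the online docs. This is only a minimal sketch; accepts_keyword is a hypothetical helper name, not part of torchvision.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with `name` as a keyword argument."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # A **kwargs catch-all also accepts arbitrary keyword names.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
```

With torchvision imported, accepts_keyword(torchvision.datasets.ImageFolder, "is_valid_file") should tell you whether your installed version knows the argument; printing torchvision.__version__ shows which release you actually have.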
What does that mean? Can you show some sample code?
If I iterate over my dataset, I already get the error (caused by a corrupt JPG file in my image folder) before I can do anything in the for loop.
That’s the idea, to debug the issue further. Once you’ve isolated the index where the dataset fails, check the corresponding file.
A code snippet would be:
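for idx in range(len(dataset)):
    try:
        # Indexing loads and decodes the sample, so a corrupt file raises here
        batch = dataset[idx]
    except Exception as e:
        # Report the failing index and the error, then check the corresponding file
        print(idx, e)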
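If you would rather skip corrupt files than remove them, a function for is_valid_file could look roughly like the sketch below. It assumes a torchvision version that supports is_valid_file and uses Pillow (which torchvision already depends on) to try a full decode, much like the default loader does. The name is_valid_image is made up for illustration, and decoding every file makes dataset creation slower.

```python
from PIL import Image  # Pillow is already a torchvision dependency


def is_valid_image(path: str) -> bool:
    """Return True if Pillow can fully decode the file at `path`."""
    try:
        with Image.open(path) as img:
            # Force a full decode; empty, corrupt, or truncated files raise here.
            img.convert("RGB")
        return True
    except Exception:
        # Treat any decoding failure as "not a valid image".
        return False
```

You would then pass it when building the dataset, e.g. ImageFolder(root, is_valid_file=is_valid_image). Note that when is_valid_file is given, ImageFolder no longer filters by file extension, so the function is called on every file in the class folders.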