I have started some experiments using the ImageNet example in the PyTorch examples distribution (branch 0.3.1). I downloaded and processed the data as instructed in the soumith/imagenet-multiGPU.torch repository on GitHub (unpacking the many folders inside ILSVRC2012_img_train.tar and running valprep.sh).
When training the model (e.g. AlexNet), a few times per epoch I will see warnings like:
(…) /pylocal/lib/python2.7/site-packages/PIL/TiffImagePlugin.py:756: UserWarning: Corrupt EXIF data. Expecting to read 4 bytes but only got 0.
warnings.warn(str(msg))
(…) /pylocal/lib/python2.7/site-packages/PIL/TiffImagePlugin.py:739: UserWarning: Possibly corrupt EXIF data. Expecting to read 2555904 bytes but only got 0. Skipping tag 0
" Skipping tag %s" % (size, len(data), tag))
Is this to be expected (i.e. some of the ImageNet files just have bad EXIF data, and this shouldn't interfere with training)? Or does it suggest that my dataset is corrupt?
@ttb I have encountered the same problem with the ImageNet data, although I am training the model in TensorFlow. So this seems to be a problem with the data itself, maybe. I would also like to know whether it will interfere with training, because my training sometimes gets stuck and I don't see any progress.
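import glob
import os

import piexif

# glob does not expand "~", so expand it explicitly.
# Note: recursive "**" matching requires Python 3.5+.
root = os.path.expanduser('~/ImageNet')
nfiles = 0
for filename in glob.iglob(os.path.join(root, '**', '*.JPEG'), recursive=True):
    nfiles += 1
    print("About to process file %d, which is %s." % (nfiles, filename))
    # Strip the EXIF block from the file in place (this modifies the dataset).
    piexif.remove(filename)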
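If you would rather not modify the files, a non-destructive alternative is to filter out just these two warnings with Python's standard `warnings` module. This is only a sketch: the helper name `silence_exif_warnings` is made up for illustration, and the pattern is based solely on the two messages quoted above, so other PIL versions may word them differently. It hides the messages; it does not tell you whether the pixel data itself is fine.

```python
import warnings


def silence_exif_warnings():
    """Ignore the two EXIF-related UserWarnings quoted in this thread.

    Hypothetical helper. The regex is matched (case-insensitively) against
    the start of the warning text, so it covers both
    "Corrupt EXIF data ..." and "Possibly corrupt EXIF data ...".
    """
    warnings.filterwarnings(
        "ignore",
        message=r"(Possibly )?corrupt EXIF data",
        category=UserWarning,
    )


# Call once near the top of the training script, before loading any images.
silence_exif_warnings()
```

All other warnings are still reported as usual. If your data loader uses worker processes that do not inherit the parent's warning filters, you may need to call the helper inside each worker as well.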