Dataset from Kaggle not having the expected structure

Hi, I am new to PyTorch and I am trying to use a Kaggle dataset to train a neural network.

The dataset is a classic one (dogs vs cats) and can be found here: https://www.kaggle.com/c/dogs-vs-cats

However, when I download it and run:

import torch
from torchvision import datasets, transforms

data_dir = 'cat_dog_data/train'

# Resize, center-crop to 224x224, then convert to a tensor
transform = transforms.Compose([transforms.Resize(255),
                                transforms.CenterCrop(224),
                                transforms.ToTensor()])
dataset = datasets.ImageFolder(data_dir, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

I get the following error:

RuntimeError: Found 0 files in subfolders of: PATHTOTHEFOLDER/intro-to-pytorch/cat_dog_data/train
Supported extensions are: .jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif,.tiff,.webp

If I remove /train from data_dir, the error goes away. However, I think that would be wrong, presumably because ImageFolder would then treat the train folder itself as a class. The structure I should have is:

cat_dog_data/train/dogs
cat_dog_data/train/cats

Instead, I have something like this:

cat_dog_data/train/cat.13.jpg
cat_dog_data/train/cat.11.jpg
...
cat_dog_data/train/dog.1.jpg
cat_dog_data/train/dog.21.jpg
...

Should I manually create the dogs and cats folders and strip the dog. and cat. prefixes from the .jpg filenames? This seems weird, but I don’t understand how to proceed otherwise.

Thanks for your help.

It was as I thought: the dataset that I downloaded was not structured the way ImageFolder expects.

I found this out from this issue, which also links to a correctly structured dataset.
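
For anyone who would rather keep the original Kaggle download, the flat train folder can also be sorted into one subfolder per class. Below is a minimal sketch, assuming every image is named like cat.13.jpg or dog.1.jpg as in the listing above. The function name sort_into_class_folders is my own and not part of torchvision. The filenames do not need to be changed, since ImageFolder takes the label from the folder name only.

```python
import shutil
from pathlib import Path


def sort_into_class_folders(train_dir):
    """Move files such as 'cat.13.jpg' into train_dir/cat/ and 'dog.1.jpg' into train_dir/dog/.

    Hypothetical helper (not a torchvision function). Returns the number of files moved.
    """
    train_dir = Path(train_dir)
    moved = 0
    # sorted() builds the full listing up front, so creating the class
    # folders while looping does not affect the iteration.
    for path in sorted(train_dir.iterdir()):
        # Skip subfolders (including ones created on an earlier run) and non-JPEG files.
        if not path.is_file() or path.suffix.lower() != ".jpg":
            continue
        label = path.name.split(".")[0]  # assumed naming: '<label>.<number>.jpg'
        class_dir = train_dir / label
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(class_dir / path.name))
        moved += 1
    return moved
```

After calling sort_into_class_folders('cat_dog_data/train') once, the folder should contain cat/ and dog/ subfolders (singular, because the names come from the filename prefix), and the ImageFolder call from my question should find the images.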

I hope this will help someone else in my situation.