Hello guys. I’ve watched many YouTube tutorials and read many posts, but I can’t figure out how to load my data properly. I’m working with the Driving Stereo Dataset. This is the structure of the dataset folder.
When I load it with ImageFolder, it comes back as an “array” with 11104 entries.
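import torchvision
from torchvision import transforms

train_path = './train'
# Resize, convert to a float tensor, then reduce to a single channel
transformations = transforms.Compose([
    transforms.Resize((400, 800)),
    transforms.ToTensor(),
    transforms.Grayscale(1)])
train_dataset = torchvision.datasets.ImageFolder(root=train_path, transform=transformations)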
When I print some info about the data, I get these results:
print(len(train_dataset))
print(len(train_dataset[0]))
print(len(train_dataset[0][0]))
print(train_dataset[0][0].type())
print(train_dataset[0][0].shape)
11104
2
1
torch.FloatTensor
torch.Size([1, 400, 800])
The images that belong together (left, right, disparity map, depth map) sit at these positions, 2776 apart:
import matplotlib.pyplot as plt
left_image = train_dataset[0][0].clone().detach()
plt.figure(figsize=(5,10))
plt.imshow(left_image[0],cmap='gray')
right_image = train_dataset[2776][0].clone().detach()
plt.figure(figsize=(5,10))
plt.imshow(right_image[0],cmap='gray')
disparity_map = train_dataset[5552][0].clone().detach()
plt.figure(figsize=(5,10))
plt.imshow(disparity_map[0],cmap='gray')
depth_map = train_dataset[8328][0].clone().detach()
plt.figure(figsize=(5,10))
plt.imshow(depth_map[0],cmap='gray')
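As far as I understand, these positions come from ImageFolder walking the subfolders in sorted order and assigning each one a contiguous block of indices, so 11104 / 4 = 2776 images per folder. A small sketch of how the block starts could be read off instead of hardcoded. The helper name `class_block_starts` is made up, and the plain list below is only a stand-in for `train_dataset.targets`, assuming all four folders hold the same number of files:

```python
def class_block_starts(targets):
    """Return {class_index: first position} for a list of class indices.

    Assumes the list is grouped by class, as ImageFolder's `.targets` is.
    """
    starts = {}
    for position, class_index in enumerate(targets):
        # Only the first occurrence of each class index is recorded
        starts.setdefault(class_index, position)
    return starts


# Stand-in for train_dataset.targets: 4 folders with 2776 images each (assumption)
targets = [c for c in range(4) for _ in range(2776)]
starts = class_block_starts(targets)
print(starts)  # {0: 0, 1: 2776, 2: 5552, 3: 8328}
```

Which folder ends up as class 0, 1, 2 or 3 depends on the alphabetical order of the folder names, so the mapping should be checked against `train_dataset.class_to_idx`.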
How can I batch this data, using the disparity map or the depth map as the “label”, so that it serves as the expected output of a network that receives the left and right images as input?
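To make the question concrete, this is roughly the pairing I have in mind. It is only a sketch: the class name `StereoPairs` and its parameters are made up, it assumes every folder holds the same number of images in matching order (left block first, right block second, as in the positions above), and the list `fake` is a stand-in for `train_dataset` so the indexing can be checked without the real files:

```python
class StereoPairs:
    """Map-style dataset wrapper: item i is ((left, right), target)."""

    def __init__(self, base, per_folder, target_block=2):
        self.base = base                  # e.g. the ImageFolder dataset
        self.per_folder = per_folder      # images per subfolder (2776 here)
        self.target_block = target_block  # 2 = disparity block, 3 = depth block (assumed order)

    def __len__(self):
        # One sample per stereo pair, not per image file
        return self.per_folder

    def __getitem__(self, i):
        if not 0 <= i < self.per_folder:
            raise IndexError(i)
        # [0] drops the folder index that ImageFolder returns as the "label"
        left = self.base[i][0]
        right = self.base[self.per_folder + i][0]
        target = self.base[self.target_block * self.per_folder + i][0]
        return (left, right), target


# Stand-in for train_dataset: 4 blocks of 3 items, each item is (image, class_index)
kinds = ["left", "right", "disp", "depth"]
fake = [(f"{kind}{i}", c) for c, kind in enumerate(kinds) for i in range(3)]

pairs = StereoPairs(fake, per_folder=3)
sample = pairs[1]
print(sample)  # (('left1', 'right1'), 'disp1')
```

If this is a reasonable direction, I would expect `torch.utils.data.DataLoader(StereoPairs(train_dataset, 2776), batch_size=8, shuffle=True)` to yield batches with the same nesting (left and right batches, then the target batch), but I have not verified that on the real data.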
Here is the link to my notebook:
https://colab.research.google.com/drive/107f8365tHXOZDPp6z6O7QxZF-oVCRShn?usp=sharing
Thank you so much!