Hey guys, I’m facing an issue with my DataLoader.
For some reason, one of the last entries in the dataset has 14 channels instead of 32. Could this be caused by a problem in my DataLoader?
Here is the Dataset class:
import h5py
from torch.utils.data import Dataset

class LogmelDataset(Dataset):
    def __init__(self, h5_file, feature_type, transform=None):
        # Open the file passed in by the caller instead of a hard-coded path
        self.h5_file = h5py.File(h5_file, "r")
        self.feature_imgs = self.h5_file[feature_type]
        self.transform = transform

    def __len__(self):
        return len(self.feature_imgs)

    def __getitem__(self, idx):
        img = self.feature_imgs[idx]
        # Apply the optional transform to the single sample
        if self.transform:
            img = self.transform(img)
        return img
Here is the model I’m using:
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(32, 48, 3, padding=(1, 2)),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(48, 16, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 48, 2, stride=2),
            nn.ReLU(),
            nn.ConvTranspose2d(48, 32, 2, stride=2),
            nn.ReLU(),
            nn.Conv2d(32, 32, 2, stride=1, padding=(1, 0)),
            nn.ReLU(),
            nn.Conv2d(32, 32, 2, stride=1, padding=(0, 0)),
            nn.Sigmoid()
        )

    def forward(self, X):
        encoded = self.encoder(X)
        decoded = self.decoder(encoded)
        return decoded
I’m trying to use an autoencoder to reconstruct bird-song spectrograms. I’m currently using this dataset: bird song data set | Kaggle
The data loader is used as follows:
import torch

logmel_dataset = LogmelDataset("bird_features.h5", "logmel")
loader = torch.utils.data.DataLoader(
    dataset=logmel_dataset,
    batch_size=32,
    shuffle=True
)
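With `batch_size=32` and the default `drop_last=False`, the final batch holds whatever samples are left over when the dataset length is not a multiple of 32. Here is a minimal sketch of that behaviour. It uses synthetic tensors instead of the real HDF5 file, and the dataset size of 78 and the names `fake_specs`, `fake_dataset` and `loader_full` are made up for illustration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the real data: 78 random "spectrograms" of
# shape (128, 94). 78 is arbitrary, chosen so it is not a multiple of 32.
fake_specs = torch.rand(78, 128, 94)
fake_dataset = TensorDataset(fake_specs)

# Same settings as in the post: the last batch holds the leftover samples.
loader = DataLoader(fake_dataset, batch_size=32, shuffle=True)
batch_shapes = [tuple(batch[0].shape) for batch in loader]
print(batch_shapes)  # [(32, 128, 94), (32, 128, 94), (14, 128, 94)]

# drop_last=True discards the incomplete final batch.
loader_full = DataLoader(fake_dataset, batch_size=32, shuffle=True, drop_last=True)
full_shapes = [tuple(batch[0].shape) for batch in loader_full]
print(full_shapes)  # [(32, 128, 94), (32, 128, 94)]
```

So a smaller last batch is normal `DataLoader` behaviour rather than a sign of corrupted data. `drop_last=True` only hides it by skipping the leftover samples.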
For some reason, training works perfectly but suddenly fails on the last batch. All batches have shape [32, 128, 94], except the last one, which has shape [14, 128, 94].
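One possible reading of these shapes is that the batches have no explicit channel axis, so the first `nn.Conv2d(32, ...)` layer treats the batch dimension as the channel dimension. The sketch below illustrates that with random tensors. It assumes a recent PyTorch version in which `Conv2d` accepts unbatched `(C, H, W)` input, and the names `full_batch` and `last_batch` are illustrative only:

```python
import torch
import torch.nn as nn

# First layer of the encoder from the post.
conv = nn.Conv2d(32, 48, 3, padding=(1, 2))

# A full batch of 32 spectrograms has no explicit channel axis, so the
# layer reads it as a single unbatched image with 32 channels.
full_batch = torch.rand(32, 128, 94)
out = conv(full_batch)
print(out.shape)  # torch.Size([48, 128, 96])

# The 14-sample leftover batch is read as an image with 14 channels,
# which does not match in_channels=32.
last_batch = torch.rand(14, 128, 94)
try:
    conv(last_batch)
    failed = False
except RuntimeError as err:
    failed = True
    print("RuntimeError:", err)
```

If each 128×94 spectrogram is meant to be a single-channel image, one option might be to add a channel axis (for example `x.unsqueeze(1)`, giving `[N, 1, 128, 94]`) and use `in_channels=1` in the first layer. That changes the model, so treat it as a suggestion rather than a confirmed fix.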