Data Augmentation on the fly

Hi everyone.

I have a training set of 2997 samples, where each sample has size 24x24x24x16. I would like to augment it 24-fold through rotation. Ideally each rotation would be by 90 degrees, so in order to get 23 different samples (the first one is the original) I would have to change the axis of rotation [(0,1), (1,0), (2,0), (0,2)] etc. Six permutations are required.

Everything should happen on the fly, so once a sample is retrieved with __getitem__, I should generate 24 samples.

For now, I don't think any actual augmentation is being done (I adjusted the original version of the code I am working on). I printed some tensors and indeed I don't see any rotation, only the same sample replicated.
Also, I was considering moving everything to tensors instead of working with NumPy arrays, so I need to modify the code anyway.

This is the class in which I perform the augmentation on the fly:

class AugmentedDataGenerator(Dataset):
    # It initializes the instance of the class
    def __init__(self, x, y, aug_count=24):
        # features and labels 
        self.x = x
        self.y = y
        self.aug_count = aug_count

    def __len__(self):
        # multiplication for the augmentation
        return len(self.x) * self.aug_count

    def __getitem__(self, idx):
        # Original sample index (dataset index divided by the augmentation count)
        index = idx // self.aug_count 
        # rotation index will be in the range from 0 to 23. 
        rotation_index = idx % self.aug_count

        # retrieving the sample
        sample_x = self.x[index]
        sample_y = self.y[index]
        aug_x = self._rotate_sample(sample_x, rotation_index)
        aug_y = sample_y

        return aug_x, aug_y

    # This method performs rotation augmentation on the input sample
    def _rotate_sample(self, sample, rotation_index):
        output = np.zeros_like(sample)
        # axes of rotation
        axes = [(1, 2), (0, 2), (0, 1)]
        for axis in axes:
            # rotates the sample
            rotated_sample = np.rot90(sample, k=rotation_index, axes=axis)
            # keep the maximum value across the rotations along the different axes
            output = np.maximum(output, rotated_sample)

        # it holds the sample that has been rotated along all specified axes
        return output

This is the beginning of the training:

 # Loops over each batch in train_loader
    for i, (input, target) in enumerate(train_loader): 
        # Measure data loading time
        data_time.update(time.time() - end)
        # Moves both the input data and the target labels to the CPU or GPU 
        print("batch", i, "and input shape", input.shape)
        input = input.reshape(-1, 16, 24, 24, 24).to(device, non_blocking=True)
        target = target.view(-1, 1).to(device, non_blocking=True)

It would be much easier to augment the data once and save it to disk, but I ran out of memory.

I tried to apply this transformation:

transform = v2.Compose([

by doing this in the __getitem__ method:

# Retrieving a sample from the dataset given its index
    def __getitem__(self, idx):
        if self.transform is not None:
            features = self.x_data[idx]
            label = self.y_data[idx]
            index = idx // self.aug_count 
            rotation_index = idx % self.aug_count
            features = torch.stack([torch.tensor(self.transform(self.x_data[index])) for _ in range(24)], dim=0)
            label = self.y_data[index]
        return features, label

But then each batch has its size multiplied by 24 and the dimensions don't make sense:
Batch 0 - Shape of the input torch.Size([8, 24, 24, 24, 24, 16])

To conclude, I think I'm losing the thread here. I would like to understand these concepts better:

  • how to apply augmentation on the fly properly (in which class / methods)
  • how to deal with the resulting augmented data and the dimensions of the batches

Thank you in advance.

Hi Emma!

Do I understand correctly that each sample is a three-dimensional image with
height = width = depth = 24 (and with 16 channels)?

This makes sense to me – a cube has 24 (non-reflecting) rotations (including
the trivial rotation that doesn’t change anything).

You can count them like this: A cube has six faces, so you have six choices of
which face to rotate to the top. Then you have a choice of four rotations you
can make around the bottom-to-top axis (0, 90, 180, and 270 degrees). Then
six times four gives you 24.
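That six-times-four counting translates directly into code. Here is one way to enumerate all 24 rotations with numpy's rot90() (torch.rot90() has the same semantics); the helper name rotations24 is just for illustration:

```python
import numpy as np

def rotations24(cube):
    """Yield the 24 proper rotations of a 3-d array: move each of the
    six faces to the top, then spin 0/90/180/270 degrees about axis 0."""
    def spins(c):
        # The four rotations about axis 0 (the plane spanned by axes 1 and 2)
        for k in range(4):
            yield np.rot90(c, k, axes=(1, 2))

    yield from spins(cube)                             # original top face up
    yield from spins(np.rot90(cube, 2, axes=(0, 2)))   # bottom face up
    yield from spins(np.rot90(cube, 1, axes=(0, 1)))   # the four side
    yield from spins(np.rot90(cube, -1, axes=(0, 1)))  # faces, brought to
    yield from spins(np.rot90(cube, 1, axes=(0, 2)))   # the top one at
    yield from spins(np.rot90(cube, -1, axes=(0, 2)))  # a time

# An asymmetrically-labelled cube has 24 distinct rotations
cube = np.arange(27).reshape(3, 3, 3)
print(len({r.tobytes() for r in rotations24(cube)}))  # 24
```

The same function works unchanged on a (24, 24, 24, 16) sample, since the rotations only touch the first three (equal-sized, spatial) dimensions.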

Note that (assuming that I understand your use case correctly) you can’t get
all 24 rotations of the cube by rotating around a specific (0, 1)-style axis. You
have three such axes: bottom-to-top, left-to-right, and front-to-back. Around
each axis you have three non-trivial rotations. (The 0-degree rotation is trivial.)
So you have only ten such “around-an-axis” rotations: the trivial rotation plus
nine (three times three) non-trivial rotations.
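You can verify that count directly: applying every k = 0, ..., 3 in each of the three rotation planes to an asymmetrically-labelled cube produces only ten distinct arrays:

```python
import numpy as np

cube = np.arange(27).reshape(3, 3, 3)  # no two entries equal
seen = set()
for axes in [(0, 1), (0, 2), (1, 2)]:
    for k in range(4):
        seen.add(np.rot90(cube, k, axes=axes).tobytes())
print(len(seen))  # 10: the trivial rotation plus 3 planes x 3 turns
```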

I don’t follow your logic here.

This makes sense and is certainly doable (along the general lines sketched in
the code you posted).

Yes, I would recommend writing all of this in pytorch. This will simplify things a
little bit by not using numpy where pytorch can do the same thing. Plus using
pytorch tensors will get you the benefits of a gpu, if you have one.

Yes, rotation_index will run from 0 to 23.

You don’t say what sample_y is. I assume that it is some sort of ground-truth
target. Depending on your use case, you may or may not have to rotate sample_y
along with sample_x.

Here, you are passing rotation_index – which runs up to 23 – to rot90(). This
will only give you four distinct results (for k = 0, 1, 2, 3). (I assume that rot90()
returns the same result for, say, k = 1 and k = 5.) You need a scheme to map your
rotation_index to the 24 distinct rotations of a cube.
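(That assumption is easy to check; numpy's rot90() takes k modulo 4:)

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
# rot90() has only four distinct outputs; k is effectively taken mod 4
print(np.array_equal(np.rot90(a, k=1), np.rot90(a, k=5)))  # True
```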

I don’t understand why you are taking the maximum() here. If this is really what
you want, could you explain your use case in greater detail?

Performing augmentation on the fly makes good sense here. Your general approach
looks sound (although some of the details look wrong to me).


K. Frank


Thank you Frank for your quick and extensive response!

I am going to clarify some aspects because I was hasty in writing my original post.

Yes. Each sample represents a protein-drug complex that has been voxelized into a 3D grid. Each cell of the grid has 16 channels, so in the end each sample has shape (24, 24, 24, 16).

I think what you explained here is practically what I was trying to say, very confusedly, in the next lines (the ones you didn't follow).
I discarded the code I had posted and was working on because, as you said, it was incorrect.
Here is how I rewrote it. It is basic and not optimized, but I needed a better understanding of how the cube is rotated.

def rotate_sample(sample):
    # Initialize output array to store rotated samples
    output = np.zeros((10,) + sample.shape) 
    # Initialize counter to keep track of the current index
    counter = 0
    # Define axes representing different planes of rotation
    axes = [(0, 1), (1, 2), (0, 2)]
    # Save the original sample
    output[counter] = sample
    counter += 1
    # For each rotation plane, apply the 90-, 180-, and 270-degree rotations
    for plane in axes:
        for angle in [90, 180, 270]:
            # Rotate the sample (k=1,2,3)
            rotated_sample = np.rot90(sample, k=angle//90, axes=plane)

            # Save the rotated sample
            output[counter] = rotated_sample
            counter += 1
    # Return the original sample plus its nine rotations
    return output

def aug_data_generator(sample_x, sample_y):
    aug_count = 10 # Original + 9 rotations
    # Arrays for storing the augmented data
    aug_data_x = np.zeros((sample_x.shape[0] * aug_count,) + sample_x.shape[1:])
    aug_data_y = np.zeros((sample_y.shape[0] * aug_count,))
    # Take the given dataset
    for i in range(sample_x.shape[0]):
        # Apply the rotation on each sample
        aug_x = rotate_sample(sample_x[i])
        # Repeat the label
        aug_y = np.repeat(sample_y[i], aug_count)
        # Assign augmented data to the output arrays
        start_index = i * aug_count
        end_index = start_index + aug_count
        aug_data_x[start_index:end_index] = aug_x
        aug_data_y[start_index:end_index] = aug_y

    return aug_data_x, aug_data_y

I decided to keep 10, as you indicated, because I don't understand how to get to 24 different rotations of the cube. In fact, my reference code was not augmenting the data: it was creating 23 copies of each sample.

I have to do everything on the fly because I run out of memory if I try to save the augmented data to disk all at once. There are ways to write a lot of data to disk efficiently, but I need to read up on them because I have never done it before.

Thank you for the suggestion. I will try to use tensors instead of arrays. I have a GPU, so I will also move the tensors to it.

Yes, it is the ground-truth target. It is not rotated, but it is repeated for each augmented sample.

As you can see, I modified these two parts because they weren't clear to me either. The maximum, in particular, was arbitrary.

Apart from these problems with the rotation, I was more concerned about whether to generate the data dynamically or not, and whether there are well-defined patterns I could follow for augmentation. This last concern comes from the fact that I am just beginning to learn about the world of deep learning and PyTorch, so I am trying to keep the code as clean and simple as possible, following the structures found in the PyTorch tutorials and examples.

Thank you a lot!


This is the final version that should work correctly:

class AugmentedDataGeneratorPytorch(Dataset):
    def __init__(self, x, y, aug_count=10):
        self.x = torch.tensor(x, dtype=torch.float32)
        self.y = torch.tensor(y, dtype=torch.long)
        self.aug_count = aug_count

    def __len__(self):
        return len(self.x) * self.aug_count

    def __getitem__(self, idx):
        original_index = idx // self.aug_count
        augmentation_index = idx % self.aug_count
        sample_x = self.x[original_index]
        sample_y = self.y[original_index]
        augmented_sample = self._rotate_sample(sample_x, augmentation_index)
        return augmented_sample, sample_y

    def _rotate_sample(self, sample, augmentation_index):
        if augmentation_index == 0:
            return sample

        axes = [(1, 2), (0, 2), (0, 1)]
        counter = 1
        for plane in axes:
            for angle in [90, 180, 270]:
                if counter == augmentation_index:
                    rotated_sample = torch.rot90(sample, k=angle//90, dims=plane)
                    return rotated_sample
                counter += 1

        raise ValueError("Invalid augmentation index")
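As a sanity check of the indexing scheme, here is the same _rotate_sample logic as a standalone sketch with NumPy's np.rot90 (which has the same semantics as torch.rot90); the ten augmentations of a random sample should all be distinct and keep the sample's shape:

```python
import numpy as np

def rotate_sample(sample, augmentation_index):
    # Mirrors AugmentedDataGeneratorPytorch._rotate_sample with np.rot90
    if augmentation_index == 0:
        return sample
    counter = 1
    for plane in [(1, 2), (0, 2), (0, 1)]:
        for angle in [90, 180, 270]:
            if counter == augmentation_index:
                return np.rot90(sample, k=angle // 90, axes=plane)
            counter += 1
    raise ValueError("Invalid augmentation index")

# All ten augmentations of a random (24, 24, 24, 16) sample should be
# distinct and keep the original shape
sample = np.random.default_rng(0).random((24, 24, 24, 16))
augmented = [rotate_sample(sample, i) for i in range(10)]
assert all(a.shape == (24, 24, 24, 16) for a in augmented)
assert len({a.tobytes() for a in augmented}) == 10
```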

Hi Emma!

Your scheme where you use an indexed Dataset (one that implements
__getitem__(self, idx)), where idx encodes both the actual image index
and also the rotation_index, and then use rotation_index to generate
your rotated image on the fly is fine.

(You could also follow the pytorch transforms approach where you would add
a random-cube-rotation transform to your Dataset. Now each idx would map
to a specific real image – no rotation_index encoded in it – but each time you
fetch an image with the same idx, you would get a different randomly-rotated
version of the same real image. This would be an alternative approach – I tend
to like your scheme better.)

If you then access your AugmentedDataGenerator using a DataLoader with
shuffle = True, the random idx that the DataLoader uses to retrieve an image
will map to a specific (but random) rotation of the “real” image, thus giving you
your desired data augmentation. There’s nothing the matter with this scheme.

It’s perfectly legitimate to augment your data using only 10 of the 24 rotations
of the cube. Nothing will break or give you wrong results. However, there really
are 24 rotations of the cube.

Here’s another way to count them:

A cube has 12 edges. Pick one of those edges and think about rotating the cube
so that that chosen edge becomes the lower front edge. But there are two ways
a given edge can become the lower front edge: it can either run left-to-right or
right-to-left. So there are 12 times 2 = 24 rotations of the cube.

Again, it is not possible to get all 24 rotations of the cube by using a single
rot90(). Consider the three (face-to-face) axes of a cube. (You have labelled
them [(1, 2), (0, 2), (0, 1)].) Any rot90 rotation of the cube leaves one
of the three axes unchanged, while swapping the other two with one another.
This makes it easy to construct a specific example of a cube-rotation for which
it is easy to see that it is not a rot90()-rotation.

Consider a rotation that permutes all three axes:

[(1, 2), (0, 2), (0, 1)] → [(0, 1), (1, 2), (0, 2)]

Such a rotation leaves no axis unchanged, so it can’t be a rot90()-rotation.
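Here is a concrete check of this, assuming an asymmetrically-labelled cube: a cyclic permutation of all three axes is an even permutation, hence a proper rotation (by 120 degrees about a body diagonal of the cube), and it is not among the arrays reachable with a single rot90():

```python
import numpy as np

cube = np.arange(27).reshape(3, 3, 3)  # asymmetric labelling

# Every array reachable with a single rot90() (including the identity)
single_rot90s = {np.rot90(cube, k, axes=ax).tobytes()
                 for ax in [(0, 1), (0, 2), (1, 2)] for k in range(4)}

# A cyclic permutation of the three axes: an even permutation, i.e. a
# proper rotation, 120 degrees about a body diagonal of the cube
diag = np.transpose(cube, (1, 2, 0))

print(diag.tobytes() in single_rot90s)  # False
```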

It is not necessary to use all of the cube rotations to augment your data, but why
not? In for a penny, in for a pound, as they say.


K. Frank

Thank you again Frank.

I will try augmenting with 24 rotations, even if it slows down training.
So basically what I have to do is:

  • Fix one face as the top face
  • Rotate the cube around the vertical axis passing through the top face

So, for example:

  • (0, 1) represents an edge running from left to right.
  • (1, 0) represents the same edge but described from right to left.

Thank you!