I am using a VGG16 model with a few linear layers for a classification task. Due to the complexity of the problem, I had to apply the transform shown in the code below.
```python
import numpy as np
import torch
from torchvision import transforms
from torchvision.transforms import InterpolationMode

class PatchNorm:
    def __call__(self, patch):
        # Keep the first 3 of the 10 channels and reverse their order (BGR -> RGB)
        patch = np.swapaxes(patch[:, :, 0:3], -1, 0)
        patch = np.stack([patch[2, :, :], patch[1, :, :], patch[0, :, :]])
        # Min-max normalize to [0, 255] and quantize
        patch -= patch.min()
        patch /= patch.max()
        patch *= 255
        patch = patch.astype(np.uint8)
        # Back to H x W x C, then to a C x H x W tensor
        patch = np.swapaxes(patch, 0, -1)
        patch = torch.tensor(patch)
        patch = torch.permute(patch, (2, 0, 1))
        # Crop to the central 64 x 64, upsample to 224 x 224, rescale to [0, 1]
        patch = transforms.CenterCrop(size=(64, 64))(patch)
        patch = transforms.Resize(size=(224, 224), interpolation=InterpolationMode.BICUBIC)(patch) / 255
        # ImageNet (VGG16) mean/std normalization
        patch = transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))(patch).to(dtype=torch.float32)
        return patch
```
I noticed that running this model on a Tesla V100 was almost as fast as running it on the CPU (an Intel Xeon Silver 4215). I changed `num_workers` in the DataLoader to 8 and saw a significant speed-up (from 1.5 hours to 14 minutes), but with a higher `num_workers` the run became a bit laggy.
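For reference, the `num_workers` change described above amounts to something like the following sketch (the dataset, tensor sizes, and batch size here are placeholders, not taken from the original post):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset standing in for the real 10-channel .tiff patch dataset
dataset = TensorDataset(torch.randn(256, 10, 64, 64), torch.randint(0, 2, (256,)))

# num_workers > 0 runs the (CPU-bound) transform in background worker processes,
# so the GPU is not starved waiting for each preprocessed batch.
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=8)

for batch, labels in loader:
    pass  # the training step would go here
```

With `num_workers=0`, all preprocessing runs in the main process between GPU steps, which matches the symptom of a V100 being no faster than the CPU.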
I was wondering whether a transform like the one described above could be the cause of the slowdown? Thank you.
P.S.: Regarding the transform: I have a .tiff image with 10 channels, of which I only wish to take 3, reorder them, min-max normalize, multiply by 255, center-crop to use only 64×64 pixels, divide by 255 (unnecessary, I know), upsample to 224×224, normalize with VGG16's mean and std, reorder the channels, and return.