Deployment in FP16? Calling model.half() only reduces memory by 7%

I’m using the standard ResNet50 from torchvision, with the final FC layer replaced by a small 512→128→1 head. Calling model.half() only brings GPU usage down from 940 MB to 870 MB. Shouldn’t there be a more significant reduction in memory usage?
I call torch.cuda.empty_cache() after initializing the model. I’ve set torch.backends.cudnn.benchmark = True as well.

Same behavior on a Tesla T4 and a 2080 Ti.


model = torchvision.models.resnet50(
                pretrained=False, progress=False)
model.fc = torch.nn.Sequential(
                torch.nn.Linear(model.fc.in_features, 512),
                torch.nn.Linear(512, 128),
                torch.nn.Linear(128, 1),
)
## section to load weights ##
if use_fp16:
    model = model.half()

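As a sanity check on the numbers: ResNet50 has roughly 25.6M parameters, i.e. about 100 MB in FP32, so half() can only shave off on the order of 50 MB of weight memory; the rest of the 940 MB is presumably CUDA context and workspace, which half() doesn’t touch. A minimal sketch to measure parameter memory directly (`param_bytes` is my own helper, and I use a toy head standing in for the full model):

```python
import torch

def param_bytes(model: torch.nn.Module) -> int:
    # Total bytes occupied by all parameters (weights + biases).
    return sum(p.numel() * p.element_size() for p in model.parameters())

# Toy stand-in for the FC head above; the full ResNet50 would be ~100 MB in FP32.
head = torch.nn.Sequential(
    torch.nn.Linear(2048, 512),
    torch.nn.Linear(512, 128),
    torch.nn.Linear(128, 1),
)
fp32 = param_bytes(head)         # 4 bytes per parameter
fp16 = param_bytes(head.half())  # 2 bytes per parameter
print(fp32, fp16)  # → 4459524 2229762
```

Running the same helper on the whole model shows that FP16 halves parameter memory exactly, which is consistent with a drop of well under 100 MB in total GPU usage.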
As far as I know, you should use TensorRT to get a real speedup and to reduce GPU memory.
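To see how much of the footprint model.half() can even touch, it helps to compare PyTorch’s allocator counters against what nvidia-smi reports; the gap is mostly the CUDA context, which no dtype change shrinks. A rough sketch (assumes a CUDA device is present; falls through on CPU-only machines):

```python
import torch

if torch.cuda.is_available():
    torch.zeros(1, device="cuda")  # first CUDA op creates the context
    alloc = torch.cuda.memory_allocated()   # bytes held by live tensors
    cached = torch.cuda.memory_reserved()   # bytes cached by PyTorch's allocator
    print(f"tensors: {alloc} B, allocator cache: {cached} B")
    # nvidia-smi shows tensors + cache + CUDA context; only the tensor
    # part shrinks when the model is converted with .half().
else:
    print("no CUDA device available")
```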
