Hi,
Can anyone explain the following behavior?
Case 1: forward a minibatch of 32 — GPU memory required: 3 GB

```python
import torch

model = create_model().cuda()
model.eval()
x = torch.rand(32, 3, 224, 224).cuda()
with torch.no_grad():
    model(x)  # executing just this line takes ~3 GB
```
Case 2: forward a single sample — GPU memory required: 11 GB

```python
model = create_model().cuda()
model.eval()
x = torch.rand(32, 3, 224, 224).cuda()
with torch.no_grad():
    model(x[0].unsqueeze(0))  # executing only this line takes ~11 GB
```
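In case it helps reproduce the comparison, this is roughly how I measure peak memory for each forward pass — a minimal sketch assuming a CUDA device is available (`create_model` is my own helper, not shown):

```python
import torch

def peak_mem_gb(model, inp):
    """Run one no-grad forward under autocast and return peak GPU memory in GB.

    Sketch only: assumes `model` and `inp` already live on a CUDA device.
    """
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    with torch.no_grad(), torch.cuda.amp.autocast(enabled=True):
        model(inp)
    torch.cuda.synchronize()  # make sure the forward has finished before reading stats
    return torch.cuda.max_memory_allocated() / 1024 ** 3
```

So the two numbers above come from `peak_mem_gb(model, x)` (3 GB) versus `peak_mem_gb(model, x[0].unsqueeze(0))` (11 GB).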
What could possibly cause this?
Environment: torch 1.10.0. The forward pass is executed inside `torch.cuda.amp.autocast(enabled=True)`.
Thanks!