Insufficient GPU memory due to a difference in memory size between the training machine and the deployment machine

No, you shouldn’t use it during training, since it disables gradient calculation, as previously explained. During inference, wrap the entire forward pass of the model in the guard:

import torch

# inference: disable gradient tracking to save memory and compute
model.eval()             # switch layers such as dropout/batchnorm to eval mode
with torch.no_grad():    # no autograd graph is built inside this block
    out = model(x)
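
For contrast, here is a minimal sketch (using a hypothetical single-layer model and random inputs, since the thread doesn’t show the actual model) of why the guard belongs only in the inference path: during training the forward pass must build the autograd graph so that backward() can run, while under torch.no_grad() the output carries no gradient history.

import torch
import torch.nn as nn

model = nn.Linear(10, 2)     # hypothetical tiny model
x = torch.randn(4, 10)       # hypothetical input batch

# training: gradients must be tracked, so no guard here
model.train()
out = model(x)
print(out.requires_grad)     # True -> backward() will populate .grad fields
out.sum().backward()

# inference: the guard skips graph construction entirely
model.eval()
with torch.no_grad():
    out = model(x)
print(out.requires_grad)     # False -> calling backward() here would raise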