After training a model, I tried to use it to calculate the gradient of the loss on test data, but the allocated CUDA memory kept increasing as I iterated through the test loader, until CUDA ran out of memory.
import gpustat
import torch
from torch.autograd import grad

def show_mem_usage(device=1):
    gpu_stats = gpustat.GPUStatCollection.new_query()
    item = gpu_stats.jsonify()["gpus"][device]
    print("Used/total: {}/{}".format(item["memory.used"], item["memory.total"]))

def compute_grad_testdata(model, test_loader):
    model.eval()
    num_batches = 0
    loss = 0
    show_mem_usage()
    for x_test, y_test in test_loader:
        show_mem_usage()
        with torch.cuda.device(CUDA_VISIBLE_DEVICES):
            x_test = x_test.cuda()
            y_test = y_test.cuda()
            score, feature, _ = model(x_test)
            batch_loss = calc_loss(score, y_test)
            # running sum of per-batch losses, each still attached to its graph
            loss += batch_loss
            num_batches += 1
            show_mem_usage()
    model.zero_grad()
    show_mem_usage()
    loss = loss / num_batches
    params = [p for p in model.parameters() if p.requires_grad]
    return list(grad(loss, params, create_graph=True))
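I suspect the growth comes from the running sum `loss += batch_loss`: since each batch_loss is still attached to its computation graph, the sum keeps every batch's activations alive. A minimal sketch of the difference, using a hypothetical stand-in model rather than my real one:

import torch

model = torch.nn.Linear(10, 1).cuda()   # hypothetical stand-in model
loss = 0
for _ in range(100):
    x = torch.randn(64, 10, device="cuda")
    batch_loss = model(x).pow(2).mean()
    loss += batch_loss            # keeps every batch's graph alive -> memory grows
    # loss += batch_loss.item()   # detaching keeps memory flat, but breaks autograd.grad
print(torch.cuda.memory_allocated())  # bytes currently held by tensors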
Memory usage (values in MiB) printed while iterating, climbing by roughly 150-215 MiB per batch:

Used/total: 3362/24268
Used/total: 3362/24268
Used/total: 3366/24268
Used/total: 3370/24268
Used/total: 3376/24268
........
Used/total: 15910/24268
Used/total: 16122/24268
Used/total: 16276/24268
Used/total: 16288/24268
........
RuntimeError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.70 GiB total capacity; 22.07 GiB already allocated; 7.69 MiB free; 22.34 GiB reserved in total by PyTorch)
I know that for testing one usually wraps the loop in "with torch.no_grad()" so that new tensors have requires_grad=False. I tried this, and memory indeed stays flat, but then I cannot use autograd.grad to calculate the gradient. Is there a way to calculate the gradient on test data without the allocated CUDA memory growing? Thanks.
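For reference, here is the workaround I'm considering, as a sketch only: call autograd.grad once per batch and accumulate the detached per-parameter gradients, so at most one batch's graph is alive at a time. This assumes calc_loss, the model's (score, feature, _) return signature, and the imports from above, and it only gives first-order gradients (no create_graph=True), so I'm not sure it covers my case.

def compute_grad_testdata_batched(model, test_loader):
    model.eval()
    params = [p for p in model.parameters() if p.requires_grad]
    grad_sum = [torch.zeros_like(p) for p in params]
    num_batches = 0
    for x_test, y_test in test_loader:
        x_test, y_test = x_test.cuda(), y_test.cuda()
        score, feature, _ = model(x_test)
        batch_loss = calc_loss(score, y_test)
        # per-batch gradients; the graph is freed after this call by default
        batch_grads = grad(batch_loss, params)
        for g_sum, g in zip(grad_sum, batch_grads):
            g_sum += g.detach()
        num_batches += 1
    return [g / num_batches for g in grad_sum]

By linearity, the mean of the per-batch gradients equals the gradient of the mean loss, so this should match what my original function computes for first-order gradients.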