Memory error in batch norm during testing

I am getting a CUDA memory error in a batch norm layer during the testing phase (model.eval()); changing the batch size (even as low as 1) doesn't fix it, and training works fine. Can anyone kindly take a look? The test set size is 64k. Be safe!

File "/home/saby2k13/projects/ctb-ilie/saby2k13/pyDPPI/tune/HyperTuneTwoStreamRPv1/", line 514, in <module>
  File "/home/saby2k13/projects/ctb-ilie/saby2k13/pyDPPI/tune/HyperTuneTwoStreamRPv1/", line 393, in main
    results = test(model.eval(),tst_loader,device)
  File "/home/saby2k13/projects/ctb-ilie/saby2k13/pyDPPI/tune/HyperTuneTwoStreamRPv1/", line 439, in test
    output = net(crop0, crop1).cuda()
  File "/project/6034601/saby2k13/pyDPPI/pyDPPIv2/lib/python3.6/site-packages/torch/nn/modules/", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/saby2k13/projects/ctb-ilie/saby2k13/pyDPPI/tune/HyperTuneTwoStreamRPv1/", line 267, in forward
    output2 = self.forward_once(input2)
  File "/home/saby2k13/projects/ctb-ilie/saby2k13/pyDPPI/tune/HyperTuneTwoStreamRPv1/", line 214, in forward_once
    output = self.bn2(output)
  File "/project/6034601/saby2k13/pyDPPI/pyDPPIv2/lib/python3.6/site-packages/torch/nn/modules/", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/project/6034601/saby2k13/pyDPPI/pyDPPIv2/lib/python3.6/site-packages/torch/nn/modules/", line 81, in forward
    exponential_average_factor, self.eps)
  File "/project/6034601/saby2k13/pyDPPI/pyDPPIv2/lib/python3.6/site-packages/torch/nn/", line 1670, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.75 GiB total capacity; 13.97 GiB already allocated; 3.50 MiB free; 733.67 MiB cached)

Could you check the memory usage on your GPU via nvidia-smi and make sure no other processes are using memory?
Did you change anything else besides calling model.eval(), e.g. the spatial input shapes?