CUDA out of memory when predicting

FourierKJR · December 10, 2017, 6:45pm

Hello everyone. Yesterday I successfully trained a model (VGG-16 fine-tuning) with PyTorch.
But today, when I loaded the model to predict on the test set, it raised this error:

THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
  File "predict.py", line 36, in <module>
    results = model(X_test)
  File "/home/junze/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/junze/anaconda2/lib/python2.7/site-packages/torchvision/models/vgg.py", line 41, in forward
    x = self.features(x)
  File "/home/junze/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/junze/anaconda2/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/junze/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/junze/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 254, in forward
    self.padding, self.dilation, self.groups)
  File "/home/junze/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 52, in conv2d
    return f(input, weight, bias)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:66

I tried a different GPU, but that failed too.
I don't understand why. Training completed successfully and the GPU works fine, but when I use the same model for prediction it reports out of memory.
I would like to know how to solve this problem, and also why it happens.
My English is poor, and I am new to PyTorch :laughing:
I hope you can help me!
Thanks!
Here is the prediction code:

import os

import h5py
import numpy as np
import scipy.io as sio
import torch
from torch.autograd import Variable

os.environ["CUDA_VISIBLE_DEVICES"] = "2"

print('Reading data')
data_set = h5py.File('image_for_test.h5', 'r')
test_data = data_set['X_test'][:]
results = np.zeros((test_data.shape[0], 1))
X_test = torch.from_numpy(test_data).float()
print(type(X_test))
X_test = Variable(X_test).cuda()
#X_test = X_test.type(torch.FloatTensor)
print(type(X_test))

print('Loading model')
model = torch.load('model_for_fine_tune.pkl')

print('Predicting')
results = model(X_test)
temp = np.zeros((test_data.shape[0], 1))
results = results + temp

print('Saving data')
sio.savemat('result_for_test.mat', {'s_label': results})

jdhao · December 11, 2017, 1:49am

There are multiple issues with your code.

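During testing, set the volatile attribute of the Variable X_test to True to save memory. With volatile=True, autograd does not record a graph, so intermediate activations are not kept around for a backward pass that will never run.
Instead of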

X_test = Variable(X_test).cuda()

use

X_test = Variable(X_test.cuda(), volatile=True)
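
For readers on PyTorch 0.4 or later: `volatile` was removed there, and `torch.no_grad()` plays the same role. The posted script also sends the whole test set through the model in a single call, so splitting it into batches may help as well. The following is a minimal sketch under those assumptions, not the original poster's code: `predict_in_batches` is a hypothetical helper, the batch size of 4 is arbitrary, and a tiny stand-in network with random data replaces the fine-tuned VGG-16 so the example runs on CPU.

```python
import numpy as np
import torch
import torch.nn as nn


def predict_in_batches(model, data, batch_size=32, device="cpu"):
    """Run inference on a NumPy array in small chunks.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    model = model.to(device)
    model.eval()  # put dropout / batch norm layers into inference mode
    outputs = []
    with torch.no_grad():  # do not build an autograd graph
        for start in range(0, len(data), batch_size):
            chunk = torch.from_numpy(data[start:start + batch_size]).float()
            out = model(chunk.to(device))
            outputs.append(out.cpu().numpy())  # keep results in host memory
    return np.concatenate(outputs, axis=0)


# Tiny stand-in model and random inputs (assumptions, not the real VGG-16).
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 1),
)
toy_data = np.random.rand(10, 3, 16, 16).astype(np.float32)

preds = predict_in_batches(toy_model, toy_data, batch_size=4)
print(preds.shape)
```

On a GPU machine, passing `device="cuda"` should be the only change needed; only one batch of inputs and activations lives on the device at a time.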

results is a torch Variable. To access the underlying data, use results.data rather than results itself, or you will run into trouble. (Since you are new to torch, I won't go into detail here.)
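
As a rough illustration of that second point in current PyTorch terms (0.4 and later, where Variable and Tensor are merged): model output is usually detached from the graph, moved to the CPU, and converted to a NumPy array before it is combined with other arrays or saved. The tensor below is a made-up stand-in for the model output, not data from the post.

```python
import numpy as np
import torch

# Stand-in for a model output that is still attached to the autograd graph.
results = torch.ones(4, 1, requires_grad=True) * 2.0

# detach() drops the graph, cpu() moves data to host memory,
# numpy() exposes it as a NumPy array.
results_np = results.detach().cpu().numpy()

# A plain array like this can be mixed with other NumPy arrays or
# passed to a function such as scipy.io.savemat.
print(type(results_np), results_np.shape)
```

In the 2017-era API discussed in this thread, `results.data.cpu().numpy()` serves the same purpose.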

Also, please format your post properly using Markdown, and put code in code blocks.