import torch
from torch.autograd import Variable
import torchvision
# CPU: input is created with requires_grad=True and stays on the CPU
input = Variable(torch.rand(1, 3, 224, 224), requires_grad=True)
target = Variable(torch.LongTensor([12]))
net = torchvision.models.resnet18(pretrained=True)
predict = net(input)
criterion = torch.nn.CrossEntropyLoss()
loss = criterion(predict, target)
loss.backward()
print(input.grad)  # prints the gradient tensor
# GPU: same steps, with .cuda() called on the input, target and model
input = Variable(torch.rand(1, 3, 224, 224), requires_grad=True).cuda()
target = Variable(torch.LongTensor([12])).cuda()
net = torchvision.models.resnet18(pretrained=True).cuda()
predict = net(input)
criterion = torch.nn.CrossEntropyLoss()
loss = criterion(predict, target)
loss.backward()
print(input.grad)  # prints None
I ran the code above on PyTorch v0.2, but the CPU and GPU versions give different results.
On the CPU, input.grad contains the expected gradients. On the GPU, input.grad is None. The output is below, first from the CPU version and then from the GPU version:
Variable containing:
( 0 , 0 ,.,.) =
-2.4594e-03 -9.2436e-04 -2.2661e-03 ... -2.1439e-04 7.1777e-04 -1.0200e-03
-2.8949e-03 2.3239e-03 5.5778e-03 ... 5.9012e-04 2.5235e-03 -1.2698e-03
-7.9948e-03 -6.1932e-03 6.5822e-03 ... -2.3984e-03 1.3701e-03 -1.5826e-03
... ...
-2.6535e-03 5.0866e-03 9.6745e-03 ... 1.0311e-02 3.6533e-03 2.8187e-03
2.3752e-04 -3.3087e-04 -5.7010e-04 ... -6.2533e-05 4.6445e-04 1.5096e-03
-2.4068e-04 -2.4299e-04 -2.5069e-03 ... 2.5801e-03 2.3278e-03 -5.6533e-04
( 0 , 1 ,.,.) =
-4.9766e-03 -5.9452e-03 -7.1767e-03 ... 2.9989e-03 2.3204e-03 -1.2527e-05
-4.6509e-03 -1.2889e-03 3.8101e-03 ... 5.5863e-03 4.9677e-03 4.5645e-04
-1.2904e-02 -1.2313e-02 4.5021e-03 ... 4.9497e-03 4.0524e-03 1.1087e-03
... ...
6.4054e-03 1.2937e-02 1.7449e-02 ... 1.0382e-02 6.1005e-04 -4.8517e-04
5.3175e-03 4.4367e-03 4.5832e-03 ... -3.8679e-03 -5.8353e-03 -3.5039e-03
4.8981e-03 5.2441e-03 2.7703e-03 ... -7.3002e-05 -1.7121e-03 -3.5452e-03
( 0 , 2 ,.,.) =
-3.6671e-03 -3.8698e-03 -5.9004e-03 ... 1.5832e-03 9.6964e-04 4.3112e-04
-3.9912e-03 -2.6996e-03 -2.1368e-03 ... 2.4672e-03 2.9330e-03 9.5022e-04
-8.2490e-03 -1.0680e-02 -1.7551e-03 ... 2.8387e-03 3.4498e-03 2.0305e-03
... ...
-3.3654e-03 -2.4266e-04 3.1803e-03 ... 2.7013e-03 -2.0120e-04 2.9582e-04
-2.9952e-04 -5.4405e-04 -2.9528e-04 ... -4.5974e-03 -3.2213e-03 -1.6316e-03
8.2849e-04 1.1150e-03 -3.4638e-04 ... 1.1193e-04 -4.8881e-05 -1.3465e-03
[torch.FloatTensor of size 1x3x224x224]
None
What I find hardest to understand is that, in the GPU version, the gradient never seems to propagate back to the input data.
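
One way to probe this, offered as a sketch rather than a confirmed diagnosis: autograd only fills in .grad for leaf tensors, and calling .cuda() on a tensor that requires gradients returns a new, non-leaf tensor. The example below assumes a newer PyTorch (0.4 or later, where Variable and Tensor are merged and factory functions accept device= and requires_grad=), not the v0.2 used above. The names x and y and the tiny 8x8 shapes are illustrative only, and .clone() stands in for .cuda() so the snippet also runs on a machine without a GPU.

```python
import torch

# Fall back to the CPU so the sketch runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Created directly on the target device: this tensor is a leaf.
x = torch.rand(1, 3, 8, 8, device=device, requires_grad=True)

# Result of an operation on a leaf: a non-leaf tensor.
# .clone() is used here as a stand-in for .cuda().
y = torch.rand(1, 3, 8, 8, requires_grad=True).clone()

loss = (x * 2).sum() + (y.to(device) * 3).sum()
loss.backward()

print(x.is_leaf, x.grad is None)
print(y.is_leaf, y.grad is None)
```

If the same pattern applies to the GPU code above, creating the input on the GPU first and only then marking it as requiring gradients would keep it a leaf. Calling y.retain_grad() before backward() is another option in newer PyTorch versions when the gradient of a non-leaf tensor is needed.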