I’m just starting with PyTorch (coming from Theano) and it’s awesome!
I posted this as an issue on GitHub (https://github.com/pytorch/pytorch/issues/4742), but maybe it’s just me …
I put together a linear regressor with a single hidden layer. It works fine, but it crashes when I try to push it to CUDA. There may be plenty wrong with the code, but I don’t know where to look. Here’s the model and the line that crashes it:
import numpy as np
import torch
import torch.nn.functional as F
from torch.nn import Parameter


class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.n_feature = n_feature
        self.n_hidden = n_hidden
        self.n_output = n_output
        self.wh = Parameter(torch.Tensor(n_feature, n_hidden))
        self.bh = Parameter(torch.Tensor(n_hidden))
        self.wy = Parameter(torch.Tensor(n_hidden, n_output))
        self.by = Parameter(torch.Tensor(n_output))
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / np.sqrt(self.wh.size(1))
        self.wh.data.uniform_(-stdv, stdv)
        self.bh.data.uniform_(-stdv, stdv)
        stdv = 1. / np.sqrt(self.wy.size(1))
        self.wy.data.uniform_(-stdv, stdv)
        self.by.data.uniform_(-stdv, stdv)

    def forward(self, x):
        h = x.mm(self.wh) + self.bh
        a = F.logsigmoid(h)  # activation function for hidden layer
        y = a.mm(self.wy) + self.by
        return y


net = Net(n_feature=1, n_hidden=10, n_output=1).cuda()
The traceback I get is:
Traceback (most recent call last):
  File "<ipython-input-137-44e8bf8308ac>", line 1, in <module>
    net = Net(n_feature=1, n_hidden=10, n_output=1).cuda() # define the network
  File "~/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 216, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "~/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 152, in _apply
    param.data = fn(param.data)
  File "~/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 216, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "~/miniconda3/lib/python3.6/site-packages/torch/_utils.py", line 69, in _cuda
    return new_type(self.size()).copy_(self, async)
RuntimeError: cuda runtime error (4) : unspecified launch failure at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCTensorCopy.c:20
My guess was that it’s because the layers aren’t defined with nn or functional, but I don’t know. If that is the cause, what is a good way to mix hand-written equations with the available layers?
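To make the "mixing" question concrete, here is a minimal sketch of one way it could look: the hidden layer is written out as an equation over hand-registered parameters, and the output layer uses the built-in nn.Linear. The class name MixedNet is made up for illustration, and the sketch assumes a recent PyTorch release (torch.empty, torch.no_grad and .to(device) are not available in very old versions). It falls back to the CPU when CUDA is not available, and it is not meant as a fix for the crash above.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class MixedNet(nn.Module):
    """Hypothetical variant of Net: hand-written hidden layer, built-in output layer."""

    def __init__(self, n_feature, n_hidden, n_output):
        super().__init__()
        # Hand-written hidden layer: wrapping tensors in nn.Parameter registers them
        # with the module, so .to(device) / .cuda() moves them like any built-in layer.
        self.wh = nn.Parameter(torch.empty(n_feature, n_hidden))
        self.bh = nn.Parameter(torch.empty(n_hidden))
        # Built-in output layer; it initializes its own weight and bias.
        self.out = nn.Linear(n_hidden, n_output)
        self.reset_parameters()

    def reset_parameters(self):
        # Same uniform initialization as in the original model, for the custom parameters only.
        stdv = 1.0 / math.sqrt(self.wh.size(1))
        with torch.no_grad():
            self.wh.uniform_(-stdv, stdv)
            self.bh.uniform_(-stdv, stdv)

    def forward(self, x):
        h = x.mm(self.wh) + self.bh   # hand-written equation
        a = F.logsigmoid(h)           # activation function for hidden layer
        return self.out(a)            # built-in layer


# Assumption: use the GPU only if one is visible, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = MixedNet(n_feature=1, n_hidden=10, n_output=1).to(device)
x = torch.linspace(-1, 1, 5, device=device).unsqueeze(1)  # 5 samples, 1 feature
y = net(x)
```

In this sketch, net.parameters() yields both the two hand-written parameters and the weight and bias of nn.Linear, so an optimizer would treat them the same way.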
I also tried:
dtype = torch.cuda.FloatTensor
self.wh = Parameter(torch.Tensor(n_feature, n_hidden).type(dtype))
self.bh = Parameter(torch.Tensor(n_hidden).type(dtype))
...
But this gives the same error.
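Since the traceback ends inside the copy to the GPU and both variants fail the same way, one way to narrow this down might be to check whether a plain tensor, with no Module involved, can be moved to the GPU at all. The following is only a sketch under that assumption; check_cuda_copy is a hypothetical helper name, not part of PyTorch.

```python
import torch


def check_cuda_copy():
    """Return a short status string for copying a small plain tensor to the GPU."""
    if not torch.cuda.is_available():
        return "cuda unavailable"
    try:
        t = torch.ones(2, 3).cuda()   # same kind of host-to-device copy, without any model
        torch.cuda.synchronize()      # wait for pending GPU work so errors surface here
        return "ok" if float(t.sum()) == 6.0 else "unexpected values"
    except RuntimeError as err:
        return "copy failed: {}".format(err)


status = check_cuda_copy()
print(status)
```

If this small copy fails as well, the problem is more likely in the CUDA setup (driver, GPU state, or the installed build) than in the model definition; if it succeeds, the model code is the better place to keep looking.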