Hi,
I am using an AlexNet-based architecture for semantic segmentation with 6 classes. The architecture is:
AlexNetFCN(
  (conv1): Sequential(
    (0): Conv2d(3, 96, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): ReLU(inplace)
  )
  (pool1): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
  (conv2): Sequential(
    (0): Conv2d(96, 256, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): ReLU(inplace)
  )
  (pool2): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
  (conv3): Sequential(
    (0): Conv2d(256, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace)
  )
  (conv4): Sequential(
    (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace)
  )
  (conv5): Sequential(
    (0): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace)
  )
  (pool3): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
  (fc6): Sequential(
    (0): Conv2d(256, 4096, kernel_size=(1, 1), stride=(1, 1))
    (1): BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True)
    (2): ReLU(inplace)
    (3): Dropout2d(p=0.5)
  )
  (fc7): Sequential(
    (0): Conv2d(4096, 4096, kernel_size=(1, 1), stride=(1, 1))
    (1): BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True)
    (2): ReLU(inplace)
    (3): Dropout2d(p=0.5)
  )
  (score_fc7): Conv2d(4096, 6, kernel_size=(1, 1), stride=(1, 1))
  (rescale): UpsamplingBilinear2d(size=(227, 227), mode=bilinear)
)
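As a sanity check on the layer parameters above, here is a minimal sketch that recomputes the spatial size after each layer. It assumes a 227x227 input (taken from the rescale size), zero padding for the pooling layers, and floor rounding (ceil_mode=False). The names conv_out, LAYERS and trace_sizes are only illustrative and are not part of my script.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis for a conv/pool layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) copied from the module printout.
# Pool padding of 0 is an assumption (the printout does not show it).
LAYERS = [
    ("conv1", 11, 4, 2),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
    ("fc6", 1, 1, 0),
    ("fc7", 1, 1, 0),
    ("score_fc7", 1, 1, 0),
]


def trace_sizes(input_size):
    """Return a list of (layer name, spatial size) pairs for a square input."""
    sizes = []
    size = input_size
    for name, kernel, stride, padding in LAYERS:
        size = conv_out(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


if __name__ == "__main__":
    # 227 is an assumed input size, chosen to match the rescale layer.
    for name, size in trace_sizes(227):
        print(name, size)
```

Under these assumptions the spatial sizes come out as 56, 27, 27, 13, 13, 13, 13, 6, 6, 6, 6, so the forward pass through score_fc7 looks internally consistent.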
I get an error when calling loss.backward(), and I cannot work out its cause from the error trace. Here are the log and the error trace:
Training Epoch: 0
Conv1 size => torch.Size([16, 96, 56, 56])
Pool1 size => torch.Size([16, 96, 27, 27])
Conv2 size => torch.Size([16, 256, 27, 27])
Pool2 size => torch.Size([16, 256, 13, 13])
Conv3 size => torch.Size([16, 384, 13, 13])
Conv4 size => torch.Size([16, 384, 13, 13])
Conv5 size => torch.Size([16, 256, 13, 13])
Pool3 size => torch.Size([16, 256, 6, 6])
fc6 size => torch.Size([16, 4096, 6, 6])
fc7 size => torch.Size([16, 4096, 6, 6])
score_fc7 size => torch.Size([16, 6, 6, 6])
Traceback (most recent call last):
  File "semantic_alex_ce.py", line 381, in <module>
    train(epoch)
  File "semantic_alex_ce.py", line 269, in train
    loss.backward()
  File "/home/anil.k/miniconda2/envs/torch/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/anil.k/miniconda2/envs/torch/lib/python2.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
  File "/home/anil.k/miniconda2/envs/torch/lib/python2.7/site-packages/torch/autograd/function.py", line 91, in apply
    return self._forward_cls.backward(self, *args)
  File "/home/anil.k/miniconda2/envs/torch/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 481, in backward
    grad_tensor = grad_tensor.masked_scatter(mask, grad_output)
  File "/home/anil.k/miniconda2/envs/torch/lib/python2.7/site-packages/torch/autograd/variable.py", line 427, in masked_scatter
    return self.clone().masked_scatter_(mask, variable)
RuntimeError: invalid argument 1: the number of sizes provided must be greater or equal to the number of dimensions in the tensor at /opt/conda/conda-bld/pytorch_1518238409320/work/torch/lib/THC/generic/THCTensor.c:326
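The trace ends in masked_scatter(mask, grad_output), which suggests that a mask is involved somewhere in the loss computation and that its shape may not fit the tensor it is applied to. That is only my reading of the trace. As a rough sketch of the kind of check that could narrow it down, the helper below tests whether one shape can be expanded to another under the usual right-aligned broadcasting rule. The function name can_expand and all mask shapes are hypothetical; only the (16, 6, 227, 227) prediction shape follows from the batch size, class count and rescale size above.

```python
def can_expand(src_shape, dst_shape):
    """Return True if a tensor of src_shape could be expanded to dst_shape.

    Shapes are aligned on the right. The source may not have more
    dimensions than the target, and each source dimension must either
    equal the matching target dimension or be 1.
    """
    if len(src_shape) > len(dst_shape):
        return False
    offset = len(dst_shape) - len(src_shape)
    for i, dim in enumerate(src_shape):
        if dim != 1 and dim != dst_shape[offset + i]:
            return False
    return True


if __name__ == "__main__":
    # Prediction shape after the rescale layer: batch 16, 6 classes, 227x227.
    pred_shape = (16, 6, 227, 227)

    # Hypothetical mask shapes, for illustration only.
    print(can_expand((16, 1, 227, 227), pred_shape))  # mask with a singleton class axis
    print(can_expand((16, 227, 227), pred_shape))     # mask shaped like a label map
    print(can_expand(pred_shape, (16 * 227 * 227,)))  # 4-D mask against a flattened tensor
```

In this sketch only the first case passes. The third case, where the mask has more dimensions than the tensor, is the situation the error message seems to describe, but I have not confirmed that this is what happens in my script.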
Please help me identify where the size issue is.
Thanks in advance!
Anil