Hey Everyone,
I am running into a strange error message when training a VGG-like network:
```
RuntimeError: Expected tensor for argument #1 'input' to have the same dimension as tensor for 'result'; but 4 does not equal 2 (while checking arguments for cudnn_convolution)
```
Has anyone come across this before? I am running hyperparameter optimization over my network’s convolution filter sizes, and I suspect that some combination of filter sizes is causing this error. However, I have tried manually configuring the kernel sizes and have been unable to reproduce the error outside of my optimization code.
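One idea I'm considering is rejecting sampled configurations where a convolution kernel ends up larger than its input feature map, using the standard Conv2d output-size formula. A rough sketch (the helper names and the layer-tuple format are just for illustration, not from my actual code):

```python
def conv_out(size, kernel, pad=0, stride=1):
    # Standard Conv2d spatial arithmetic: floor((n + 2p - k) / s) + 1
    return (size + 2 * pad - kernel) // stride + 1

def config_is_valid(input_size, layers):
    """layers: a list of ('conv', kernel, pad) or ('pool', output_size) tuples."""
    size = input_size
    for layer in layers:
        if layer[0] == 'conv':
            _, kernel, pad = layer
            size = conv_out(size, kernel, pad)
            if size < 1:          # kernel bigger than the (padded) input
                return False
        else:                     # adaptive pooling resets the spatial size
            size = layer[1]
    return True
```

The objective could then return a large dummy loss for invalid configurations instead of crashing mid-search.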
Here is the full error:
```
Traceback (most recent call last):
  File "scikit_opt.py", line 113, in <module>
    main()
  File "scikit_opt.py", line 110, in main
    res_gp = gp_minimize(objective, hparams, n_calls=10, verbose=True)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/scikit_optimize-0.5.1-py3.6.egg/skopt/optimizer/gp.py", line 228, in gp_minimize
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/scikit_optimize-0.5.1-py3.6.egg/skopt/optimizer/base.py", line 253, in base_minimize
  File "scikit_opt.py", line 69, in objective
    outputs = net(inputs)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 68, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 78, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 68, in parallel_apply
    raise output
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 44, in _worker
    output = module(*input, **kwargs)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ygx/paper_hyperspace/vgg_cifar/adaptive_model.py", line 106, in forward
    x = self.block3(x)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 75, in forward
    input = module(input)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ygx/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 282, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected tensor for argument #1 'input' to have the same dimension as tensor for 'result'; but 4 does not equal 2 (while checking arguments for cudnn_convolution)
```
Here is the network when the above error occurred:
```
VGG(
  (block1): Sequential(
    (0): Conv2d(3, 64, kernel_size=(10, 10), stride=(1, 1))
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
    (2): ReLU()
    (3): Conv2d(64, 128, kernel_size=(2, 2), stride=(1, 1))
    (4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    (5): ReLU()
    (6): AdaptiveMaxPool2d(output_size=16)
  )
  (block2): Sequential(
    (0): Conv2d(128, 128, kernel_size=(6, 6), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    (2): ReLU()
    (3): Conv2d(128, 256, kernel_size=(5, 5), stride=(1, 1))
    (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
    (5): ReLU()
    (6): AdaptiveMaxPool2d(output_size=7)
    (7): Dropout(p=0.5)
  )
  (block3): Sequential(
    (0): Conv2d(256, 256, kernel_size=(5, 5), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
    (2): ReLU()
    (3): Conv2d(256, 256, kernel_size=(9, 9), stride=(1, 1), padding=(1, 1))
    (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
    (5): ReLU()
    (6): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1))
    (7): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
    (8): ReLU()
    (9): AdaptiveMaxPool2d(output_size=2)
    (10): Dropout(p=0.5)
  )
  (block4): Sequential(
    (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
    (2): ReLU()
    (3): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
    (5): ReLU()
    (6): Conv2d(512, 512, kernel_size=(2, 2), stride=(1, 1), padding=(1, 1))
    (7): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
    (8): ReLU()
    (9): AdaptiveMaxPool2d(output_size=1)
    (10): Dropout(p=0.5)
    (11): AdaptiveAvgPool2d(output_size=1)
  )
  (linear_layers): Sequential(
    (0): Linear(in_features=512, out_features=10, bias=True)
  )
)
```
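For what it's worth, tracing the spatial sizes through `block3` by hand with the standard Conv2d formula (`out = floor((n + 2p - k) / s) + 1`, sizes taken straight from the printed model) suggests the 9×9 convolution receives only a 5×5 input, which would make its output size negative. I may be mis-tracing something, but:

```python
def conv_out(size, kernel, pad=0, stride=1):
    """Spatial output size of a Conv2d: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

# block2 ends with AdaptiveMaxPool2d(output_size=7), so block3 sees 7x7 input.
size = conv_out(7, kernel=5, pad=1)     # block3 (0): k=5, p=1 -> 5x5
print(size)                             # 5
size = conv_out(size, kernel=9, pad=1)  # block3 (3): k=9, p=1 -> negative!
print(size)                             # -1
```

So the failing combination may be any sampled kernel that exceeds the feature map left by the preceding adaptive pool, even if the cuDNN error message doesn't say so directly.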
Edit: I came across this issue after a quick search, but it doesn’t shed much light on the error yet:
I would appreciate a fresh pair of eyes!