Error: Input and target shapes do not match

Hey guys.

I'm trying to get acquainted with neural networks and PyTorch.

Usually, first of all, I just want to understand what results I can achieve and how quickly when I use something, without going into the details much.

I liked the idea of GANs.

I found this example, downloaded it and did the following:

  1. created a folder named Img inside the repository;
    inside it I created another folder, also named Img,
    and copied several cat faces in jpg there
    (this is my training data; see the sketch after these steps)

  2. in the config file, in the line
    parser.add_argument('--train_data_root', type=str, default='/home1/irteam/nashory/data/CelebA/Img')
    I replaced '/home1/irteam/nashory/data/CelebA/Img' with 'Img'.
    This is the path to the folder with my training data (cats).

  3. ran trainer.py
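
(For reference, here is a rough sketch of how I understand that folder gets read. I'm assuming the repo's dataloader is built on torchvision's ImageFolder, which expects one subfolder per class; that would be why the nested Img/Img layout is needed.)

from torchvision import datasets, transforms

# assumption: ImageFolder treats the outer Img as the dataset root and the
# inner Img as the single "class" subfolder that holds the cat jpgs
dataset = datasets.ImageFolder(
    root='Img',
    transform=transforms.Compose([
        transforms.Resize((4, 4)),   # progressive growing starts from 4x4 images
        transforms.ToTensor(),
    ]))
print(len(dataset))         # number of cat photos found
print(dataset[0][0].shape)  # torch.Size([3, 4, 4])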

As a result I get this error:

----------------- configuration -----------------
  flag_wn: True
  stab_tick: 100
  flag_norm_latent: False
  flag_gdrop: True
  optimizer: adam
  TICK: 1000
  ngf: 512
  flag_sigmoid: False
  nc: 3
  smoothing: 0.997
  save_img_every: 20
  nz: 512
  lr: 0.001
  beta2: 0.99
  flag_bn: False
  train_data_root: Img
  trns_tick: 200
  lr_decay: 0.87
  display_tb_every: 5
  flag_add_drift: True
  flag_tanh: False
  random_seed: 1537970835
  beta1: 0.0
  flag_leaky: True
  eps_drift: 0.001
  max_resl: 8
  n_gpu: 1
  flag_add_noise: True
  use_tb: True
  ndf: 512
  flag_pixelwise: True
-------------------------------------------------
/media/me2beats/601616D21616A8D4/ubuntu_download/nvidia gans/pggan-pytorch-master/custom_layers.py:104: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
  if initializer == 'kaiming':    kaiming_normal(self.conv.weight, a=calculate_gain('conv2d'))
/media/me2beats/601616D21616A8D4/ubuntu_download/nvidia gans/pggan-pytorch-master/custom_layers.py:137: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
  if initializer == 'kaiming':    kaiming_normal(self.linear.weight, a=calculate_gain('linear'))
Generator structure: 
Sequential(
  (first_block): Sequential(
    (0): equalized_conv2d(
      (conv): Conv2d(512, 512, kernel_size=(4, 4), stride=(1, 1), padding=(3, 3), bias=False)
    )
    (1): LeakyReLU(negative_slope=0.2)
    (2): pixelwise_norm_layer()
    (3): equalized_conv2d(
      (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (4): LeakyReLU(negative_slope=0.2)
    (5): pixelwise_norm_layer()
  )
  (to_rgb_block): Sequential(
    (0): equalized_conv2d(
      (conv): Conv2d(512, 3, kernel_size=(1, 1), stride=(1, 1), bias=False)
    )
  )
)
Discriminator structure: 
Sequential(
  (from_rgb_block): Sequential(
    (0): generalized_drop_out(mode = prop, strength = 0.0, axes = [0, 1], normalize = False)
    (1): equalized_conv2d(
      (conv): Conv2d(3, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
    )
    (2): LeakyReLU(negative_slope=0.2)
  )
  (last_block): Sequential(
    (0): minibatch_std_concat_layer(averaging = all)
    (1): generalized_drop_out(mode = prop, strength = 0.0, axes = [0, 1], normalize = False)
    (2): equalized_conv2d(
      (conv): Conv2d(513, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (3): LeakyReLU(negative_slope=0.2)
    (4): generalized_drop_out(mode = prop, strength = 0.0, axes = [0, 1], normalize = False)
    (5): equalized_conv2d(
      (conv): Conv2d(512, 512, kernel_size=(4, 4), stride=(1, 1), bias=False)
    )
    (6): LeakyReLU(negative_slope=0.2)
    (7): Flatten()
    (8): equalized_linear(
      (linear): Linear(in_features=512, out_features=1, bias=False)
    )
  )
)
[*] Renew dataloader configuration, load data from Img.
/usr/local/lib/python2.7/dist-packages/torchvision/transforms/transforms.py:188: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
trainer.py:245: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  self.z_test = Variable(self.z_test, volatile=True)
  0%|                                                 | 0/18750 [00:00<?, ?it/s]Exception KeyError: KeyError(<weakref at 0x7f495fb62788; to 'tqdm' at 0x7f495fb5fd50>,) in <bound method tqdm.__del__ of   0%|                                                 | 0/18750 [00:01<?, ?it/s]> ignored
Traceback (most recent call last):
  File "trainer.py", line 352, in <module>
    trainer.train()
  File "trainer.py", line 273, in train
    loss_d = self.mse(self.fx, self.real_label) + self.mse(self.fx_tilde, self.fake_label)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/loss.py", line 421, in forward
    return F.mse_loss(input, target, reduction=self.reduction)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1716, in mse_loss
    return _pointwise_loss(lambda a, b: (a - b) ** 2, torch._C._nn.mse_loss, input, target, reduction)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1674, in _pointwise_loss
    return lambd_optimized(input, target, reduction)
RuntimeError: input and target shapes do not match: input [32 x 1], target [32] at /pytorch/aten/src/THNN/generic/MSECriterion.c:12

I can see some dimension mismatch here: the input is [32 x 1], but the target is [32].
Finding the cause is complicated by the fact that I have not yet figured out what these numbers mean.

I can only note that when I change the number of images in the training folder, the input shape changes too:
input [1 x 1] for 1 photo in the Img folder, input [2 x 1] for 2 photos, etc., but it never gets bigger than input [32 x 1].
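
For what it's worth, the mismatch can be reproduced with nothing but MSELoss (a toy sketch; the 32 just stands in for the batch size from the traceback):

import torch
import torch.nn as nn

mse = nn.MSELoss()
output = torch.randn(32, 1)  # discriminator output: one score per image, shape [batch, 1]
target = torch.ones(32)      # labels built as a flat vector, shape [batch]

# on the PyTorch version from my traceback this raises
# RuntimeError: input and target shapes do not match: input [32 x 1], target [32]
# (newer releases broadcast instead and only warn, which is usually not what you want)
loss = mse(output, target)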

Also I see 2 user warnings:

  1. transforms.py:188: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.

  2. trainer.py:245: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.

But if I understand correctly, these are not what causes the error
[I tried replacing the deprecated transforms.Scale with transforms.Resize
(just changed the name of the method)
and the error remained].
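
For completeness, this is roughly what silencing both warnings would look like (a sketch, not the repo's actual lines):

import torch
from torchvision import transforms

# warning 1: transforms.Scale was only renamed; transforms.Resize behaves the same
transform = transforms.Compose([
    transforms.Resize(64),
    transforms.ToTensor(),
])

# warning 2: volatile=True is replaced by running the forward pass under no_grad()
z_test = torch.randn(16, 512, requires_grad=True)
with torch.no_grad():
    out = z_test * 2          # stand-in for the generator forward pass
print(out.requires_grad)      # False: no graph is tracked inside no_grad()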

It is also unlikely that this is due to properties of the input files; for example, I do not think they need to be the same size or square.

My guess is that the error comes from my using a new version of PyTorch (I pip-installed it just a few hours ago). Most likely it's a behavior change rather than a bug.

Any ideas what could cause this and how to solve it?

I am not sure if it will solve the issue, but can you try changing line 273 in trainer.py as below:

loss_d = self.mse(self.fx.squeeze(), self.real_label) + self.mse(self.fx_tilde.squeeze(), self.fake_label)

The error remained.
But now I see this:
input [1], target [32] - if there is 1 jpg file in the folder
input [2], target [32] - if there are 2 files
etc.

instead of

input [1 x 1], target [32]
input [2 x 1], target [32]

But if I have 32 or more files, I get input [32 x 1], target [32] again.

And yes, it seems this problem may be connected with squeeze().
Here is something similar.
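
To illustrate what squeeze() does to the output here (a toy example, shapes picked to match what I see):

import torch

fx = torch.randn(2, 1)       # discriminator output when only 2 images are in the folder
print(fx.squeeze().shape)    # torch.Size([2]) -- the trailing 1 is gone...

real_label = torch.ones(32)  # ...but the labels are still sized to batch_size = 32
print(real_label.shape)      # torch.Size([32]), so the shapes still do not match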

You are doing binary classification, so the output of the module will be 2-dimensional: the first dim is the batch size and the second dim is the output size. So you need to squeeze the output or unsqueeze the real labels.
I cannot say anything for sure because I cannot see all the code, but as far as I understand you are using a data loader with a batch_size of 32. If you have fewer examples than the batch size, the data loader will return only as many examples as you have; if you have more examples than the batch size, you will get an output for each batch.
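
If it helps, here is a rough, self-contained sketch of what a shape-safe version of that loss line could look like: flatten the [N x 1] outputs to [N] and build the labels from the outputs instead of from a fixed batch_size (the attribute names in trainer.py are only known from the traceback, so this is an illustration, not the repo's exact fix):

import torch
import torch.nn as nn

mse = nn.MSELoss()

fx = torch.randn(5, 1).view(-1)        # pretend the loader returned only 5 real images
real_label = torch.ones_like(fx)       # labels sized to the actual output, not to 32

fx_tilde = torch.randn(5, 1).view(-1)  # outputs for the 5 generated images
fake_label = torch.zeros_like(fx_tilde)

loss_d = mse(fx, real_label) + mse(fx_tilde, fake_label)
print(loss_d.item())                   # runs with no shape error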

Did you find a solution @me2beats ?