RuntimeError: Expected 4-dimensional input for 4-dimensional weight 6 1 5 5, but got 2-dimensional input of size [20, 76800] instead

img, labels = data
optimizer.zero_grad()
output = model(img)
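For reference, that RuntimeError means an nn.Conv2d with weight shape [6, 1, 5, 5] received a flattened 2-dimensional tensor. A minimal sketch of the mismatch and the fix, assuming single-channel 240x320 images (since 240 * 320 = 76800):

import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)  # weight shape [6, 1, 5, 5]
x_flat = torch.randn(20, 76800)       # a 2-dimensional tensor like the one in the error
# conv(x_flat)                        # would raise the RuntimeError above
x = x_flat.view(20, 1, 240, 320)      # restore [batch_size, channels, height, width]
out = conv(x)                         # works: [20, 6, 236, 316]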
By "target", do you mean what I have passed to the model function here, or something in train_loss = criterion(output)?
And what is the significance of nb_classes? I mean, how do I determine its value?

nb_classes is the number of classes in your use case.
You would usually use an output and target to calculate the loss in a criterion (at least for a supervised classification use case).
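For a supervised classification use case, a minimal sketch (nb_classes and the shapes here are made up for illustration):

import torch
import torch.nn as nn

nb_classes = 10                                        # number of classes in your dataset
batch_size = 20

criterion = nn.CrossEntropyLoss()
output = torch.randn(batch_size, nb_classes)           # model output: one logit per class
target = torch.randint(0, nb_classes, (batch_size,))   # one class index per sample

loss = criterion(output, target)                       # the loss needs both output and target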

We would need to get the shapes and some more information to be able to help further. :slight_smile:

for epoch in range(num_epochs):
    for data in traindataloader:
        img, _ = data
        recon = model(img)
        loss = criterion(recon, img)
So, in this case, recon is the output and img is the original image from the trainloader, used as the target. If that is what counts as the target, then the img variable is correct? I have an image dataset with random images, so how do I determine the classes for it, or should I just choose any value randomly?
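For an autoencoder you would not need classes at all, since the target is the input image itself. A minimal self-contained sketch, assuming a reconstruction loss such as nn.MSELoss (the toy model and fake data are just for illustration):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(240 * 340, 64),
                      nn.ReLU(), nn.Linear(64, 240 * 340))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

imgs = torch.rand(10, 1, 240, 340)    # fake batch with the shapes from this thread
for epoch in range(2):
    optimizer.zero_grad()
    recon = model(imgs).view_as(imgs) # reshape the flat output back to the image shape
    loss = criterion(recon, imgs)     # the target is the input image, no labels involved
    loss.backward()
    optimizer.step()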

uts = []
for epoch in range(num_epochs):
    for data in traindataloader:
        img, labels = data
        optimizer.zero_grad()
        print(img.size())
        img = Variable(img)
        print(img.size())
        labels = Variable(labels)
        print(labels.size())

Size of data: [10, 1, 240, 340]
Size of img:  [10, 1, 240, 340]

Does your input tensor contain class indices or does it contain some floating point values in a specific range, e.g. [0, 1]?
In the latter case you won't be able to use nn.CrossEntropyLoss without converting the input to class indices.
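A minimal sketch of that conversion, assuming the floating point values in [0, 1] encode class membership by binning (the number of classes is made up):

import torch

nb_classes = 10
float_targets = torch.rand(20)                 # values in [0, 1]
class_indices = (float_targets * nb_classes).long().clamp_(max=nb_classes - 1)
print(class_indices.dtype)                     # torch.int64, as nn.CrossEntropyLoss expects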

There are two problems that have bothered me for a long time:

  1. Do I need to use a flatten layer every time before a linear layer in a network model?
  2. If 1) is true, can I use view() or reshape() instead of a flatten layer?
  1. If you were using e.g. nn.Conv2d layers before the nn.Linear layer, then you would usually flatten the activation from the shape [batch_size, channels, height, width] to [batch_size, in_features]. While this is the "standard" use case, note that linear layers can also accept inputs with multiple additional dimensions as [batch_size, *, in_features]. In that case the linear layer is applied to all additional dimensions (*) as if you passed them in a loop. That being said, it's a special use case and not the common one (see the sketch after this list).

  2. I would recommend using view, as it would not trigger a copy of the data, but instead change the metadata of the tensor. If you get an error claiming your tensor is not contiguous, you would have to trigger the copy via tensor.contiguous().view(...) or via tensor.reshape(...). Personally, I'm used to using view instead of flatten, but that doesn't mean it's the recommended way, so you can pick whatever works for you. :wink:
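A minimal sketch of both points, with made-up shapes:

import torch
import torch.nn as nn

x = torch.randn(8, 6, 24, 34)         # conv activation: [batch_size, channels, height, width]

# 1. flatten to [batch_size, in_features] before the linear layer
fc = nn.Linear(6 * 24 * 34, 10)
out = fc(x.view(x.size(0), -1))       # view changes metadata only, no copy

# the special case: extra dimensions are allowed as [batch_size, *, in_features]
fc2 = nn.Linear(34, 10)
out2 = fc2(torch.randn(8, 5, 34))     # applied to the last dim -> [8, 5, 10]

# 2. view fails on non-contiguous tensors; reshape copies if needed
y = x.permute(0, 2, 3, 1)             # permute makes the tensor non-contiguous
# y.view(8, -1)                       # would raise an error about contiguity
out3 = y.reshape(8, -1)               # or y.contiguous().view(8, -1)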


Thank you very much for your answer. I learned a lot. I may also need some practice to deepen my understanding. Thanks again.