RuntimeError: Expected 4-dimensional input for 4-dimensional weight 6 1 5 5, but got 2-dimensional input of size [20, 76800] instead

img, labels = data
optimizer.zero_grad()
output = model(img)
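For reference, that RuntimeError means an nn.Conv2d with weight shape [6, 1, 5, 5] received a flattened 2-dimensional tensor. A minimal sketch of the mismatch and the fix, assuming single-channel 240x320 images (since 240 * 320 = 76800):

import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)  # weight shape [6, 1, 5, 5]
x_flat = torch.randn(20, 76800)       # a 2-dimensional tensor like the one in the error
# conv(x_flat)                        # would raise the RuntimeError above
x = x_flat.view(20, 1, 240, 320)      # restore [batch_size, channels, height, width]
out = conv(x)                         # works: [20, 6, 236, 316]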
By "target", do you mean what I have passed to the model function here, or something in train_loss = criterion(output)?
And what is the significance of nb_classes? I mean, how do I determine its value?

nb_classes is the number of classes in your use case.
You would usually use an output and target to calculate the loss in a criterion (at least for a supervised classification use case).
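For a supervised classification use case, a minimal sketch (nb_classes and the shapes here are made up for illustration):

import torch
import torch.nn as nn

nb_classes = 10                                        # number of classes in your dataset
batch_size = 20

criterion = nn.CrossEntropyLoss()
output = torch.randn(batch_size, nb_classes)           # model output: one logit per class
target = torch.randint(0, nb_classes, (batch_size,))   # one class index per sample

loss = criterion(output, target)                       # the loss needs both output and target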

We would need to get the shapes and some more information to be able to help further. :slight_smile:

for epoch in range(num_epochs):
    for data in traindataloader:
        img, _ = data
        recon = model(img)
        loss = criterion(recon, img)
So, in this case, recon is the output and img is the original image from the trainloader, used as the target. If that is what counts as the target, then the img variable is correct? I have an image dataset with random images, so how do I determine the classes for it, or should I just choose any value randomly?
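For an autoencoder you would not need classes at all, since the target is the input image itself. A minimal self-contained sketch, assuming a reconstruction loss such as nn.MSELoss (the toy model and fake data are just for illustration):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(240 * 340, 64),
                      nn.ReLU(), nn.Linear(64, 240 * 340))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

imgs = torch.rand(10, 1, 240, 340)    # fake batch with the shapes from this thread
for epoch in range(2):
    optimizer.zero_grad()
    recon = model(imgs).view_as(imgs) # reshape the flat output back to the image shape
    loss = criterion(recon, imgs)     # the target is the input image, no labels involved
    loss.backward()
    optimizer.step()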

uts = []
for epoch in range(num_epochs):
    for data in traindataloader:
        img, labels = data
        optimizer.zero_grad()
        print(img.size())
        img = Variable(img)
        print(img.size())
        labels = Variable(labels)
        print(labels.size())

Size of data: [10, 1, 240, 340]
Size of img:  [10, 1, 240, 340]

Does your input tensor contain class indices or does it contain some floating point values in a specific range, e.g. [0, 1]?
In the latter case you won't be able to use nn.CrossEntropyLoss without converting the input to class indices.
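A minimal sketch of that conversion, assuming the floating point values in [0, 1] encode class membership by binning (the number of classes is made up):

import torch

nb_classes = 10
float_targets = torch.rand(20)                 # values in [0, 1]
class_indices = (float_targets * nb_classes).long().clamp_(max=nb_classes - 1)
print(class_indices.dtype)                     # torch.int64, as nn.CrossEntropyLoss expects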

There are two problems that have bothered me for a long time:

  1. Do I need to use a flatten layer every time before a linear layer in a network model?
  2. If 1) is true, can I use view() or reshape() instead of a flatten layer?
  1. If you were using e.g. nn.Conv2d layers before the nn.Linear layer, then you would usually flatten the activation from the shape [batch_size, channels, height, width] to [batch_size, in_features]. While this is the "standard" use case, note that linear layers can also accept inputs with multiple additional dimensions as [batch_size, *, in_features]. In that case the linear layer is applied to all additional dimensions (*) as if you passed them in a loop. That being said, it's a special use case and not the common one (see the sketch after this list).

  2. I would recommend using view, as it would not trigger a copy of the data, but instead change the metadata of the tensor. If you get an error claiming your tensor is not contiguous, you would have to trigger the copy via tensor.contiguous().view(...) or via tensor.reshape(...). Personally, I'm used to using view instead of flatten, but that doesn't mean it's the recommended way, so you can pick whatever works for you. :wink:
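A minimal sketch of both points, with made-up shapes:

import torch
import torch.nn as nn

x = torch.randn(8, 6, 24, 34)         # conv activation: [batch_size, channels, height, width]

# 1. flatten to [batch_size, in_features] before the linear layer
fc = nn.Linear(6 * 24 * 34, 10)
out = fc(x.view(x.size(0), -1))       # view changes metadata only, no copy

# the special case: extra dimensions are allowed as [batch_size, *, in_features]
fc2 = nn.Linear(34, 10)
out2 = fc2(torch.randn(8, 5, 34))     # applied to the last dim -> [8, 5, 10]

# 2. view fails on non-contiguous tensors; reshape copies if needed
y = x.permute(0, 2, 3, 1)             # permute makes the tensor non-contiguous
# y.view(8, -1)                       # would raise an error about contiguity
out3 = y.reshape(8, -1)               # or y.contiguous().view(8, -1)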


Thank you very much for your answer. I learned a lot. I may also need some practice to deepen my understanding. Thanks again.