RuntimeError: The size of tensor a (224) must match the size of tensor b (8) at non-singleton dimension 3

naychelynn · October 25, 2022, 12:42pm

Hello, I’m training a vision transformer on a custom dataset for regression.
The prediction from the network has size torch.Size([8]), but the label tensor has size torch.Size([8, 3, 224, 224]), so the two sizes do not match.
I tried the resize() function but got AttributeError: ‘Tensor’ object has no attribute ‘Resize’.
I also tried the torch.resize() and _resize() functions, but neither worked.
Could you please suggest the standard way to resize?

ptrblck · October 26, 2022, 4:42am

Could you describe what the target represents as it seems you are trying to reuse the model input as the target tensor?
I would not recommend resizing the tensors. Instead, check what your use case is and what the dimensions of the target represent, as they don’t seem to fit a classification or segmentation use case.
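
To see where the error message in the title comes from, here is a minimal pure-Python sketch of the broadcasting check that elementwise operations apply to two shapes. It is an illustration only, not PyTorch's implementation: `broadcast_shape` is a hypothetical helper, and assigning the label shape to "a" and the prediction shape to "b" is an assumption based on the sizes in the error.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, comparing from the last dimension."""
    ndim = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both have the same number of dims.
    pa = (1,) * (ndim - len(a)) + tuple(a)
    pb = (1,) * (ndim - len(b)) + tuple(b)
    out = []
    # Walk from the trailing dimension backwards.
    for dim in range(ndim - 1, -1, -1):
        x, y = pa[dim], pb[dim]
        if x != y and x != 1 and y != 1:
            raise ValueError(
                f"size of a ({x}) must match size of b ({y}) "
                f"at non-singleton dimension {dim}"
            )
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


label_shape = (8, 3, 224, 224)  # shape reported for the labels
pred_shape = (8,)               # shape reported for the prediction

try:
    broadcast_shape(label_shape, pred_shape)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

The prediction's single dimension of size 8 is lined up against the last dimension of the labels (224), not against the batch dimension, which is why the mismatch is reported at dimension 3.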

naychelynn · October 27, 2022, 5:55am

The target represents the class label tensor object.
First, the training data loader reads the dataset as img_data and label.
Then img_data is passed through the net, and the label tensor is used as it is.
pred_d is the tensor output by the net (a vision transformer in this case).
If my explanation doesn’t answer your question, please let me know.
Thank you.

ptrblck · October 27, 2022, 7:48am

Your explanation doesn’t explain why the target tensors have 4 dimensions:

If the targets are plain labels, I would expect a 1D tensor in the shape [batch_size] containing class indices while the model should output a 2D tensor in the shape [batch_size, nb_classes] for a multi-class classification use case.
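
The shapes described above can be sketched as follows. This is a minimal example with random data; `batch_size = 8` matches the thread, while `nb_classes = 5` is an arbitrary value chosen for illustration.

```python
import torch
import torch.nn as nn

batch_size, nb_classes = 8, 5  # nb_classes is an assumed value

# Model output: one row of raw scores (logits) per sample.
logits = torch.randn(batch_size, nb_classes)          # [batch_size, nb_classes]
# Targets: one class index per sample, stored as int64.
target = torch.randint(0, nb_classes, (batch_size,))  # [batch_size]

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)  # scalar tensor

print(logits.shape, target.shape, loss.shape)
```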

naychelynn · October 30, 2022, 6:45am

My case is the regression case and the labels (scores) are in floating points.
The 4 dimensions of the target tensors are supposed to be [tensors output from Dropout, dimensions of the image, width of the image, height of the image].

Now, I am about to compute the cross-entropy loss with the criterion function for the target and label as:

It works now, although with a long waiting time.
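
For readers with a similar regression setup, here is a rough sketch of a loss computation where the prediction and the label have matching shapes. It assumes one floating-point score per image, which the thread does not confirm, and it uses `nn.MSELoss` as a common regression criterion rather than the cross-entropy loss mentioned above. The tensors are random placeholders, not the poster's data.

```python
import torch
import torch.nn as nn

batch_size = 8

# Placeholder tensors (assumption: one float score per image).
pred_d = torch.randn(batch_size)  # model output, shape [8]
label = torch.rand(batch_size)    # float scores, shape [8]

# Check the shapes before computing the loss so that a mismatch
# fails early instead of broadcasting or raising inside the criterion.
assert pred_d.shape == label.shape

criterion = nn.MSELoss()
loss = criterion(pred_d, label)  # scalar tensor

print(pred_d.shape, label.shape, loss.item())
```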