Hello, I'm training a vision transformer on a custom dataset for a regression task.
The prediction from the network has size torch.Size([8]), but the input label has size torch.Size([8, 3, 224, 224]), so the two shapes don't match.
I tried the resize() function but got AttributeError: 'Tensor' object has no attribute 'Resize'.
I also tried torch.resize() and _resize(), but neither worked.
Could you please suggest the standard way to resize?
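For reference, here is a minimal sketch of the shapes involved (random tensors stand in for my real data and model output; the model call itself is omitted):

```python
import torch

# Stand-ins for my actual batch: only the shapes matter here
pred_d = torch.randn(8)               # network output: torch.Size([8])
labels = torch.randn(8, 3, 224, 224)  # label batch:    torch.Size([8, 3, 224, 224])

print(pred_d.shape, labels.shape)
```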
Could you describe what the target represents? It seems you are trying to reuse the model input as the target tensor.
I would not recommend resizing the tensors; instead, check what your use case is and what the dimensions of the target represent, as they don't seem to fit a classification or segmentation use case.
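For comparison, targets in these common setups would typically look like this (a sketch; the 10 classes and the 224x224 spatial size are assumed values, not taken from your post):

```python
import torch

batch_size, nb_classes = 8, 10  # nb_classes is an assumed value

# multi-class classification: one class index per sample
cls_target = torch.randint(0, nb_classes, (batch_size,))           # [batch_size]

# semantic segmentation: one class index per pixel
seg_target = torch.randint(0, nb_classes, (batch_size, 224, 224))  # [batch_size, height, width]
```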
The target is the class label tensor.
First, the training DataLoader reads the dataset as img_data and label.
Then img_data is fed through the net, and the label tensor is kept as it is.
pred_d is the tensor output by the trained net (a vision transformer in this case).
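In code, the loop looks roughly like this (a sketch with dummy stand-ins for my real dataset and ViT; only the shapes match my actual run):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-ins for my real dataset; only the shapes match my run
imgs = torch.randn(8, 3, 224, 224)
labels = torch.randn(8, 3, 224, 224)  # the 4D label tensor in question
train_loader = DataLoader(TensorDataset(imgs, labels), batch_size=8)

# Placeholder model standing in for the vision transformer
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 224 * 224, 1))

for img_data, label in train_loader:
    pred_d = net(img_data).squeeze(1)  # torch.Size([8])
    # label is used as-is: torch.Size([8, 3, 224, 224])
```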
If my explanation doesn't answer your question, please kindly point me in the right direction again.
Thank you.
Your explanation still doesn't clarify why the target tensors have 4 dimensions:
If the targets are plain labels, I would expect a 1D tensor in the shape [batch_size] containing class indices, while the model should output a 2D tensor in the shape [batch_size, nb_classes] for a multi-class classification use case.
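In code, that expectation maps to something like this (a sketch; nb_classes = 5 is just an assumed value):

```python
import torch
import torch.nn as nn

batch_size, nb_classes = 8, 5  # nb_classes assumed for illustration

output = torch.randn(batch_size, nb_classes)          # model output: [batch_size, nb_classes]
target = torch.randint(0, nb_classes, (batch_size,))  # class indices: [batch_size]

loss = nn.CrossEntropyLoss()(output, target)
```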
My case is a regression use case, and the labels (scores) are floating-point values.
The 4 dimensions of the target tensors are supposed to be [output tensors from Dropout, dimensions of the image, width of the image, height of the image].
Now I am about to compute the cross-entropy loss with the criterion function for the prediction and the target.
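Something along these lines (a sketch, not my exact code: nn.CrossEntropyLoss accepts floating-point targets when they are passed as class probabilities with the same [batch_size, nb_classes] shape as the logits; the shapes below are placeholders):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

batch_size, nb_classes = 8, 5                 # placeholder values
pred_d = torch.randn(batch_size, nb_classes)  # logits from the net
scores = torch.rand(batch_size, nb_classes)
scores = scores / scores.sum(dim=1, keepdim=True)  # float scores as per-class probabilities

loss = criterion(pred_d, scores)
```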
It works now, though with a long training time.