ValueError: Target size (torch.Size([32])) must be the same as input size (torch.Size([32, 25]))

Hi, I'm trying to fine-tune a ConvNeXt Tiny model.

This is how I initialize the model:

from transformers import AutoModelForImageClassification, ConvNextFeatureExtractor, get_linear_schedule_with_warmup

model = AutoModelForImageClassification.from_pretrained(
    "facebook/convnext-tiny-224",
    num_labels=25,
    ignore_mismatched_sizes=True,
)

This is my Dataset class:

import os

import cv2 as cv
import torch
from torch.utils.data import Dataset


class Groceries(Dataset):
  def __init__(self, path, df, transform=None):
    self.path = path
    self.df = df
    self.transform = transform

  def __len__(self):
    return len(self.df)

  def __getitem__(self, index):
    img_path = os.path.join(self.path, self.df['Images'][index])
    image = cv.imread(img_path)
    image = image / 255  # scale pixel values to [0, 1]
    # print(image.type)
    label = self.df["Labels"][index]
    # Both the image and the label are returned as float tensors.
    return torch.tensor(image).float(), torch.tensor(label).float()

The thing is that the model applies a binary cross-entropy loss internally. Once I pass the image and label tensors to the model, it returns the loss and the logits. But my classification problem is multi-class, so my labels have shape [batch_size] = [32], while the logits returned by the model have shape [batch_size, num_labels] = [32, 25]. When these two are passed to the binary cross-entropy loss, it raises the error in the title.

What should I do? Is it possible to skip the loss function built into the model and use my own, or is there another solution? During model initialization I specified 25 classes.

Could you link to the code initializing the internal criterion as nn.BCELoss instead of nn.CrossEntropyLoss for a multi-class classification?

That’s the problem. I did not initialize it, and I could not find where to set my preferred loss function. I only found out that the loss was binary cross-entropy from the error message:

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/transformers/models/convnext/modeling_convnext.py in forward(self, pixel_values, labels, output_hidden_states, return_dict)
    461             elif self.config.problem_type == "multi_label_classification":
    462                 loss_fct = BCEWithLogitsLoss()
--> 463                 loss = loss_fct(logits, labels)
    464         if not return_dict:
    465             output = (logits,) + outputs[2:]

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
    718 
    719     def forward(self, input: Tensor, target: Tensor) -> Tensor:
--> 720         return F.binary_cross_entropy_with_logits(input, target,
    721                                                   self.weight,
    722                                                   pos_weight=self.pos_weight,

/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in binary_cross_entropy_with_logits(input, target, weight, size_average, reduce, reduction, pos_weight)
   3161 
   3162     if not (target.size() == input.size()):
-> 3163         raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
   3164 
   3165     return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)

ValueError: Target size (torch.Size([32])) must be the same as input size (torch.Size([32, 25]))
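
The traceback shows the forward pass taking the "multi_label_classification" branch. A likely reason, assuming this transformers version infers config.problem_type from the labels when it is not set explicitly, is the label dtype: integer labels (torch.long or torch.int) select CrossEntropyLoss, while any other dtype falls through to BCEWithLogitsLoss, and the Dataset above returns its labels via .float(). The sketch below uses plain PyTorch only, with random tensors standing in for the model's logits and the labels (both are illustrative, not taken from the original run), to reproduce the shape error and to show that the same labels as long class indices work with CrossEntropyLoss.

```python
import torch
from torch import nn

batch_size, num_labels = 32, 25
logits = torch.randn(batch_size, num_labels)          # stand-in for the model's logits
labels = torch.randint(0, num_labels, (batch_size,))  # class indices, dtype torch.long

# Float class indices with BCEWithLogitsLoss: target [32] vs. input [32, 25] do not match.
try:
    nn.BCEWithLogitsLoss()(logits, labels.float())
    bce_error = None
except ValueError as err:
    bce_error = str(err)

# Long class indices with CrossEntropyLoss: the usual multi-class setup.
ce_loss = nn.CrossEntropyLoss()(logits, labels)
```

If that is indeed the cause, returning torch.tensor(label).long() instead of .float() from __getitem__ should let the model pick the multi-class loss on its own.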

I’m not familiar with these higher-level wrappers, so you might need to check the documentation of AutoModelForImageClassification to see if and how the loss function can be changed, or you could try to create the model standalone and the desired criterion manually.

Thanks. But if I create the model standalone, can I still load the pre-trained weights?

It appears that passing labels to the model during training is optional. If you don't pass them, the model does not compute a loss, and you can apply your own loss function to the logits.
So instead of

outputs = model(input, labels)
# we can do
outputs = model(input)  # no labels are passed, so the model does not compute a loss
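
A minimal sketch of one training step with a loss function of your own choosing. To keep it runnable without downloading weights, a tiny linear classifier stands in for the ConvNeXt model (the stand-in, the 8x8 image size, and the optimizer settings are assumptions for illustration). With the Hugging Face model, the logits would be read from outputs.logits instead.

```python
import torch
from torch import nn

torch.manual_seed(0)
num_labels = 25

# Hypothetical stand-in for the pretrained model; it returns raw logits directly.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, num_labels))
criterion = nn.CrossEntropyLoss()  # multi-class loss applied outside the model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

images = torch.randn(32, 3, 8, 8)             # dummy batch in (N, C, H, W) layout
labels = torch.randint(0, num_labels, (32,))  # long class indices, shape [32]

optimizer.zero_grad()
logits = model(images)            # with the HF model: model(images).logits
loss = criterion(logits, labels)  # logits [32, 25] against targets [32]
loss.backward()
optimizer.step()
```

Note that CrossEntropyLoss expects the class-index labels as torch.long, so the labels coming from the Dataset would need .long() rather than .float().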