Hi, I'm trying to fine-tune a ConvNeXt-Tiny model.
This is the initialization of the model:
from transformers import AutoModelForImageClassification, ConvNextFeatureExtractor, get_linear_schedule_with_warmup

# Load the pretrained backbone and replace the original classification head
# with a fresh 25-class head (ignore_mismatched_sizes allows the shape change).
model = AutoModelForImageClassification.from_pretrained(
    "facebook/convnext-tiny-224",
    num_labels=25,
    ignore_mismatched_sizes=True,
)
This is the Dataset class:
import os
import cv2 as cv
import torch
from torch.utils.data import Dataset

class Groceries(Dataset):
    def __init__(self, path, df, transform=None):
        self.path = path
        self.df = df
        self.transform = transform

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        img_path = os.path.join(self.path, self.df["Images"][index])
        image = cv.imread(img_path)        # HWC, BGR, uint8
        image = image / 255                # scale to [0, 1]
        image = image.transpose(2, 0, 1)   # HWC -> CHW, as the model expects
        if self.transform is not None:
            image = self.transform(image)
        label = self.df["Labels"][index]
        # note: the label is returned as a float tensor
        return torch.tensor(image).float(), torch.tensor(label).float()
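For context, this is roughly how I wire it into a DataLoader (train.csv and images/ are placeholder names; df is a pandas DataFrame with Images and Labels columns, and the images are assumed to already be 224x224 so the default collate can stack them):

import pandas as pd
from torch.utils.data import DataLoader

df = pd.read_csv("train.csv")         # placeholder CSV path
train_ds = Groceries("images/", df)   # placeholder image folder
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)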
The thing is that the model computes the loss itself: once I pass the image and label tensors to the model, it returns both loss and logits. But the classification is multi-class, so my label tensor has size [batch_size] = [32], while the logits have size [batch_size, num_labels] = [32, 25]. When these two go into binary cross-entropy, it throws the error mentioned in the title.
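To make the shape mismatch concrete, here is a standalone repro of what I think happens inside (the tensors are random stand-ins for the model's logits and my labels):

import torch
import torch.nn as nn

logits = torch.randn(32, 25)                  # [batch_size, num_labels]
labels = torch.randint(0, 25, (32,)).float()  # [batch_size]

# BCEWithLogitsLoss requires target and input to have the same shape,
# so this raises the ValueError from the title:
# nn.BCEWithLogitsLoss()(logits, labels)

# CrossEntropyLoss, by contrast, accepts class-index targets of shape [batch_size]:
loss = nn.CrossEntropyLoss()(logits, labels.long())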
What should I do? Is it possible to bypass the model's built-in binary cross-entropy loss and use my own? Or is there another solution? During model initialization I did specify 25 classes.
Could you link to the code initializing the internal criterion as nn.BCELoss instead of nn.CrossEntropyLoss for a multi-class classification?
That’s the problem: I did not initialize that, and I could not find where to set my preferred loss function. I only found out that the loss was binary cross-entropy from the error message:
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.10/dist-packages/transformers/models/convnext/modeling_convnext.py in forward(self, pixel_values, labels, output_hidden_states, return_dict)
461 elif self.config.problem_type == "multi_label_classification":
462 loss_fct = BCEWithLogitsLoss()
--> 463 loss = loss_fct(logits, labels)
464 if not return_dict:
465 output = (logits,) + outputs[2:]
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
718
719 def forward(self, input: Tensor, target: Tensor) -> Tensor:
--> 720 return F.binary_cross_entropy_with_logits(input, target,
721 self.weight,
722 pos_weight=self.pos_weight,
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in binary_cross_entropy_with_logits(input, target, weight, size_average, reduce, reduction, pos_weight)
3161
3162 if not (target.size() == input.size()):
-> 3163 raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
3164
3165 return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)
ValueError: Target size (torch.Size([32])) must be the same as input size (torch.Size([32, 25]))
I’m not familiar with these higher-level wrappers, so you might need to check the documentation of AutoModelForImageClassification to see if and how the loss function can be changed, or you could create the model standalone and define the desired criterion manually.
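That said, your traceback shows the loss being selected from self.config.problem_type, so (an untested sketch, based on the transformers source around those traceback lines) forcing the problem type might already switch the internal criterion to CrossEntropyLoss. Also note that your Dataset returns the labels as float tensors; if I read the inference logic correctly, transformers falls back to multi-label classification for float labels, so returning them as .long() might be enough on its own:

# Untested sketch: force single-label classification so the model picks
# CrossEntropyLoss internally instead of BCEWithLogitsLoss.
model = AutoModelForImageClassification.from_pretrained(
    "facebook/convnext-tiny-224",
    num_labels=25,
    ignore_mismatched_sizes=True,
    problem_type="single_label_classification",
)

# And in the Dataset, return class indices as integers:
# return torch.tensor(image).float(), torch.tensor(label).long()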
Thanks. But if I create the model standalone, can I still retrieve the pre-trained weights?
It appears that, during training, passing labels to the model is optional. If you don't pass them, the model returns only logits and you can apply your own loss function.
So instead of
outputs = model(pixel_values, labels=labels)
# we can do
outputs = model(pixel_values)  # without labels, the model will not (and can't) calculate a loss
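A minimal training-step sketch of that approach, assuming loader and optimizer are the usual DataLoader and optimizer objects and labels is a [batch_size] tensor of class indices:

import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # the standard multi-class loss

for images, labels in loader:
    outputs = model(pixel_values=images)             # no labels -> no internal loss
    loss = criterion(outputs.logits, labels.long())  # logits: [batch_size, 25]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()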