I was going through this Hugging Face code and I am having trouble understanding what loss the model is currently using. I know most seq2seq models use CrossEntropy loss, but I don’t see the definition anywhere in the code.
Thanks for the quick reply. Currently I am using pre-trained Llama models from Hugging Face, and I want to fine-tune Llama with a weighted loss function. Any idea how I can integrate it into the transformers library? I found some links related to that, but it does not seem to be working.
You could subclass the model and reimplement the forward pass with your modification, or something along those lines.
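In case a sketch helps: causal LM heads in transformers compute a shift-by-one cross-entropy inside `forward` when `labels` is passed, so one way to get a weighted loss is to replicate that computation yourself with a `weight` tensor. Below is a minimal sketch in plain PyTorch — the `weighted_causal_lm_loss` helper is hypothetical, not part of transformers, and the weight values are up to you:

```python
import torch
import torch.nn as nn

def weighted_causal_lm_loss(logits, labels, class_weights):
    # Hypothetical helper: the same shift-by-one cross-entropy that
    # causal LM heads apply internally when `labels` is passed,
    # but with per-class weights handed to CrossEntropyLoss.
    shift_logits = logits[..., :-1, :].contiguous()  # predict token t+1 from token t
    shift_labels = labels[..., 1:].contiguous()
    loss_fct = nn.CrossEntropyLoss(weight=class_weights, ignore_index=-100)
    return loss_fct(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
```

With a real model you would then plug this in by subclassing `Trainer` and overriding its `compute_loss` method: pop `labels` from the batch, run the model, and call the helper on `outputs.logits` instead of using the loss the model returns.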
However, I do think that, fundamentally, transformers is a library targeted at people using the models as-is. The other two repos that I linked (which both have A. Karpathy’s earlier minGPT as a reference point) deliberately do things differently: there, the intent is that you take and modify the code rather than using it just as a library.