IndexError: index out of range in self for AlbertForTokenClassification

cantbelieveimshook · November 27, 2022, 7:32pm

I converted columns of CSV files into torch token and label tensors, and got this error when I tried to run the AlbertForTokenClassification model on the token tensor.

Code:

I tried transposing the input so it is [256, 34546] instead of [34546, 256], but that did not work either. Instead I got the error “RuntimeError: The expanded size of the tensor (34546) must match the existing size (256) at non-singleton dimension 1. Target sizes: [256, 34546]. Tensor sizes: [1, 256].”

Any help would be appreciated, thanks

cantbelieveimshook · November 27, 2022, 9:50pm

Update: I have figured out the issue. I had loaded a model pretrained on a different dataset from the default one, and I think that gave me a model with a smaller vocabulary, so some of the values in my train_token_ids tensor were larger than the largest token id the model accepts. If you’re experiencing this issue and scouring other threads hasn’t helped, check your model’s vocab size if possible and compare it against the largest token id in your input.
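For anyone who wants to run that check, here is a minimal sketch, not the exact code from this thread. It assumes the token ids are available as (possibly nested) Python lists, for example from train_token_ids.tolist(). The helper names flatten and out_of_range_ids are hypothetical, and the vocab size of 30000 is a toy value chosen for illustration.

```python
def flatten(ids):
    """Yield every integer from an arbitrarily nested list of token ids."""
    for item in ids:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def out_of_range_ids(token_ids, vocab_size):
    """Return the sorted, de-duplicated ids that fall outside [0, vocab_size)."""
    return sorted({i for i in flatten(token_ids) if not 0 <= i < vocab_size})


# Toy data standing in for train_token_ids.tolist().
# The vocab size here is hypothetical; with a real model, read it from
# model.config.vocab_size instead of hard-coding it.
sample_ids = [[2, 15, 29999], [3, 30000, 31042]]
bad_ids = out_of_range_ids(sample_ids, vocab_size=30000)
print(bad_ids)  # [30000, 31042]
```

With real data the call would look something like out_of_range_ids(train_token_ids.tolist(), model.config.vocab_size). An empty result suggests the vocabulary is not the cause. A non-empty one usually means the tokenizer and the model came from different checkpoints, so loading both from the same checkpoint name is the first thing to try.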