TransformerEncoderLayer Neural Network model accuracy changes based on batch size

I have trained a neural network model with a TransformerEncoderLayer in it. I have the model saved, but when I load the model and evaluate it, the accuracy changes based on the batch size. Why is this?

Can you show a simple example (with some reproducible code) that illustrates this?

It’s just that any time I use nn.TransformerEncoderLayer in any way with a saved model, I get different results if the data is in a different order. Is there a way to save the encoder table? That would be in the MultiheadAttention part of the TransformerEncoderLayer, right?

I’m somewhat new to Transformers.

edit:
I’m just using TransformerEncoderLayer, saving the model, and then using np.random.permutation() to shuffle the input data. This always gives me different results unless I use the same order every time.
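
Roughly, the check I’m doing looks like this (the shapes and sizes here are made up, but the behavior is the same):

    import numpy as np
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    layer = nn.TransformerEncoderLayer(d_model=8, nhead=2)  # batch_first left at its default
    layer.eval()  # dropout is off, so this isn't randomness

    data = torch.randn(16, 10, 8)  # intended as (batch=16, seq=10, feature=8)

    def predict(x, batch_size=4):
        # evaluate in mini-batches, like at test time
        with torch.no_grad():
            return torch.cat([layer(x[i:i + batch_size]) for i in range(0, len(x), batch_size)])

    idx = torch.from_numpy(np.random.permutation(len(data)))
    out = predict(data)
    out_shuffled = predict(data[idx])[torch.argsort(idx)]  # shuffle, predict, restore order

    print(torch.allclose(out, out_shuffled))  # False for me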

I have this layer in my model like this:

    self.transformer = nn.TransformerEncoderLayer()

and I save the model like so:

    torch.save(model, path)

Does this not save the nn.TransformerEncoderLayer() or something?

Typically one saves the state_dict, per the instructions:

    torch.save(model.state_dict(), path)
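
For completeness, the round trip looks roughly like this (MyModel and path are placeholders for your own model class and file path):

    # sketch only: MyModel and path stand in for your own class and file path
    torch.save(model.state_dict(), path)

    # later, to load:
    model = MyModel()
    model.load_state_dict(torch.load(path))
    model.eval()  # important: otherwise dropout in the encoder layer is still active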

They both do the same thing; my way just means you don’t have to redefine the model. Sorry for the late response, I got sick.

It has something to do with nn.TransformerEncoderLayer(), because it works fine if I take this layer out.

Got it. What is the shape of your input, and are you using batch_first=True or not? One thing to make sure of is that you don’t have your batch and sequence dimensions mixed up in your implementation.

batch_first – If True, then the input and output tensors are provided as (batch, seq, feature). Default: False (seq, batch, feature).
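
A minimal sketch of what that mix-up looks like (the d_model/nhead sizes here are arbitrary): with batch_first=False the layer reads dim 0 as the sequence, so each sample’s output depends on which other samples share its batch, and splitting the same data into different batches changes the results.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x = torch.randn(6, 4, 8)  # meant as (batch=6, seq=4, feature=8)

    for batch_first in (False, True):
        layer = nn.TransformerEncoderLayer(d_model=8, nhead=2, batch_first=batch_first)
        layer.eval()
        with torch.no_grad():
            whole = layer(x)                                 # all six samples in one batch
            split = torch.cat([layer(x[:3]), layer(x[3:])])  # same samples, two batches of three
        print(batch_first, torch.allclose(whole, split))
    # batch_first=False -> False: dim 0 was read as the sequence, samples attended to each other
    # batch_first=True  -> True:  dim 0 is the batch, each sample is processed independently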

Thank you! This might be my problem.