I have a model with an nn.TransformerEncoderLayer() followed by some LSTM layers. Whenever I save the model, load it in another file, and test it, the accuracy I get depends on how the input data is shuffled.
If I train the model without that layer, it saves, loads, and evaluates with the same accuracy no matter the order.
Will the TransformerEncoderLayer always give me different accuracy like this? I even printed out the weights of that layer, and they all saved and loaded correctly.
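In case it helps, here is roughly the save/load/test round-trip I'm describing, reduced to a toy model (the architecture, sizes, and names below are made up for illustration, not my real code). One thing I noticed while making this repro: TransformerEncoderLayer has dropout by default, so calling eval() after loading makes two passes over the same batch match exactly, while in train() mode they differ:

```python
import io
import torch
import torch.nn as nn

# Illustrative stand-in for my model: a TransformerEncoderLayer
# followed by an LSTM (hypothetical names/sizes, not the real code).
class Net(nn.Module):
    def __init__(self, d_model=16, nhead=4, hidden=8):
        super().__init__()
        self.encoder = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True
        )
        self.lstm = nn.LSTM(d_model, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, 2)

    def forward(self, x):
        x = self.encoder(x)           # (batch, seq, d_model)
        x, _ = self.lstm(x)           # (batch, seq, hidden)
        return self.fc(x[:, -1])      # last time step -> logits

torch.manual_seed(0)
model = Net()

# Save the state_dict (in-memory buffer stands in for the checkpoint file).
buf = io.BytesIO()
torch.save(model.state_dict(), buf)
buf.seek(0)

# "Another file": rebuild the model and load the saved weights.
model2 = Net()
model2.load_state_dict(torch.load(buf))
model2.eval()  # disables the dropout inside TransformerEncoderLayer

x = torch.randn(4, 10, 16)
with torch.no_grad():
    out_a = model2(x)
    out_b = model2(x)

# In eval mode the layer is deterministic: two passes agree exactly.
print(torch.equal(out_a, out_b))  # True

# In train mode, the active dropout makes repeated passes differ.
model2.train()
out_c = model2(x)
out_d = model2(x)
print(torch.equal(out_c, out_d))  # almost surely False
```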