This is my code to set the seed values right after the imports:
def seed_everything(seed):
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # if you are using multi-GPU.
    np.random.seed(seed)  # Numpy module.
    random.seed(seed)  # Python random module.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    torch.use_deterministic_algorithms(True)

seed_everything(42)
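As a sanity check on the seeding itself, here is a minimal sketch that reseeds and compares the values drawn from each RNG, plus a dropout mask. It runs on the CPU only, so it says nothing about the MPS device; the helper names `seed_cpu_rngs` and `draw_samples` are made up for this illustration.

```python
import random

import numpy as np
import torch


def seed_cpu_rngs(seed: int) -> None:
    """Seed the Python, NumPy and PyTorch RNGs (hypothetical helper for this check)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


def draw_samples():
    """Draw one value from each RNG plus a dropout mask, all on the CPU."""
    dropout = torch.nn.Dropout(p=0.1)  # new modules start in training mode, so dropout is active
    return (
        random.random(),
        float(np.random.rand()),
        torch.rand(3),
        dropout(torch.ones(8)),
    )


seed_cpu_rngs(42)
first = draw_samples()
seed_cpu_rngs(42)
second = draw_samples()

# All four draws should match if reseeding fully resets the RNG streams.
same = (
    first[0] == second[0]
    and first[1] == second[1]
    and torch.equal(first[2], second[2])
    and torch.equal(first[3], second[3])
)
print("RNG streams reproducible on CPU:", same)
```

If this check passes but training still diverges, the difference is more likely to come from data ordering or from the device kernels than from the seeds.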
My environment:
PyTorch version: 2.0
OS: macOS (Apple M1)
Device: MPS
This is how I define my model:
num_labels = 3
hidden_size = 768
intermediate_size = 800

class BertEncoder(nn.Module):
    def __init__(self):
        super(BertEncoder, self).__init__()
        self.encoder = BertModel.from_pretrained('bert-base-multilingual-uncased')

    def forward(self, x, mask=None):
        outputs = self.encoder(x, attention_mask=mask)
        feat = outputs[0][:, 0, :]
        return feat

class BertClassifier(nn.Module):
    def __init__(self, dropout=0.1):
        super(BertClassifier, self).__init__()
        self.dropout = nn.Dropout(p=dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)
        # self.softmax = nn.Softmax(dim=1)
        # self.apply(self.init_bert_weights)

    def forward(self, x):
        x = self.dropout(x)
        out = self.classifier(x)
        # out = self.softmax(x)
        return out

    def init_bert_weights(self, module):
        """ Initialize the weights.
        """
        if isinstance(module, (nn.Linear, nn.Embedding)):
            module.weight.data.normal_(mean=0.0, std=0.02)
        if isinstance(module, nn.Linear) and module.bias is not None:
            module.bias.data.zero_()

src_encoder = BertEncoder()
src_classifier = BertClassifier()
src_encoder = src_encoder.to(device)
src_classifier = src_classifier.to(device)
I’m training on the GPU (the MPS device), and I have also set num_workers = 0 in the training and validation dataloaders. Despite all of this, I still cannot reproduce my training losses and F1 scores.
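Since the dataloaders are one place where run-to-run differences can creep in, here is a minimal sketch of pinning the shuffle order with an explicitly seeded `torch.Generator`. The toy `TensorDataset` and the helper names `make_loader` and `first_epoch_order` are stand-ins for illustration, not my real data pipeline.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(seed: int) -> DataLoader:
    """Build a shuffled loader whose sample order depends only on `seed`."""
    dataset = TensorDataset(torch.arange(20))  # toy stand-in for the real dataset
    generator = torch.Generator()
    generator.manual_seed(seed)
    return DataLoader(
        dataset,
        batch_size=4,
        shuffle=True,
        num_workers=0,
        generator=generator,  # the shuffling sampler draws its permutation from this generator
    )


def first_epoch_order(loader: DataLoader) -> list:
    """Flatten the first epoch into a list of sample indices."""
    return [int(i) for (batch,) in loader for i in batch]


order_a = first_epoch_order(make_loader(42))
order_b = first_epoch_order(make_loader(42))
print("same shuffle order:", order_a == order_b)
```

Passing the generator explicitly makes the batch order independent of how many random numbers were consumed from the global RNG before the loader was iterated.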
My training output on the 1st run:
Epoch: 0/3
Epoch [00/03] Step [000/127]: cls_loss=1.0705
Epoch [00/03] Step [005/127]: cls_loss=0.9886
Epoch [00/03] Step [010/127]: cls_loss=0.8697
Epoch [00/03] Step [015/127]: cls_loss=1.1442
Epoch [00/03] Step [020/127]: cls_loss=0.9821
Epoch [00/03] Step [025/127]: cls_loss=0.8301
Epoch [00/03] Step [030/127]: cls_loss=0.9174
Epoch [00/03] Step [035/127]: cls_loss=0.8881
Epoch [00/03] Step [040/127]: cls_loss=0.7934
Epoch [00/03] Step [045/127]: cls_loss=1.0184
Epoch [00/03] Step [050/127]: cls_loss=1.0952
Epoch [00/03] Step [055/127]: cls_loss=0.9670
Epoch [00/03] Step [060/127]: cls_loss=0.8665
Epoch [00/03] Step [065/127]: cls_loss=0.7878
Epoch [00/03] Step [070/127]: cls_loss=0.6154
Epoch [00/03] Step [075/127]: cls_loss=0.8608
Epoch [00/03] Step [080/127]: cls_loss=0.7064
Epoch [00/03] Step [085/127]: cls_loss=0.7867
Epoch [00/03] Step [090/127]: cls_loss=0.7772
Epoch [00/03] Step [095/127]: cls_loss=0.6452
Epoch [00/03] Step [100/127]: cls_loss=0.5981
Epoch [00/03] Step [105/127]: cls_loss=0.7518
Epoch [00/03] Step [110/127]: cls_loss=0.7248
Epoch [00/03] Step [115/127]: cls_loss=1.0563
Epoch [00/03] Step [120/127]: cls_loss=0.7010
Epoch [00/03] Step [125/127]: cls_loss=0.7213
At the end of Epoch: 0
Validation loss: 0.6659477949142456
Accuracy: 0.6971046770601337
F1 score (Macro): 0.6691835627250752
F1 score (Per class): [0.61 0.62931034 0.76824034]
And the training output on the 2nd run:
Epoch: 0/3
Epoch [00/03] Step [000/127]: cls_loss=1.0752
Epoch [00/03] Step [005/127]: cls_loss=0.9756
Epoch [00/03] Step [010/127]: cls_loss=0.9635
Epoch [00/03] Step [015/127]: cls_loss=1.1132
Epoch [00/03] Step [020/127]: cls_loss=0.9640
Epoch [00/03] Step [025/127]: cls_loss=0.9263
Epoch [00/03] Step [030/127]: cls_loss=0.9199
Epoch [00/03] Step [035/127]: cls_loss=0.9258
Epoch [00/03] Step [040/127]: cls_loss=0.9136
Epoch [00/03] Step [045/127]: cls_loss=1.1773
Epoch [00/03] Step [050/127]: cls_loss=1.2147
Epoch [00/03] Step [055/127]: cls_loss=1.0307
Epoch [00/03] Step [060/127]: cls_loss=0.9063
Epoch [00/03] Step [065/127]: cls_loss=0.7165
Epoch [00/03] Step [070/127]: cls_loss=0.7686
Epoch [00/03] Step [075/127]: cls_loss=0.9018
Epoch [00/03] Step [080/127]: cls_loss=0.7115
Epoch [00/03] Step [085/127]: cls_loss=0.8505
Epoch [00/03] Step [090/127]: cls_loss=0.7284
Epoch [00/03] Step [095/127]: cls_loss=0.6582
Epoch [00/03] Step [100/127]: cls_loss=0.6921
Epoch [00/03] Step [105/127]: cls_loss=0.8489
Epoch [00/03] Step [110/127]: cls_loss=0.7658
Epoch [00/03] Step [115/127]: cls_loss=0.9741
Epoch [00/03] Step [120/127]: cls_loss=0.9331
Epoch [00/03] Step [125/127]: cls_loss=0.7483
At the end of Epoch: 0
Validation loss: 0.7412455081939697
Accuracy: 0.6859688195991092
F1 score (Macro): 0.6511615399484937
F1 score (Per class): [0.55681818 0.63934426 0.75732218]
As you can see, the losses and F1 scores differ between the two runs, starting from the very first logged step. Why am I not able to reproduce my results? Is it because of the M1 GPU or the MPS device? I installed the latest version of PyTorch (2.0) hoping that would help, but I still cannot get the same results in subsequent training runs. What is the reason for this problem? Am I missing something? Please help. Thanks!
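The step-0 loss already differs between the two runs (1.0705 vs 1.0752). Assuming that loss is computed before any optimizer update takes effect, the difference would have to come from the first batch, the initial weights, or the dropout mask. Here is a rough sketch of a way to compare those across runs by hashing tensors; `fingerprint` and `build_head` are hypothetical helpers, and the small linear layer only stands in for the classifier head.

```python
import hashlib

import torch
from torch import nn


def fingerprint(tensors) -> str:
    """Hash the raw bytes of an iterable of tensors (copied to the CPU first)."""
    digest = hashlib.sha256()
    for t in tensors:
        digest.update(t.detach().cpu().contiguous().numpy().tobytes())
    return digest.hexdigest()[:16]


def build_head(seed: int) -> nn.Module:
    """Create a freshly initialised linear head after seeding (stand-in for the classifier)."""
    torch.manual_seed(seed)
    return nn.Linear(768, 3)


head_a = build_head(42)
head_b = build_head(42)
head_c = build_head(43)
batch = torch.arange(12).reshape(3, 4)  # toy stand-in for the first batch of input ids

# Print these in every run and compare the strings between runs.
print("initial params:", fingerprint(head_a.parameters()))
print("first batch:   ", fingerprint([batch]))
```

If the fingerprints of the first batch and the initial parameters match between runs while the step-0 loss does not, that would point at dropout or at the forward pass on the device rather than at the data or the initialisation.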