I have defined my model in the following way:
import torch
from torch import nn
from torch.nn import CrossEntropyLoss
from transformers.modeling_outputs import SequenceClassifierOutput


class Net(nn.Module):
    def __init__(self, pre_classifier_init, classifier_init):
        super(Net, self).__init__()
        self.num_labels = 2
        self.pre_classifier = nn.Linear(768, 768)
        self.classifier = nn.Linear(768, self.num_labels)
        self.dropout = nn.Dropout(0.1)
        # Copy the weights of the two linear layers from the layers passed in
        self.pre_classifier.weight.data.copy_(pre_classifier_init.weight.data)
        self.classifier.weight.data.copy_(classifier_init.weight.data)

    def forward(self, x, labels=None):
        x = self.pre_classifier(x)
        x = nn.ReLU()(x)
        x = self.dropout(x)
        logits = self.classifier(x)
        loss = None
        if labels is not None:
            loss_fct = CrossEntropyLoss()
            loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
        return SequenceClassifierOutput(
            loss=loss,
            logits=logits,
        )


net = Net(model.pre_classifier, model.classifier1)
net.eval()
As you can see, I pass parameters to the constructor to set the weights of the two linear layers each time I instantiate the model, and I put the model in evaluation mode to disable the dropout layer. With this setup I would expect no randomness to be involved and my results to be deterministic. However, the performance of the model varies each time I instantiate it, so there must still be some randomness going on.
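For reference, this is a quick check I could run to see whether two freshly constructed instances really end up with identical parameters (my own illustrative sketch, reusing model.pre_classifier and model.classifier1 from above):

# Illustrative check: do two fresh instances share exactly the same parameters?
net_a = Net(model.pre_classifier, model.classifier1)
net_b = Net(model.pre_classifier, model.classifier1)

for (name, p_a), (_, p_b) in zip(net_a.named_parameters(), net_b.named_parameters()):
    # Prints False for any parameter tensor that differs between the two instances
    print(name, torch.equal(p_a, p_b))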
I confirmed this by running this code beforehand:
seed = 0
torch.manual_seed(seed)
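For completeness, a broader seeding setup would look like the following (these are standard Python/NumPy/PyTorch calls; I only actually use torch.manual_seed above, so the rest is just for context):

import random
import numpy as np
import torch

seed = 0
random.seed(seed)                 # Python's built-in RNG
np.random.seed(seed)              # NumPy RNG
torch.manual_seed(seed)           # PyTorch CPU RNG
torch.cuda.manual_seed_all(seed)  # PyTorch RNGs on all CUDA devices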
So where is this randomness coming from?