I’m somewhat new to PyTorch, so apologies if I’m missing something obvious here.
I am trying to create two models from the same neural network template, but with different initial inputs so that they diverge. With my code, however, I find that the state_dicts of the two models are still identical after the initialization step.
Parts of my code are abbreviated, but I can elaborate further if needed.
import torch.optim as optim
from torch.autograd import Variable

# Creating the two models and their optimizers
model1 = CrystalGraphConvNet(orig_atom_fea_len, nbr_fea_len, ...)
model2 = CrystalGraphConvNet(orig_atom_fea_len, nbr_fea_len, ...)

optimizer1 = optim.SGD(model1.parameters(), args.lr,
                       momentum=args.momentum,
                       weight_decay=args.weight_decay)
optimizer2 = optim.SGD(model2.parameters(), args.lr,
                       momentum=args.momentum,
                       weight_decay=args.weight_decay)
def train(train_loader, epoch):
    # input1, target1 and input2, target2 come from zipping pairs of train_loader rows
    init_input1 = (Variable(input1[0]), Variable(input1[1]), input1[2], input1[3])
    init_input2 = (Variable(input2[0]), Variable(input2[1]), input2[2], input2[3])
    target_normed1 = normalizer.norm(target1)
    target_normed2 = normalizer.norm(target2)
    init_target1 = Variable(target_normed1)
    init_target2 = Variable(target_normed2)

    # Initialization step
    optimizer1.zero_grad()
    output1 = model1(*init_input1)
    loss1 = criterion(output1, init_target1)
    loss1.backward()
    optimizer1.step()
    init_state1 = model1.state_dict()

    optimizer2.zero_grad()
    output2 = model2(*init_input2)
    loss2 = criterion(output2, init_target2)
    loss2.backward()
    optimizer2.step()
    init_state2 = model2.state_dict()
Despite calling optimizer.step() to apply the gradient updates permanently, init_state1 and init_state2 are identical afterwards.
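For what it's worth, the comparison I have in mind is roughly the following; state_dicts_equal is just an illustrative helper I'm sketching here, not something from my actual script:

import torch

# Hypothetical helper: two state_dicts count as "the same" if they have the
# same keys and every corresponding tensor is element-wise equal
def state_dicts_equal(sd1, sd2):
    if sd1.keys() != sd2.keys():
        return False
    return all(torch.equal(sd1[k], sd2[k]) for k in sd1)

print(state_dicts_equal(init_state1, init_state2))  # prints True, which is the problem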
I've tested this with random pairs of initial inputs, so there shouldn't be an issue with how the inputs are paired. I've also tried "model2 = copy.deepcopy(model1)" to no avail, so I don't think there's an aliasing issue either.
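In case it matters, the deepcopy variant I tried looks roughly like this (just a sketch of the setup, reusing the same names as above):

import copy

# Start model2 as an explicit deep copy of model1 so the two objects
# definitely do not share parameter tensors
model2 = copy.deepcopy(model1)
optimizer2 = optim.SGD(model2.parameters(), args.lr,
                       momentum=args.momentum,
                       weight_decay=args.weight_decay)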