Hi, I am a beginner with PyTorch, and the following code confuses me a lot.
Instead of using two separate sub-models, each with its own optimizer/scheduler, I want to wrap them in an ensemble of the two sub-models, so that I only need to write mymodel.train()/eval(), mymodel.cuda()/cpu(), or torch.save(mymodel, 'mymodel.pt') once, rather than two copies of each call for the two sub-models. Besides, I then only need one optimizer/scheduler. Here is the demo code:
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, subnet1, subnet2):
        super(MyModel, self).__init__()
        # assigning the nn.Module sub-models registers their parameters
        self.subnet1 = subnet1
        self.subnet2 = subnet2

    def loss(self, sample):
        # calculate the loss using self.subnet1 and self.subnet2
        ...  # loss computation elided
        return loss_tensor

mymodel = MyModel(net1, net2)
optimizer = torch.optim.Adam(mymodel.parameters(), lr=LR)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=step_size, gamma=gamma)
There is no forward method in the MyModel class, since subnet1 and subnet2 are nn.Module instances with their own forward methods, and I use loss.backward() together with the optimizer/scheduler to train my ensemble.
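Concretely, my training loop looks roughly like this (a simplified sketch: train_loader and num_epochs are placeholders, and the actual loss computation is elided as above):

mymodel.cuda()
mymodel.train()

for epoch in range(num_epochs):
    for sample in train_loader:
        optimizer.zero_grad()                # clear stale gradients
        loss_tensor = mymodel.loss(sample)   # loss computed from both sub-models
        loss_tensor.backward()               # backprop through subnet1 and subnet2
        optimizer.step()                     # single Adam step over all parameters
    scheduler.step()                         # StepLR: decay the LR every step_size epochs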
BUT, I found the accuracy is ~8% lower than the separate two-sub-model version under identical conditions (the latter using two optimizers/schedulers). Is there something I've missed, or is the degradation just bad luck?
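For reference, the separate version I am comparing against sets up optimization per sub-model, roughly like this (a sketch assuming both sub-models use the same hyperparameters):

# one optimizer/scheduler per sub-model in the baseline version
optimizer1 = torch.optim.Adam(net1.parameters(), lr=LR)
optimizer2 = torch.optim.Adam(net2.parameters(), lr=LR)
scheduler1 = torch.optim.lr_scheduler.StepLR(optimizer1, step_size=step_size, gamma=gamma)
scheduler2 = torch.optim.lr_scheduler.StepLR(optimizer2, step_size=step_size, gamma=gamma)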
Furthermore, are there any other recommended ways to integrate my sub-models?
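For instance, I wonder (just a guess on my part, not something I have verified) whether giving each sub-model its own parameter group inside a single optimizer would behave more like the two-optimizer version:

# hypothetical alternative: one optimizer, but separate parameter groups,
# so each sub-model could keep its own learning rate if needed
optimizer = torch.optim.Adam([
    {'params': net1.parameters(), 'lr': LR},
    {'params': net2.parameters(), 'lr': LR},
])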