My code is based on PyTorch 0.3.
I want to take features from net1 (for example, the output of its second-to-last layer) and feed them into net2.
net1 and net2 should be trained simultaneously, so the gradients from net2 must also flow back into net1. (I have modified the architecture of net2 accordingly.)
Here is a toy example:
import torch
import torch.nn as nn
from torch.autograd import Variable

class my_net(nn.Module):
    def __init__(self):
        super(my_net, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 5, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(5, 10, 3, padding=1),
            # Take the output of this layer as the input of net2
            nn.ReLU(),
            nn.Conv2d(10, 15, 3, padding=1)
        )

    def forward(self, x):
        x = self.features(x)
        return x
class my_net2(nn.Module):
    def __init__(self):
        super(my_net2, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(10, 25, 3, padding=1)
        )

    def forward(self, x):
        x = self.features(x)
        return x
net1 = my_net().cuda()
net2 = my_net2().cuda()

input1 = Variable(torch.ones(1, 3, 10, 10).cuda())
out1 = net1(input1)

# The input of net2 should be the features taken from net1
input2 = ????????????
out2 = net2(input2)
# Create targets
target1 = Variable(torch.ones_like(out1.data))
target2 = Variable(torch.ones(1, 25, 10, 10).cuda() * 2)  # must be on the same device as out2

criterion = nn.MSELoss(size_average=False)
loss1 = criterion(out1, target1)
# Or should I use another instance of MSELoss?
loss2 = criterion(out2, target2)

loss = loss1 + loss2
loss.backward()
Yes, I misunderstood your question, because I thought you wanted to use self.features.
I think your approach is fine. Since you need out1 for loss1 anyway, that is the straightforward approach.
If you need more flexibility, you could have a look at forward hooks in this post; a minimal sketch follows.
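Here is what such a hook could look like for the toy example above. The activations dict and the save_output name are just my own illustrative choices; the hooked layer is the Conv2d(5, 10, ...) whose output should feed net2:

activations = {}

def save_output(module, input, output):
    # Store the live Variable; calling output.detach() here would
    # cut the graph and block gradients from reaching net1
    activations['feat'] = output

# Hook the second conv layer (index 2 in the Sequential)
conv2 = list(net1.features.children())[2]
handle = conv2.register_forward_hook(save_output)

out1 = net1(input1)
input2 = activations['feat']  # shape: (1, 10, 10, 10)
handle.remove()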
Since you need the gradients, make sure not to detach the tensor.
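For completeness, here is one way everything could fit together. This is a sketch only: returning the intermediate activation from forward and driving both nets with a single optimizer are illustrative choices, not the only options:

import torch
import torch.nn as nn
from torch.autograd import Variable

class my_net(nn.Module):
    def __init__(self):
        super(my_net, self).__init__()
        self.conv1 = nn.Conv2d(3, 5, 3, padding=1)
        self.conv2 = nn.Conv2d(5, 10, 3, padding=1)
        self.conv3 = nn.Conv2d(10, 15, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.conv1(x))
        feat = self.conv2(x)               # features handed to net2
        out = self.conv3(self.relu(feat))
        return out, feat

class my_net2(nn.Module):
    def __init__(self):
        super(my_net2, self).__init__()
        self.features = nn.Sequential(nn.Conv2d(10, 25, 3, padding=1))

    def forward(self, x):
        return self.features(x)

net1 = my_net().cuda()
net2 = my_net2().cuda()
optimizer = torch.optim.SGD(
    list(net1.parameters()) + list(net2.parameters()), lr=1e-3)

input1 = Variable(torch.ones(1, 3, 10, 10).cuda())
out1, feat = net1(input1)
out2 = net2(feat)  # do NOT pass feat.detach(), or net1 gets no gradient from loss2

criterion = nn.MSELoss(size_average=False)
loss1 = criterion(out1, Variable(torch.ones_like(out1.data)))
loss2 = criterion(out2, Variable(torch.ones(1, 25, 10, 10).cuda() * 2))

optimizer.zero_grad()
(loss1 + loss2).backward()  # gradients from loss2 flow through feat into net1
optimizer.step()

Regarding your inline question: reusing one MSELoss instance for both losses is fine, since the module holds no state between calls; a second instance is not required.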