How to take the features of net1 as the input of net2

My code is based on PyTorch 0.3.
I want to take the features of net1 (for example, the output of the second-to-last layer) and feed them to net2.
Net1 and net2 should be trained simultaneously, and of course the gradients from net2 should influence net1. (I modified the architecture of net2.)
Here is a toy example:

import torch
from torch.autograd import Variable
import torch.nn as nn

class my_net(nn.Module):
    def __init__(self):
        super(my_net, self).__init__()
        self.features = torch.nn.Sequential(
                            torch.nn.Conv2d(3, 5, 3, padding=1),
                            torch.nn.ReLU(),
                            torch.nn.Conv2d(5, 10, 3, padding=1),
                            # Take the output of this layer as the input of net2
                            torch.nn.ReLU(),
                            torch.nn.Conv2d(10, 15, 3, padding=1)
                            )
    def forward(self, x):
        x = self.features(x)
        return x

class my_net2(nn.Module):
    def __init__(self):
        super(my_net2, self).__init__()
        self.features = torch.nn.Sequential(
                            torch.nn.Conv2d(10, 25, 3, padding=1)
                            )
    def forward(self, x):
        x = self.features(x)
        return x

net1 = my_net().cuda()
net2 = my_net2().cuda()


input1 = Variable(torch.ones(1, 3, 10, 10).cuda())
out1 = net1(input1)

# input of net2 are features got from net1
input2 = ????????????
out2 = net2(input2)

# Create targets
target1 = Variable(torch.ones_like(out1.data))
target2 = Variable((torch.ones(1, 25, 10, 10) * 2).cuda())


criterion = torch.nn.MSELoss(size_average=False)
loss1 = criterion(out1, target1)

# Or should I use another instance of MSELoss?
loss2 = criterion(out2, target2)

loss = loss1+loss2
loss.backward()
class my_net(nn.Module):
    def __init__(self):
        super(my_net, self).__init__()
        self.features1 = torch.nn.Sequential(
                            torch.nn.Conv2d(3, 5, 3, padding=1),
                            torch.nn.ReLU(),
                            torch.nn.Conv2d(5, 10, 3, padding=1),
                            )
        # Take the output of features1 as the input of net2
        self.features2 = torch.nn.Sequential(
                            torch.nn.ReLU(),
                            torch.nn.Conv2d(10, 15, 3, padding=1)
                            )
    def forward(self, x):
        x1 = self.features1(x)
        x2 = self.features2(x1)
        return (x1,x2)

This should be enough, in my opinion.

I am sorry for the confusion. I have updated my code. Thank you for your reply.

You could just use out1 and feed it into net2.
Alternatively, @Naman-ntc’s suggestion would also work (creating a new model with both models inside).
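
A minimal sketch of that wrapper idea, assuming the two-output my_net from the post above (the class name CombinedNet is only illustrative, not part of the existing code):

import torch.nn as nn

class CombinedNet(nn.Module):
    # Hypothetical wrapper: holds both sub-nets so a single forward pass
    # produces everything needed for both losses.
    def __init__(self, net1, net2):
        super(CombinedNet, self).__init__()
        self.net1 = net1
        self.net2 = net2

    def forward(self, x):
        feat, out1 = self.net1(x)  # net1 returns (intermediate features, final output)
        out2 = self.net2(feat)     # the intermediate features feed net2
        return out1, out2

With this wrapper, one call gives both outputs, e.g. out1, out2 = CombinedNet(net1, net2)(input1), and the gradients of both losses still flow back into net1.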

I think you misread the question?
The features just before out1 are required for net2, not out1 itself!

Complete code:

import torch
from torch.autograd import Variable
import torch.nn as nn

class my_net(nn.Module):
    def __init__(self):
        super(my_net, self).__init__()
        self.features1 = torch.nn.Sequential(
                            torch.nn.Conv2d(3, 5, 3, padding=1),
                            torch.nn.ReLU(),
                            torch.nn.Conv2d(5, 10, 3, padding=1),
                            )
        # Take the output of features1 as the input of net2
        self.features2 = torch.nn.Sequential(
                            torch.nn.ReLU(),
                            torch.nn.Conv2d(10, 15, 3, padding=1)
                            )
    def forward(self, x):
        x1 = self.features1(x)
        x2 = self.features2(x1)
        return (x1,x2)

class my_net2(nn.Module):
    def __init__(self):
        super(my_net2, self).__init__()
        self.features = torch.nn.Sequential(
                            torch.nn.Conv2d(10, 25, 3, padding=1)
                            )
    def forward(self, x):
        x = self.features(x)
        return x

net1 = my_net().cuda()
net2 = my_net2().cuda()


input1 = Variable(torch.ones(1, 3, 10, 10).cuda())
feat_for_net2, out1 = net1(input1)

# input of net2 are features got from net1
input2 = feat_for_net2
out2 = net2(input2)

# Create targets
target1 = Variable(torch.ones_like(out1.data))
target2 = Variable((torch.ones(1, 25, 10, 10) * 2).cuda())


criterion = torch.nn.MSELoss(size_average=False)
loss1 = criterion(out1, target1)

# Or should I use another instance of MSELoss?
loss2 = criterion(out2, target2)

loss = loss1+loss2
loss.backward()
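
To actually train both nets simultaneously, you still need optimizers over both parameter sets. A minimal sketch (the choice of optim.SGD and the learning rate are just placeholders, not part of the code above):

import torch.optim as optim

# One optimizer per net (a single optimizer over both parameter lists would also work)
optimizer1 = optim.SGD(net1.parameters(), lr=0.01)
optimizer2 = optim.SGD(net2.parameters(), lr=0.01)

# per iteration:
optimizer1.zero_grad()
optimizer2.zero_grad()
# ... forward passes and loss = loss1 + loss2 as above ...
loss.backward()       # gradients from loss2 also reach net1 through feat_for_net2
optimizer1.step()
optimizer2.step()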

If @ptrblck has an easier approach, that would be great :slight_smile:

Yes, I misunderstood your question, because I thought you wanted to use self.features. :wink:
I think your approach is fine. Since you need out1 for loss1, that would be the straightforward approach.

I am still looking for a better way, as I may need to select the outputs of multiple layers in net1. I wonder whether this can be implemented more cleanly.

If you need to have more flexibility, you could have a look at forward hooks in this post.
Since you need the gradients, make sure not to detach the tensor.
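
For reference, a minimal forward-hook sketch, assuming the original single-Sequential my_net from the first post (so net1.features exists); the activations dict, the save_activation helper, and the layer index are only illustrative:

activations = {}

def save_activation(name):
    def hook(module, inp, output):
        # store the output as-is; do NOT detach it, since the gradients are needed
        activations[name] = output
    return hook

# the third module in self.features is the Conv2d(5, 10, ...) whose output should feed net2
target_layer = list(net1.features.children())[2]
handle = target_layer.register_forward_hook(save_activation('conv2'))

out1 = net1(input1)
out2 = net2(activations['conv2'])  # the hooked features feed net2

handle.remove()  # remove the hook once it is no longer needed

The same pattern extends to multiple layers: register one hook per layer of interest and key the activations dict accordingly.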


Pretty cool, I needed something like this too!
Thanks