Can I train a model if it is called inside another model's forward?

Hi, I have a model that is called inside another model's forward, as follows.


import torch
import torch.nn as nn
import torchvision

model2 = torchvision.models.video.r3d_18(pretrained=False).to('cuda')

class model1(torch.nn.Module):
    def __init__(self):
        super(model1, self).__init__()
        # head mapping the 400-dim r3d_18 output down to a single score
        self.ReLU = nn.Sequential(nn.Linear(400, 128), nn.ReLU())
        self.QoS = nn.Linear(128, 1)

    def forward(self, pixel):
        pixel = model2(pixel.cuda())  # model2 is a global, not a registered submodule
        pixel = self.ReLU(pixel)
        pixel = self.QoS(pixel)
        return pixel

Now I'm training model1 with loss.backward().
However, is model2, which is called inside model1's forward, also trained?
If not, how can I train model2?

My training code for model1 is as follows.

model = model1().to(device)  # instantiate model1 before the loop

for idx, (raw_data, len_frame, true_mos) in enumerate(train_module):
    raw_data = raw_data.to(device).float()
    true_mos = true_mos.to(device).float()
    optimizer.zero_grad()
    outputs = model(raw_data)
    loss = criterion(outputs, true_mos)
    loss.backward()
    optimizer.step()

Thanks.

I had a similar case. When I checked the loss value of the inner model's output after each epoch, it was different (decreasing). This means the weights were changing, which implies the model was training. So, as long as you don't call detach(), gradients flow back into model2 and loss.backward() populates its .grad fields. One caveat: because model2 is a global object rather than a registered submodule of model1, it is not included in model1's .parameters(), so optimizer.step() will only update it if you also passed model2.parameters() to the optimizer. To verify, you can also return the output of model2() and check the loss value (or the tensors themselves) after each epoch. Something like the following:

class model1(torch.nn.Module):
    def __init__(self):
        super(model1, self).__init__()
        self.ReLU = nn.Sequential(nn.Linear(400, 128), nn.ReLU())
        self.QoS = nn.Linear(128, 1)

    def forward(self, pixel):
        model2_out = model2(pixel.cuda())
        pixel = self.ReLU(model2_out)
        pixel = self.QoS(pixel)
        return model2_out, pixel

model = model1().to(device)

for idx, (raw_data, len_frame, true_mos) in enumerate(train_module):
    raw_data = raw_data.to(device).float()
    true_mos = true_mos.to(device).float()
    optimizer.zero_grad()
    model2_out, outputs = model(raw_data)
    if idx == 0:
        print("model2_out: ", model2_out)
    loss = criterion(outputs, true_mos)
    loss.backward()
    optimizer.step()

But if you have ground-truth data for model2's output, you can check a loss value instead of the raw tensor values. Hope this helps.
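To make sure optimizer.step() actually updates model2, its parameters have to be given to the optimizer. A minimal sketch, assuming model is the model1 instance from the snippet above (the choice of Adam and the learning rate are placeholders, not from the original post):

import itertools

# one optimizer that updates both the model1 head and the model2 backbone
optimizer = torch.optim.Adam(
    itertools.chain(model.parameters(), model2.parameters()),
    lr=1e-4,  # placeholder value
)

Alternatively, assigning model2 as an attribute inside model1.__init__ (e.g. self.backbone = model2) registers it as a submodule, so model.parameters() then includes its parameters automatically.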

I think so. And it's easily verifiable: you can proceed as @Ali1234 says, but it can also be done without computing an intermediate loss.

For example, you can check the following quantity before and after optimizer.step():

norm_check = sum([p.norm(p=2).item() ** 2 for p in model2.parameters()]) ** 0.5

Or, after loss.backward(), you can check the gradient of one of the parameters of model2:

next(model2.parameters()).grad
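Put together inside the training loop, the check could look like this (a sketch; loss, optimizer, and both models are assumed to be set up as in the earlier snippets):

loss.backward()

# if gradients flow into model2, .grad is a tensor (not None) with non-zero entries
print(next(model2.parameters()).grad)

norm_before = sum([p.norm(p=2).item() ** 2 for p in model2.parameters()]) ** 0.5
optimizer.step()
norm_after = sum([p.norm(p=2).item() ** 2 for p in model2.parameters()]) ** 0.5

# the two norms differ only if optimizer.step() actually updated model2
print(norm_before, norm_after)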

There are certainly more efficient ways to do it…
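For instance, torch.nn.utils.parameters_to_vector flattens all parameters into one tensor, so the same quantity can be computed in a single line (one possible variant, not from the original reply):

from torch.nn.utils import parameters_to_vector

# 2-norm over all of model2's parameters at once
norm_check = parameters_to_vector(model2.parameters()).norm(p=2).item()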
