Backward() and train a part of a model

Hi,

I’m trying to train some part of a model like below:

The input is high dimensional. I'm using a simple NN on a small partition of the data (f0) to reduce its dimension, and then concatenating the result with the rest of the data (f1), to which nothing has been applied. I then pass the concatenated tensor to a frozen model M (a pretrained Keras model), get the binary output, and calculate the loss. I just want to train the NN and call backward on its parameters. Is this possible?

The workflow would be possible in PyTorch.
However, it seems you would like to mix PyTorch with a pretrained Keras model afterwards, which won't work (at least I'm not aware of such a workflow or of anyone who has actually used it before).

Would it be possible to port the Keras parameters to a PyTorch model?

Thank you for the response.

Some layers in my Keras model are custom defined, and the existing conversion libraries don't support custom layers. But how would I do this if M were a pretrained PyTorch model?

Also, is this impossible because the output of the Keras model is a numpy array and therefore doesn't keep the grad_fn and requires_grad attributes?

A dummy approach would be:

import torch
import torch.nn as nn
import torchvision.models as models


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # small trainable NN for the f0 partition
        self.fc = nn.Linear(4, 2)
        # pretrained base model M; freeze it so only self.fc is trained
        self.base = models.resnet50()
        for param in self.base.parameters():
            param.requires_grad = False

    def forward(self, x):
        # split x into x0 (f0) and x1 (f1)
        x0, x1 = x[:, :4], x[:, 4:]
        x0 = self.fc(x0)
        # concatenate the reduced f0 with the untouched f1
        x = torch.cat((x0, x1), 1)
        # reshape to fit the resnet50 input shape
        x = x.view(x.size(0), 3, 224, 224)
        x = self.base(x)
        return x


model = MyModel()
x = torch.randn(1, 3*224*224 + 2)
output = model(x)

Note that I've used resnet50, which expects image tensors as input, so I had to reshape the concatenated tensor to [batch_size, 3, 224, 224].
If you are dealing with a pretrained model that uses only linear layers, you wouldn't have to reshape it.
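
For completeness, here is a minimal sketch (continuing the snippet above) of a training step that updates only the small NN. The choice of loss, optimizer, and the dummy target are just assumptions for illustration:

criterion = nn.BCEWithLogitsLoss()
# pass only the small NN's parameters to the optimizer
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# dummy target matching the resnet50 output shape; a real binary setup
# would replace the classification head with a single output logit
target = torch.randint(0, 2, (1, 1000)).float()

optimizer.zero_grad()
output = model(x)
loss = criterion(output, target)
loss.backward()  # gradients reach self.fc; the base is frozen
optimizer.step()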

Yes, basically you would have to convert the concatenated PyTorch tensor to a numpy array, which will detach it from the computation graph, so you won't be able to calculate the gradients from the final loss.
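
As a quick illustration, here is a minimal standalone sketch of how the round trip through numpy drops the graph:

import torch

x = torch.randn(2, 2, requires_grad=True)
y = x * 2
print(y.grad_fn)  # <MulBackward0 ...>, still attached to the graph

# round-tripping through numpy drops the graph
z = torch.from_numpy(y.detach().numpy())
print(z.requires_grad)  # False, so backward() can't reach x anymore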

Thank you so much. This clarified a lot of things and helps me a lot!

You're welcome.
Let us know if you need help porting the code to train the model end-to-end :wink:

Do you mean porting the Keras model? Do you think it’s possible to port a trained Keras model (with custom layers) to PyTorch?

It depends on what kind of custom layers there are, but as long as there are equivalent PyTorch methods, it should work.
There might be some pitfalls, e.g. flipped kernels, but users in this forum have successfully ported models before.
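
As a rough idea, porting a Keras Dense layer mostly comes down to copying (and transposing) its weights into an nn.Linear. Here is a minimal sketch with random stand-in arrays in place of what the real keras_layer.get_weights() would return:

import numpy as np
import torch
import torch.nn as nn

# stand-ins for keras_dense.get_weights() output
kernel = np.random.randn(4, 2).astype(np.float32)  # Keras Dense kernel: (in_features, out_features)
bias = np.random.randn(2).astype(np.float32)

linear = nn.Linear(4, 2)
with torch.no_grad():
    # PyTorch stores the weight as (out_features, in_features), so transpose
    linear.weight.copy_(torch.from_numpy(kernel.T))
    linear.bias.copy_(torch.from_numpy(bias))

# conv layers need a permute instead: Keras (H, W, in, out) -> PyTorch (out, in, H, W)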

Sounds like a good project. I'll research a bit more (on this forum and the web) to see what I can do. Thanks for the help again. I'll reach out if I need help :slight_smile:
