Frozen Model is Affected by Optimizer

I have trained a CRNN model containing LSTM, batch norm, and convolutional layers. Using the trained model (loaded from disk), I'm trying to train the input of the model.

import torch
from torch import optim
from torch.nn import CTCLoss  # assuming torch.nn's CTC implementation is used here
image = torch.rand(1, 1, input_size[0], input_size[1]).cuda()
loss_function = CTCLoss().to(device)
optimizer = optim.Adam([image.requires_grad_()], lr=lr)

The image is saved after training. This saved image produces a different output with a freshly loaded model than with the model that was used to train it. By using a single image I found that the difference appears only if optimizer.step() is called.
Am I missing something here? Does the optimizer change the model even in this case? (I haven't included the model's parameters in the optimizer.)

Hi,

I don't quite follow what you are trying to achieve by "I'm trying to train the input of the model.". When optimizer.step() is called, the update computed from the accumulated gradients is applied to the image tensor, which changes its value. So I am guessing that your input (the image tensor) is getting changed, and not your model.

I'm trying to do something similar to this tutorial.
Basically, my intention is to find the optimal image expected by the model for a given label.
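
A rough sketch of the optimization loop, reusing the image, optimizer, and loss_function from the snippet above; model, label, and num_steps are placeholders, and the output shape (T, N, C) expected by CTCLoss is an assumption:

for step in range(num_steps):
    optimizer.zero_grad()
    log_probs = model(image).log_softmax(2)            # assumed output shape (T, N, C)
    input_lengths = torch.tensor([log_probs.size(0)])  # one sequence of length T
    target_lengths = torch.tensor([label.size(0)])     # one target transcription
    loss = loss_function(log_probs, label, input_lengths, target_lengths)
    loss.backward()                                    # gradient w.r.t. the image
    optimizer.step()                                   # updates only image; the model's weights are not in the optimizer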

" So I am guessing that your input ( image tensor) is getting changed and not your model."
This is exactly what I want. But it appears that it changes the model as well. That’s the problem I’m having.

Hi,

I understand your problem, but it's hard to tell from just three lines of code. You can debug your optimizer by checking optimizer.param_groups; it will give you all the parameters it is trying to optimize. Hope this helps.
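
For example, something along these lines (using the optimizer and model variables from above) lists every tensor the optimizer will update, so you can check that none of the model's weights are in it:

for group in optimizer.param_groups:
    for p in group['params']:
        print(p.shape, p.requires_grad)   # should print only the image's shape

# cross-check against the model's parameters by identity
model_param_ids = {id(p) for p in model.parameters()}
opt_param_ids = {id(p) for g in optimizer.param_groups for p in g['params']}
print(opt_param_ids & model_param_ids)    # empty set -> the model is not being optimized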

Thanks. I will try that and post the results if I find anything weird in there.

I figured out the issue. The CRNN model has two BatchNorm layers, and they behave differently in train and eval mode: in train mode they normalize with per-batch statistics and update their running mean and variance on every forward pass.
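
For anyone hitting the same thing, a minimal sketch of the fix (names as in the snippets above): put the model in eval mode so BatchNorm uses its fixed running statistics, and optionally freeze the weights, before optimizing the image:

model.eval()                        # BatchNorm/Dropout switch to inference behaviour;
                                    # running statistics are no longer updated
for p in model.parameters():
    p.requires_grad_(False)         # optional: no gradients accumulate in the weights

optimizer = optim.Adam([image.requires_grad_()], lr=lr)
# ...run the optimization loop as before; the model now stays unchanged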