Is the backward pass determined from the forward pass alone?

Hi,

I have been using PyTorch for a while now, and one of the things I find attractive is that I can “bypass” a few layers by simply not passing the input through them in forward(). My understanding was that layers not included in the forward pass are automatically dropped from the computation graph and won’t be backpropagated through. Is that true? If so, why do these layers still show up when I do print(str(model)), i.e. why are layers initialised in __init__ still present?
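For reference, a minimal sketch of the pattern I mean (module and layer names are just examples): a layer is defined in __init__ but never called in forward():

```python
import torch
import torch.nn as nn

class SkipModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 10)
        self.fc2 = nn.Linear(10, 10)  # defined, but bypassed in forward()
        self.fc3 = nn.Linear(10, 2)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        # self.fc2 is intentionally skipped here
        return self.fc3(x)

model = SkipModel()
print(model)  # fc2 still shows up, since it was registered in __init__
```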

Thank you!

Basically, you are right. The “missing” operations won’t be tracked by Autograd, so the backward pass won’t be performed on them either.
However, these layers are still registered as submodules of the model, so they will show up in the print statement and their parameters will still appear in model.parameters().
Since PyTorch is a dynamic framework, the computation graph is re-created in each forward pass. A later forward pass could therefore use the left-out layers again, e.g. to train them only every second iteration.
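A quick way to check this, continuing the sketch above (still with the hypothetical layer names), is to run a backward pass and inspect the gradients: the skipped layer’s .grad stays None, while the used layers receive gradients. The conditional re-use could look like this:

```python
out = model(torch.randn(4, 10))
out.sum().backward()
print(model.fc1.weight.grad is None)  # False: fc1 was part of the graph
print(model.fc2.weight.grad is None)  # True: fc2 was skipped, no gradient

# Because the graph is rebuilt on every forward pass, a layer can be
# used conditionally, e.g. only in every second iteration:
class ConditionalModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 10)
        self.fc2 = nn.Linear(10, 10)

    def forward(self, x, use_fc2=False):
        x = torch.relu(self.fc1(x))
        if use_fc2:  # fc2 participates in autograd only when it is called
            x = torch.relu(self.fc2(x))
        return x

cond_model = ConditionalModel()
for step in range(4):
    y = cond_model(torch.randn(4, 10), use_fc2=(step % 2 == 0))
    y.sum().backward()
```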
