Propagating loss for simultaneous neural nets?

I am trying to create a script with nested neural nets, and I’m wondering how to ensure that the losses for each net are propagated to the right place.

As the overall structure of my code: I define an autoencoder net A and train it for 100 epochs. Every 10 epochs, I evaluate the partially trained autoencoder A on a data sample, then train a binary classifier net B on the encoded data.
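Roughly, the structure looks like this (a simplified, self-contained sketch with toy stand-ins for my real nets and data, not my actual code):

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# toy stand-ins for my real autoencoder A (encoder + decoder) and classifier B
enc = nn.Sequential(nn.Linear(20, 4), nn.ReLU()).to(device)
dec = nn.Sequential(nn.Linear(4, 20)).to(device)
B = nn.Linear(4, 1)                                   # binary classifier, left on the CPU

opt_A = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()))
opt_B = torch.optim.Adam(B.parameters())
mse = nn.MSELoss()
bce = nn.BCEWithLogitsLoss()

x = torch.randn(64, 20, device=device)                # data for A lives on the device
y = torch.randint(0, 2, (64, 1)).float()              # labels for B stay on the CPU

for epoch in range(100):
    # train the autoencoder A for one epoch
    opt_A.zero_grad()
    loss_A = mse(dec(enc(x)), x)
    loss_A.backward()
    opt_A.step()

    # every 10 epochs: encode a data sample and train the classifier B on it
    if (epoch + 1) % 10 == 0:
        for _ in range(5):                            # a few epochs for B
            encoded = enc(x).cpu()                    # still attached to the encoder's graph
            opt_B.zero_grad()
            loss_B = bce(B(encoded), y)
            loss_B.backward()                         # the backward I'm worried about
            opt_B.step()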

I am worried that there is some interference between nets A and B: when I plot the loss vs. epoch curve for net A, the curve gets noticeably noisier when I train net B for more epochs, which suggests that some interference is going on.

For A, I have:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
A.to(device)

then all the data for A is also sent to the device. For B, I do not send the net or the data to the device. But since the loss is only propagated with loss.backward(), I'm worried that the loss I compute when training classifier B is also being backpropagated into net A.
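For reference, this is a hypothetical snippet I could run right after B's loss.backward() to see whether A's parameters pick up gradients from it (assuming I zero A's gradients just before training B):

# hypothetical check: call A.zero_grad() before training B, then inspect after B's backward
for name, p in A.named_parameters():
    if p.grad is not None and p.grad.abs().sum() > 0:
        print(name, "received gradients from B's backward")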

How can I control where the loss is sent? Do I need to define 2 devices for training?

This would be the case if you pass the output of netA to netB directly since both models are treated as any other nn.Module (e.g. nn.Conv2d).
If you want to train netB alone without calculating the gradients in netA and updating its parameters, you can detach netA’s output via:

out = netA(input)
out = netB(out.detach())     # detach cuts the autograd graph back to netA
loss = criterion(out, target)
loss.backward()              # will calculate the gradients in netB only
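If you don't need any gradients in netA for this step, running its forward pass under torch.no_grad() should also work and additionally avoids storing netA's intermediate activations:

with torch.no_grad():
    out = netA(input)        # no autograd graph is recorded for netA
out = netB(out)
loss = criterion(out, target)
loss.backward()              # gradients are computed for netB only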