I have a UNet autoencoder that branches at the bottom of the U (the bottleneck) into a dense layer that does classification.
When I train my model, everything works fine and my network produces two outputs: the classification output from the bottleneck branch, and the final autoencoded output.
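For concreteness, here is a minimal sketch of the kind of architecture I mean. This is illustrative only (the class name, layer sizes, and n_classes are made up, not my actual model): an encoder-decoder with a Linear classifier branching off the bottleneck features.

```python
import torch
import torch.nn as nn

class TinyBranchedAE(nn.Module):
    """Hypothetical minimal autoencoder with a classifier branch at the bottleneck."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.ConvTranspose2d(8, 1, 4, stride=2, padding=1)
        self.cls = nn.Linear(8 * 16 * 16, n_classes)  # dense layer off the bottleneck

    def forward(self, x):
        z = self.enc(x)                  # bottleneck features (B, 8, 16, 16)
        recon = self.dec(z)              # autoencoded output
        logits = self.cls(z.flatten(1))  # classification output
        return recon, logits

model = TinyBranchedAE()
recon, logits = model(torch.randn(2, 1, 32, 32))
```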
However when I test my model using:
model.eval()
inputs = Variable(inputs.cuda(), volatile=True)
What I’ve noticed is that the tensors coming out of the autoencoder are fine, but the results coming off of the Linear layer are all nan. I tried removing volatile=True; no difference. I altered my model’s .forward() so that the classifier branch’s Linear is computed before / after the autoencoder; no difference.
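As an aside, my test snippet uses the old Variable/volatile API; in PyTorch 0.4+ that was removed, and torch.no_grad() is the equivalent way to disable autograd at test time. A minimal sketch (the nn.Linear stand-in model is just for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
model.eval()              # switches layers like Dropout/BatchNorm to eval behavior
with torch.no_grad():     # no autograd graph is built, like volatile=True
    out = model(torch.randn(3, 4))
# out.requires_grad is False, so no gradients can flow from it
```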
Now, if I remove model.eval() and leave it with model.train(True), everything works!
I don’t call loss.backward() or step through my optimizer in train mode, and since I can use volatile=True I don’t really care either way which ‘mode’ the model is in. But I found the behavior interesting =). Maybe someone can shed some light?