I have a UNet autoencoder that branches at the bottom of the U (the bottleneck) into a dense layer that does classification.
When I train my model, everything works fine and my network produces two outputs: the classification output from the bottleneck branch, and the final autoencoded output.
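For concreteness, here is a minimal sketch of the kind of architecture I mean. This is illustrative only (the class name, layer sizes, and n_classes are made up, not my actual model): an encoder-decoder with a Linear classifier branching off the bottleneck features.

```python
import torch
import torch.nn as nn

class TinyBranchedAE(nn.Module):
    """Hypothetical minimal autoencoder with a classifier branch at the bottleneck."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.ConvTranspose2d(8, 1, 4, stride=2, padding=1)
        self.cls = nn.Linear(8 * 16 * 16, n_classes)  # dense layer off the bottleneck

    def forward(self, x):
        z = self.enc(x)                  # bottleneck features (B, 8, 16, 16)
        recon = self.dec(z)              # autoencoded output
        logits = self.cls(z.flatten(1))  # classification output
        return recon, logits

model = TinyBranchedAE()
recon, logits = model(torch.randn(2, 1, 32, 32))
```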
However when I test my model using:
model.eval()
inputs = Variable(inputs.cuda(), volatile=True)
What I’ve noticed is that the tensors coming out of the autoencoder are fine, but the results coming off of the Linear layer are all nan. I tried removing volatile=True; no difference. I altered my model’s .forward() so that the classifier branch’s Linear is computed before / after the autoencoder; no difference.
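As an aside, my test snippet uses the old Variable/volatile API; in PyTorch 0.4+ that was removed, and torch.no_grad() is the equivalent way to disable autograd at test time. A minimal sketch (the nn.Linear stand-in model is just for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
model.eval()              # switches layers like Dropout/BatchNorm to eval behavior
with torch.no_grad():     # no autograd graph is built, like volatile=True
    out = model(torch.randn(3, 4))
# out.requires_grad is False, so no gradients can flow from it
```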
Now, if I remove model.eval() and leave it with model.train(True), everything works!
I don’t call loss.backward() or step through my optimizer in train mode, and since I can use volatile=True I don’t really care either way which ‘mode’ the model is in. But I found the behavior interesting =). Maybe someone can shed some light?