model.train(True) and model.train(False) give different results for the same input

A layer doesn’t have requires_grad, only Variables have. running_mean and running_var are buffers, and are updated during forwarding. I assume train(True) will still use the batch mean and batch var.