PyTorch 0.4.0: grad_fn has no running_mean and running_var for batchnorm layers

The grad_fn holds gamma and beta, which are stored in the weight and bias variables. However, I have not found anything for running_mean and running_var. Is it possible to get the values of running_mean and running_var? And why were running_mean and running_var removed from grad_fn?

I ran into this problem while converting a trained PyTorch model to a Caffe model. I can generate the prototxt from the compute graph by walking grad_fn, but I am stuck on obtaining running_mean and running_var for the batchnorm layers.

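You can use the model's state_dict() method to get all the data related to the model. running_mean and running_var are registered on the batchnorm module as buffers rather than as learnable parameters, so they are included in the state_dict.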
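As a minimal sketch of that suggestion, the snippet below builds a small hypothetical model (the layer sizes and the `nn.Sequential` layout are assumptions made only for illustration) and reads the batchnorm statistics out of its state_dict. In an `nn.Sequential`, keys are prefixed with the layer index, so the batchnorm entries here start with `1.`.

```python
import torch
import torch.nn as nn

# Hypothetical toy model, used only to illustrate the state_dict layout.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

state = model.state_dict()

# List every entry; the batchnorm layer contributes weight (gamma),
# bias (beta), running_mean and running_var.
for name, tensor in state.items():
    print(name, tuple(tensor.shape))

# The running statistics are ordinary tensors keyed by module name.
running_mean = state["1.running_mean"]
running_var = state["1.running_var"]
```

For a model defined as a custom `nn.Module`, the key prefix is the attribute name given to the layer in `__init__` instead of a numeric index.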

state_dict() does return the data, but it says nothing about how the forward function works (e.g. the compute graph). Its keys are only the attribute names used when defining the network.
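One possible way to narrow that gap, sketched here under assumptions, is to walk `named_modules()` instead of the raw state_dict, so that each set of statistics is tied to a module name and a layer type. The helper `collect_batchnorm_stats` and the `TinyNet` class are hypothetical names introduced for this example. Note that `named_modules()` follows registration order, which is not guaranteed to match execution order in `forward`, so matching these names to nodes in the grad_fn graph still has to be done separately.

```python
import torch
import torch.nn as nn

BATCHNORM_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)


def collect_batchnorm_stats(model):
    """Hypothetical helper: map each batchnorm module name to its tensors."""
    stats = {}
    for name, module in model.named_modules():
        if isinstance(module, BATCHNORM_TYPES):
            stats[name] = {
                "gamma": module.weight,            # None if affine=False
                "beta": module.bias,               # None if affine=False
                "running_mean": module.running_mean,
                "running_var": module.running_var,
                "eps": module.eps,
            }
    return stats


class TinyNet(nn.Module):
    """Hypothetical network, used only for illustration."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(4)

    def forward(self, x):
        return self.bn(self.conv(x))


net = TinyNet()
bn_stats = collect_batchnorm_stats(net)
for layer_name, values in bn_stats.items():
    print(layer_name, values["running_mean"].shape, values["running_var"].shape)
```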