Num_batches_tracked in pytorch 1.0

Teeyo · December 19, 2018, 1:40pm

I checked the parameters of my model under PyTorch 1.0 and found an entry named “num_batches_tracked” in every BN layer, as follows:
bn1.weight
bn1.bias
bn1.running_mean
bn1.running_var
bn1.num_batches_tracked
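
For reference, a listing like the one above can be reproduced with a short sketch along these lines. It assumes a recent PyTorch release; the layer size and the variable names are only illustrative. It also suggests that `num_batches_tracked` is registered as a buffer rather than a learnable parameter, and that it counts forward passes made in training mode.

```python
import torch
import torch.nn as nn

# A standalone BN layer; 4 channels is an arbitrary choice for illustration.
bn = nn.BatchNorm2d(4)

# Keys stored in the layer's state_dict.
bn_keys = list(bn.state_dict().keys())
print(bn_keys)

# Learnable parameters vs. non-learnable buffers.
param_names = [name for name, _ in bn.named_parameters()]
buffer_names = [name for name, _ in bn.named_buffers()]
print(param_names)
print(buffer_names)

# One forward pass in training mode increments the counter.
bn.train()
bn(torch.randn(2, 4, 8, 8))
count_after_one_batch = int(bn.num_batches_tracked)
print(count_after_one_batch)
```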

However, my pretrained model was saved with version 0.3.1. I checked its parameters too and found no entry named “num_batches_tracked”. Each BN layer has only four entries:

bn1.weight
bn1.bias
bn1.running_mean
bn1.running_var

So when I load the pretrained model directly, the keys don’t match.

How can I solve this problem? I also don’t know why BN layers have this new entry in PyTorch >= 0.4.0. What is it for?

Sunshine352 · December 19, 2018, 1:48pm

modelB = TheModelBClass(*args, **kwargs)
modelB.load_state_dict(torch.load(PATH), strict=False)

Partially loading a model, or loading a partial model, is a common scenario in transfer learning or when training a new, complex model. Leveraging trained parameters, even if only a few are usable, helps to warm-start the training process and will hopefully help your model converge much faster than training from scratch.

Whether you are loading from a partial state_dict, which is missing some keys, or loading a state_dict with more keys than the model you are loading into, you can set the strict argument to False in load_state_dict() to ignore non-matching keys.
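
As a rough illustration of the strict=False behaviour described above, here is a minimal sketch. `TinyNet`, its layer names, and the `extra.weight` key are hypothetical and made up for this example. It also assumes a PyTorch release in which load_state_dict() returns the lists of missing and unexpected keys; older releases may return nothing.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical two-layer model used only for this illustration."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


source = TinyNet()

# Build a partial state_dict: keep only fc1, then add a key the model lacks.
partial_state = {k: v for k, v in source.state_dict().items() if k.startswith("fc1.")}
partial_state["extra.weight"] = torch.zeros(1)

target = TinyNet()
# strict=False skips keys that do not match instead of raising an error.
result = target.load_state_dict(partial_state, strict=False)

print(result.missing_keys)     # keys the model expects but the state_dict lacks
print(result.unexpected_keys)  # keys in the state_dict the model does not have
```

Checking the two returned lists is a cheap way to confirm that only the keys you expected were skipped. Note that strict=False only relaxes key matching: a key that matches by name but has a different tensor shape still raises an error.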

If you want to load parameters from one layer to another, but some keys do not match, simply change the name of the parameter keys in the state_dict that you are loading to match the keys in the model that you are loading into.
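
The renaming approach could look something like the sketch below. `OldNet`, `NewNet`, and the `features` / `backbone` prefixes are hypothetical names chosen for illustration. In practice the old state_dict would come from torch.load(PATH) rather than from a freshly built model.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


class OldNet(nn.Module):
    """Hypothetical model whose layer is called `features`."""

    def __init__(self):
        super().__init__()
        self.features = nn.Linear(8, 4)


class NewNet(nn.Module):
    """Hypothetical model with the same layer, now called `backbone`."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 4)


# Stand-in for a checkpoint loaded from disk.
old_state = OldNet().state_dict()

# Copy every entry, rewriting the leading prefix where it matches.
renamed_state = OrderedDict()
for key, value in old_state.items():
    if key.startswith("features."):
        key = "backbone." + key[len("features."):]
    renamed_state[key] = value

new_model = NewNet()
# All keys now match, so the default strict=True loading succeeds.
new_model.load_state_dict(renamed_state)
print(list(renamed_state.keys()))
```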