Recommended ways to calculate statistics of a model?

kirk86 · March 9, 2020, 12:48am

I was wondering if anyone has suggestions on recommended ways to calculate some statistics of a model that do not require gradients to be calculated.

I’ve been looking at some code base which does this for vgg16 but the way they are doing it seems a bit odd to me.

First, they instantiate the model, then they instantiate a replica of the same model, which they pass into a class ComputeStats(nn.Module).

Second, the above class, in its __init__(), strips away all the params of the replica model and puts zero tensors in their place as placeholders for the stats to be computed. It does this by applying the following function recursively to every module of the replica model.

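def module_parameters(module, params):
    # Iterate over a copy of the keys, since entries are popped inside the loop
    for name in list(module._parameters.keys()):
        if module._parameters[name] is None:
            continue

        data = module._parameters[name].data
        # Remove the parameter so it no longer shows up in module.parameters()
        module._parameters.pop(name)

        # Register two zero-filled buffers of the same shape as placeholders for the stats
        module.register_buffer("%s_dummy1" % name, data.new(data.size()).zero_())
        module.register_buffer("%s_dummy2" % name, data.new(data.size()).zero_())

        # Remember which (module, parameter name) pairs were replaced
        params.append((module, name))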

Finally, the forward function of ComputeStats(nn.Module) just returns the forward of the replica model:

def forward(self, *args):
    return self.replica(*args)

MWE:

1. model = vgg16()
2. replica = vgg16()
3. class ComputeStats(replica)
4. in the __init__() of ComputeStats execute replica.apply(lambda module: module_parameters(module=module, params=list()))
5. in ComputeStats.forward(self, *args): return replica(*args)
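To make the steps above concrete, here is a minimal runnable sketch of how I understand the pattern. It rests on a few assumptions that are mine and not taken from the code base: a small nn.Sequential stands in for vgg16 so the snippet has no torchvision dependency, the helper make_net is hypothetical, torch.zeros_like replaces the data.new(...).zero_() calls, and the params list is stored on the wrapper as self.params (in step 4 a fresh list() is created for every module, so the collected pairs would be thrown away).

```python
import torch
import torch.nn as nn


def module_parameters(module, params):
    # Swap every parameter of `module` for two zero-filled buffers.
    for name in list(module._parameters.keys()):
        if module._parameters[name] is None:
            continue
        data = module._parameters[name].data
        module._parameters.pop(name)
        module.register_buffer("%s_dummy1" % name, torch.zeros_like(data))
        module.register_buffer("%s_dummy2" % name, torch.zeros_like(data))
        params.append((module, name))


class ComputeStats(nn.Module):
    def __init__(self, replica):
        super().__init__()
        self.replica = replica
        # Assumption: keep one shared list so the (module, name) pairs survive.
        self.params = []
        self.replica.apply(lambda m: module_parameters(m, self.params))

    def forward(self, *args):
        return self.replica(*args)


def make_net():
    # Hypothetical stand-in for vgg16(), small enough to inspect by hand.
    return nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))


model = make_net()
replica = make_net()
stats = ComputeStats(replica)

# The wrapper has no trainable parameters left, only buffers.
n_params = len(list(stats.parameters()))
buffer_names = [name for name, _ in stats.named_buffers()]
print(n_params, buffer_names)
```

Since buffers are not parameters, nothing in the wrapper is picked up by an optimizer or tracked by autograd. One thing I noticed while writing this: calling stats(x) right after construction fails with an AttributeError, because layers such as nn.Linear look up self.weight, which has been popped. So I assume the code base assigns tensors back under the original names before forward is ever called.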

Is this practice common, or is there a better, simpler way to achieve this?