About Normalization using pre-trained vgg16 networks

ptrblck · October 28, 2020, 10:31am

It depends how large the dataset is and how large each latent tensor is.
If you cannot store all tensors during training, you would have to calculate the stats on the fly.

Here is an example of using forward hooks.