# The gradient norm and the noise

I am training ResNet101 on tiny-imagenet. How can I compute the L2 norm of the gradient and the noise of the gradient (see the last two lines) per epoch? L is a standard loss function.

First I need to compute them per iteration and then average over the epoch. Is that OK for the gradient norm?
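To make the "per iteration, then average" idea concrete, here is a minimal sketch in plain Python. The names `global_l2_norm` and `EpochAverage` are hypothetical helpers, and the nested lists are toy stand-ins for the per-parameter gradients of one iteration (in PyTorch these would presumably come from each parameter's `.grad` after `loss.backward()`). Note that this gives the mean of the mini-batch gradient norms, which is in general not the same as the norm of the full-batch gradient.

```python
import math


def global_l2_norm(grads):
    """L2 norm over all gradient entries, treating every
    per-parameter gradient as part of one long vector."""
    return math.sqrt(sum(g * g for grad in grads for g in grad))


class EpochAverage:
    """Running mean of a scalar collected once per iteration."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def mean(self):
        # Return 0.0 for an empty epoch instead of dividing by zero.
        return self.total / self.count if self.count else 0.0


# Toy data (assumption): two iterations, each with two "parameters".
iteration_grads = [
    [[3.0, 4.0], [0.0]],  # global norm 5
    [[1.0, 2.0], [2.0]],  # global norm 3
]

avg = EpochAverage()
for grads in iteration_grads:
    avg.update(global_l2_norm(grads))

epoch_grad_norm = avg.mean()  # mean of the per-iteration norms
```

In a real training loop, `avg` would be reset at the start of each epoch and `update` called once per batch after the backward pass.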

```
for batch_num, (X, y) in enumerate(train_loader):
X = X.to(device)
y = y.to(device)