# How do I find the standard deviation of activations?

A good standard deviation for the activations is on the order of 0.5 to 2.0. Values significantly outside this range may indicate one of the problems mentioned above.

How can I find the std of my activations using PyTorch?

I can find the gradients like this, and they look a bit small to me, so I want to investigate further. Or do they really look that small?

```python
for name, param in model.named_parameters():
    print("Layer name:", name)
    print("Gradient:", param.grad)
```

Output:

```
Layer 1
Layer 4
Layer out
```
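To put a number on "small", one option (my own sketch, not part of the original post) is to print summary statistics of each parameter's gradient after a backward pass, rather than the raw tensors. The toy model and loss here are placeholders:

```python
import torch
import torch.nn as nn

# Toy model and a single backward pass, just to have gradients to inspect.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
out = model(torch.randn(32, 8))
out.mean().backward()

# Summary statistics are easier to eyeball than full gradient tensors.
for name, param in model.named_parameters():
    g = param.grad
    print(f"{name}: mean abs {g.abs().mean():.3e}, std {g.std():.3e}")
```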

You could use forward hooks to print the standard deviation of the layer outputs:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 3, 1, 1)
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = nn.Conv2d(6, 1, 3, 1, 1)
        self.pool2 = nn.MaxPool2d(2)
        self.fc = nn.Linear(6*6, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool1(x)
        x = F.relu(self.conv2(x))
        x = self.pool2(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

model = MyModel()
model.conv1.register_forward_hook(lambda m, inp, out: print(out.std()))
model.conv2.register_forward_hook(lambda m, inp, out: print(out.std()))
model.fc.register_forward_hook(lambda m, inp, out: print(out.std()))

x = torch.randn(1, 3, 24, 24)
output = model(x)
```
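A small variation on the same idea (my own sketch, not from the answer above): collect the stds into a dict keyed by module name, and keep the hook handles so the hooks can be removed once you are done debugging. `register_forward_hook` returns a handle with a `remove()` method:

```python
import torch
import torch.nn as nn

# Any model works; a small Sequential keeps the example short.
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))

stats = {}

def make_hook(name):
    def hook(module, inputs, output):
        stats[name] = output.std().item()
    return hook

# Register on leaf modules only, and keep the returned handles.
handles = [m.register_forward_hook(make_hook(n))
           for n, m in model.named_modules()
           if len(list(m.children())) == 0]

model(torch.randn(4, 10))
print(stats)

# Remove the hooks once you're done inspecting.
for h in handles:
    h.remove()
```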

Cool, thanks! Just what I was looking for.

Any comments on my gradients? Or is this a case-by-case thing where there are no right or wrong gradients?


Stanford’s CS231n states that the ratio of update magnitude to weight magnitude should be roughly `1e-3` (source).
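The heuristic is from CS231n; the code below is my own illustration of how that ratio could be measured in PyTorch, using a placeholder model and loss. It snapshots the weights, takes one optimizer step, and compares the norm of the update to the norm of the weights:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

# Dummy loss just to produce gradients.
loss = model(torch.randn(16, 10)).pow(2).mean()
loss.backward()

# Snapshot weights, take one step, then compare update size to weight size.
before = {n: p.detach().clone() for n, p in model.named_parameters()}
opt.step()

for name, p in model.named_parameters():
    update = (p.detach() - before[name]).norm()
    ratio = (update / before[name].norm()).item()
    print(f"{name}: update/weight ratio {ratio:.3e}")
```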