I’m trying to wrap my head around how to use nn.LayerNorm(). As I understand it, Layer Normalization takes the weights of a hidden layer and rescales them around the mean and standard deviation. Correct so far?
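To sanity-check my mental model, I tried reproducing what `nn.LayerNorm` computes by hand on a toy tensor (the manual formula below is just my reading of the docs: subtract the mean and divide by the standard deviation over the last dimension, with a small eps):

```python
import torch
import torch.nn as nn

# one "sample" with 4 features
x = torch.tensor([[1.0, 2.0, 3.0, 4.0]])

# affine=False so there are no learnable gamma/beta to worry about
ln = nn.LayerNorm(4, elementwise_affine=False)

# my manual version: (x - mean) / sqrt(var + eps), biased variance
manual = (x - x.mean(dim=-1, keepdim=True)) / torch.sqrt(
    x.var(dim=-1, unbiased=False, keepdim=True) + 1e-5
)

print(torch.allclose(ln(x), manual, atol=1e-4))  # True
```

If that's right, then LayerNorm is normalizing the activations flowing through the layer, not the layer's weight matrix, but please correct me if I have that backwards.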

For example, let’s assume a plain vanilla feed-forward network.

```
def __init__(self, input_size, neurons, num_classes):
    super(NeuralNet, self).__init__()
    self.fc1 = nn.Linear(25, 10, bias=True)  # input layer with 25 features
    self.fc2 = nn.Linear(10, 10, bias=True)  # hidden layer with 10 neurons
    self.fc3 = nn.Linear(10, 2, bias=True)   # output layer with 2 outputs

def forward(self, x):
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    return torch.sigmoid(self.fc3(x))
```

I want to add nn.LayerNorm after the first two layers (fc1 & fc2). Should I add it in the forward method, something like this?

```
def forward(self, x):
    x = F.relu(self.fc1(x))
    x = self.fc1.nn.LayerNorm(?????)
    x = F.relu(self.fc2(x))
    x = self.fc2.nn.LayerNorm(?????)
    return torch.sigmoid(self.fc3(x))
```

What I don’t understand are the arguments/parameters needed. I’ve read the documentation: `torch.nn.LayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True, device=None, dtype=None)`. Using my example, what is the normalized_shape for fc1? For fc2?
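From the docs, my current guess is that normalized_shape is just the size of each layer's output, i.e. 10 for both fc1 and fc2, and that the LayerNorm modules should be created in `__init__` (as `ln1`/`ln2`, names I made up) rather than called off of the Linear layers. Is something like this right?

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(25, 10, bias=True)  # input layer with 25 features
        self.ln1 = nn.LayerNorm(10)              # my guess: fc1 outputs 10 features
        self.fc2 = nn.Linear(10, 10, bias=True)  # hidden layer with 10 neurons
        self.ln2 = nn.LayerNorm(10)              # my guess: fc2 also outputs 10
        self.fc3 = nn.Linear(10, 2, bias=True)   # output layer with 2 outputs

    def forward(self, x):
        x = F.relu(self.ln1(self.fc1(x)))
        x = F.relu(self.ln2(self.fc2(x)))
        return torch.sigmoid(self.fc3(x))

model = NeuralNet()
out = model(torch.randn(4, 25))  # batch of 4 samples, 25 features each
print(out.shape)  # torch.Size([4, 2])
```

At least it runs without errors, but I'm not sure whether the norm should go before or after the ReLU.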

Be gentle, I’m a newbie. THANK YOU for taking the time to read this and to help me!