Hi

This is probably just me missing some critical information. I am getting NaNs (loss -> inf) in my loss function, so I decided to investigate where these weights come from. In the process I tried to print out the sum of all the different parameters in my model (along with their dims):

```
import numpy as np

def sum_params(model):
    # Collect (shape, sum of entries) for every parameter tensor.
    s = []
    for p in model.parameters():
        dims = p.size()
        n = p.cpu().data.numpy()
        s.append((dims, np.sum(n)))
    return s
```
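For reference, here is a minimal, self-contained way to sanity-check this helper on a tiny stand-in model (the embedding size here is made up for illustration, not my actual model):

```python
import numpy as np
import torch
import torch.nn as nn

def sum_params(model):
    # Collect (shape, sum of entries) for every parameter tensor.
    s = []
    for p in model.parameters():
        n = p.cpu().data.numpy()
        s.append((p.size(), np.sum(n)))
    return s

# Tiny stand-in: a single embedding layer, initialized uniformly.
emb = nn.Embedding(1000, 100)
emb.weight.data.uniform_(-0.1, 0.1)

(dims, total), = sum_params(emb)
print(dims, total)  # the sum should be small relative to the 100000 entries
```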

Now for the embedding layer I got the tuple: (torch.Size([20000, 300]), -38459.16)

However, I have previously initialized the embedding as follows (I tried both methods; not sure which is "correct" in PyTorch):

```
# Functional form (older API; newer PyTorch versions use init.uniform_):
init.uniform(self.embedding.weight, -0.1, 0.1)
# Equivalent in-place call on the underlying tensor:
self.embedding.weight.data.uniform_(-0.1, 0.1)
```

So since each weight has expected value 0, the sum of the weights should be "close" to 0.
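To make "close to 0" concrete, here is a back-of-the-envelope check (my own arithmetic, not from any library): for i.i.d. Uniform(-0.1, 0.1) weights, the sum of n of them has mean 0 and standard deviation sqrt(n * Var).

```python
import math

# Each weight X ~ Uniform(-0.1, 0.1): E[X] = 0, Var[X] = (0.2 ** 2) / 12.
n = 20000 * 300                  # number of embedding weights
var = (0.2 ** 2) / 12
std_of_sum = math.sqrt(n * var)  # std of the sum of n i.i.d. weights, ~141
print(std_of_sum)
```

So a sum of -48 is well within one standard deviation, while -38459 is hundreds of standard deviations out and cannot plausibly come from this initialization.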

By explicitly calling sum() on self.embedding.weight I got the value -48.4358, which seems more plausible. Order of calls: init(), then sum() -> correct value, then model.parameters() -> wrong value.

When I then start training, sum() immediately matches the sum over model.parameters(). *This also happens on the first forward pass, BEFORE any backprop is done.*

So I am just wondering what I am missing. Why are my weights overwritten when I start doing forward passes? The same happens when I remove all backprop/training.

Again, I am probably just missing some info, but I have a hard time wrapping my head around this, and it is annoying since I would like to migrate from TensorFlow to this excellent library.