I noticed that an nn.Parameter can require NO gradients.

So I'm confused: what is the practical difference between `nn.Parameter(torch.Tensor(3, 4), requires_grad=False)`

and `self.register_buffer(name, torch.Tensor(3, 4))`

?

I guess that a buffer is not supposed to be fixed (you may change its value during the iterations of an algorithm), while a parameter is fixed and may only be modified through gradient descent.

For example in an exponential moving average:

```
Z = mu * Z + (1-mu) * X(n)
```

`Z`

should be a buffer and `mu` a parameter, while neither of them would require gradients. Beyond that, as far as I know, the remaining difference is mostly bookkeeping: a buffer is saved in the `state_dict` but never returned by `model.parameters()`, so the choice is largely a convention.
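A sketch of how the moving average above could be written as a module (the names `RunningAverage`, `mu`, and `Z` are illustrative, not from any library):

```python
import torch
import torch.nn as nn

class RunningAverage(nn.Module):
    """Implements the update Z = mu * Z + (1 - mu) * X(n) from above."""

    def __init__(self, size, mu=0.9):
        super().__init__()
        # mu stays fixed during the algorithm -> a Parameter with no gradient.
        self.mu = nn.Parameter(torch.tensor(mu), requires_grad=False)
        # Z is overwritten every step -> a buffer (saved in the state_dict,
        # never handed to an optimizer).
        self.register_buffer("Z", torch.zeros(size))

    @torch.no_grad()
    def forward(self, x):
        # In-place exponential moving average update.
        self.Z.mul_(self.mu).add_((1 - self.mu) * x)
        return self.Z
```

For example, `avg = RunningAverage(4, mu=0.5)` followed by repeated calls `avg(x)` keeps the running average in `avg.Z` without any autograd involvement.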

Hi, an nn.Parameter can also require no gradient. I'm curious what the difference is for an nn.Parameter when it requires no gradient.
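For what it's worth, one observable difference can be checked directly: a no-grad `nn.Parameter` still shows up in `model.parameters()` (so an optimizer would receive it, even though it never gets a gradient), while a buffer only shows up in the `state_dict`. A minimal check (`Demo`, `p`, and `b` are just illustrative names):

```python
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        # A parameter with gradients switched off ...
        self.p = nn.Parameter(torch.zeros(3, 4), requires_grad=False)
        # ... versus a buffer holding a tensor of the same shape.
        self.register_buffer("b", torch.zeros(3, 4))

m = Demo()
print([name for name, _ in m.named_parameters()])  # only 'p'
print(sorted(m.state_dict()))                      # both 'b' and 'p'
```

So `optimizer = torch.optim.SGD(m.parameters(), lr=0.1)` would still iterate over `p` (a no-op, since it has no gradient), but would never see `b`.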