Am I using Buffers correctly?

I am working on a project in Pyro and basing my code on their DMM example. In their case they are doing model learning, so their initial hidden state is initialized as a learnable parameter:

self.z_0 = nn.Parameter(torch.zeros(z_dim))

which means that z_0 is both learnable and will be transferred to the GPU if the whole DMM is. In my case I am not doing model learning, so I want a fixed z_0, but I still want it to be transferred to the device along with the module. Some searching has led me to believe that registering z_0 as a buffer achieves this, but I'm not sure whether there are side effects, or whether another PyTorch feature is better suited to my use case?
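Concretely, this is what I have in mind (a minimal sketch; `MyDMM` and `z_dim` are stand-ins for my actual module):

```python
import torch
import torch.nn as nn

class MyDMM(nn.Module):
    def __init__(self, z_dim=5):
        super().__init__()
        # Registered as a buffer: it moves with .to(device)/.cuda()
        # and is saved in state_dict, but it is NOT yielded by
        # parameters(), so an optimizer will never update it.
        self.register_buffer("z_0", torch.zeros(z_dim))

m = MyDMM()
print("z_0" in dict(m.named_buffers()))          # True
print(any(p is m.z_0 for p in m.parameters()))   # False
print("z_0" in m.state_dict())                   # True
```

So as far as I can tell, a buffer gives me exactly "fixed but device-following" behavior, and it also round-trips through `state_dict` for checkpointing.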