How to move the tensors belonging to a model to CUDA

I have a model that I move to CUDA when a GPU is available. In that case I move both the model and the inputs x to CUDA, but the tensors in class A, which is part of the model, stay on the CPU. How can I move all the tensors used in class A to CUDA when a GPU is available? Please help!

class A(nn.Module):

    def __init__(self, dim):
        super().__init__()
        self.dim = dim

        Q = torch.nn.init.orthogonal_(torch.randn(dim, dim))
        P, L, U = torch.lu_unpack(*Q.lu())
        self.P = P # remains fixed during optimization
        self.L = nn.Parameter(L) # lower triangular portion
        self.S = nn.Parameter(U.diag()) # "crop out" the diagonal to its own parameter
        self.U = nn.Parameter(torch.triu(U, diagonal=1)) # "crop out" diagonal, stored in S

    def _assemble_W(self):
        """ assemble W from its pieces (P, L, U, S) """
        L = torch.tril(self.L, diagonal=-1) + torch.diag(torch.ones(self.dim))
        U = torch.triu(self.U, diagonal=1)
        W = self.P @ L @ (U + torch.diag(self.S))
        return W

    def forward(self, x):
        W = self._assemble_W()
        z = x @ W
        log_det = torch.sum(torch.log(torch.abs(self.S)))
        return z, log_det

All nn.Parameters and registered buffers are moved to the target device by .cuda()/.cpu() and .to(). This applies to self.L, self.S, and self.U. However, self.P will remain on the CPU, since it is a plain tensor attribute and neither a parameter nor a buffer.
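To see the difference without needing a GPU, here is a minimal sketch that uses a dtype cast in place of a device move. Module.to() walks the same set of parameters and buffers in both cases. The class name Demo and its attribute names are made up for illustration and are not part of the original model.

```python
import torch
import torch.nn as nn


class Demo(nn.Module):
    """Hypothetical module with one tensor of each kind."""

    def __init__(self):
        super().__init__()
        self.plain = torch.zeros(2)                   # plain attribute: not tracked by the module
        self.weight = nn.Parameter(torch.zeros(2))    # parameter: tracked and trained
        self.register_buffer("buf", torch.zeros(2))   # buffer: tracked, not trained


# Cast the module to float64; only tracked tensors are converted.
demo = Demo().to(torch.float64)

param_names = [name for name, _ in demo.named_parameters()]
buffer_names = [name for name, _ in demo.named_buffers()]

print(param_names, buffer_names)
print(demo.weight.dtype, demo.buf.dtype, demo.plain.dtype)
```

The plain attribute keeps its original dtype, just as self.P keeps its original device in the question.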
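If you don’t want to train this tensor, register it as a buffer with self.register_buffer("P", P) in __init__ instead of assigning self.P = P. It will then be moved together with the model and, by default, saved in the state_dict.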
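A sketch of class A with that change applied might look like the following. It also builds the identity matrix in _assemble_W on the same device as self.L. That second change is my own addition, not part of the suggestion above: torch.ones(self.dim) is created on the CPU, so I would expect it to cause the same kind of device mismatch once the model is on the GPU. The factorization call is carried over unchanged from the question.

```python
import torch
import torch.nn as nn


class A(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.dim = dim

        Q = torch.nn.init.orthogonal_(torch.randn(dim, dim))
        P, L, U = torch.lu_unpack(*Q.lu())
        # Buffer: follows the module in .to()/.cuda(), but is not a trainable parameter.
        self.register_buffer("P", P)
        self.L = nn.Parameter(L)                          # lower triangular portion
        self.S = nn.Parameter(U.diag())                   # diagonal of U as its own parameter
        self.U = nn.Parameter(torch.triu(U, diagonal=1))  # strictly upper triangular portion

    def _assemble_W(self):
        """Assemble W from its pieces (P, L, U, S)."""
        # Assumption: create the identity on the same device/dtype as the parameters.
        eye = torch.eye(self.dim, device=self.L.device, dtype=self.L.dtype)
        L = torch.tril(self.L, diagonal=-1) + eye
        U = torch.triu(self.U, diagonal=1)
        return self.P @ L @ (U + torch.diag(self.S))

    def forward(self, x):
        W = self._assemble_W()
        z = x @ W
        log_det = torch.sum(torch.log(torch.abs(self.S)))
        return z, log_det


# Usage: the same code runs on CPU or GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = A(4).to(device)
x = torch.randn(2, 4, device=device)
z, log_det = model(x)
print(model.P.device, z.device)
```

An alternative to the buffer is nn.Parameter(P, requires_grad=False), but a buffer states the intent more directly for a tensor that is never optimized.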