TypeError: expected CPU (got CUDA) when implementing simple autoencoder class

Hi,

I am trying to implement a simple autoencoder class. I want to run it on the GPU, and have therefore tried to explicitly move everything to CUDA (using “.to(self.device)”). However, I’m still getting an “expected CPU” error. Could I get help identifying which additional parts need to be moved to CUDA? The class looks like this:

class AE(torch.nn.Module):
    def __init__(self, Xtr, Xval,
                 eps = 1e-7, weight_decay = 1e-5, C=1., lr = 1e-2, 
                 device=torch.device('cuda')):
        super(AE, self).__init__()
        self.weight_decay = weight_decay
        self.lr = lr
        self.C = C
        self.device = device
        self.__initfact__(Xtr, Xval)

    def __initfact__(self, Xtr, Xval):
        self.ntr = torch.tensor(Xtr.shape, dtype=int)[0].to(self.device)
        self.nval = torch.tensor(Xval.shape, dtype=int)[0].to(self.device)
        self.m = torch.tensor(Xtr.shape, dtype=int)[1].to(self.device)
        self.Xtr = torch.from_numpy(Xtr).float().to(self.device)
        self.Xval = torch.from_numpy(Xval).float().to(self.device)

        self.loss_rcs = nn.MSELoss().to(self.device)

        self.encoder = nn.Sequential(
            nn.Linear(self.m, 2048).to(self.device), 
            nn.ReLU().to(self.device),
            nn.Linear(2048, 1024).to(self.device)
        ).to(self.device)
        self.decoder = nn.Sequential(
            nn.Linear(1024, 2048).to(self.device),
            nn.ReLU().to(self.device),
            nn.Linear(2048, self.m).to(self.device)
        ).to(self.device)
        
        self.opt = torch.optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.weight_decay)
    
    def forward(self, X):
        X = self.encoder(X)
        X = self.decoder(X)
        return X

When I initialize this class with NumPy arrays Xtr and Xval, this is the error I get:

Traceback (most recent call last):
  File "pretrain/modeling/pretrain_ae.py", line 61, in <module>
    m = AE(
  File "/home/srd6051/pretrain/modeling/AE.py", line 33, in __init__
    self.__initfact__(Xtr, Xval)
  File "/home/srd6051/pretrain/modeling/AE.py", line 73, in __initfact__
    nn.Linear(2048, self.m).to(self.device)
  File "/home/srd6051/anaconda3/envs/bbcarenv/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 81, in __init__
    self.bias = Parameter(torch.Tensor(out_features))
TypeError: expected CPU (got CUDA)

I am using pytorch version 1.8.1. I’m happy to provide any additional information that might be useful. Any insight would be greatly appreciated! Thank you in advance.

You don’t need to call to(device) on each tensor and module. Instead, make sure they are registered properly on the module (submodules as attributes, tensors via register_buffer), so that a single model.to(device) call moves all internal data to the specified device.
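As an illustration (the Toy module and its attribute names are made up for this example, not taken from your code): registered submodules and buffers follow a single model.to(device) call, while plain tensor attributes do not.

import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)                     # submodule: registered automatically
        self.register_buffer('scale', torch.ones(4))  # buffer: registered explicitly
        self.plain = torch.ones(4)                    # plain attribute: NOT registered

model = Toy()
model.to('cuda')
print(model.fc.weight.device)  # cuda:0
print(model.scale.device)      # cuda:0
print(model.plain.device)      # cpu (not moved, because it is not registered)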

Some recommendations:

  • Remove self.m = torch.tensor(Xtr.shape, dtype=int)[1].to(self.device) and pass the shape information to the layer as a plain Python int instead. This is what raises the error: self.m is a CUDA tensor, and nn.Linear calls torch.Tensor(out_features) internally during construction, which expects a CPU value (as shown in your traceback).
  • I’m not sure why you are initializing data inside the model via self.Xtr = torch.from_numpy(Xtr).float().to(self.device); I would define the data outside the model instead.
  • Remove the to(device) ops on each layer, create the model once, and move it to the device via model.to(device).

This should work instead:

import torch
import torch.nn as nn

class AE(torch.nn.Module):
    def __init__(self, m):
        super(AE, self).__init__()
        
        self.encoder = nn.Sequential(
            nn.Linear(m, 2048), 
            nn.ReLU(),
            nn.Linear(2048, 1024)
        )
        self.decoder = nn.Sequential(
            nn.Linear(1024, 2048),
            nn.ReLU(),
            nn.Linear(2048, m)
        )
        
    def forward(self, X):
        X = self.encoder(X)
        X = self.decoder(X)
        return X
    

model = AE(10)
device = 'cuda'
model.to(device)  # a single call moves all registered parameters to the GPU
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(1, 10, device=device)  # create the input directly on the GPU

out = model(x)
out.mean().backward()
optimizer.step()
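
For completeness, the same pattern applied to your NumPy inputs could look roughly like this (a sketch; the random Xtr below is only a placeholder for the real training array, and the hyperparameters mirror the defaults from your __init__):

import numpy as np

Xtr = np.random.randn(100, 10).astype(np.float32)   # placeholder for the real data

device = 'cuda'
model = AE(Xtr.shape[1]).to(device)                  # pass the feature dimension as a plain int
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=1e-5)
criterion = nn.MSELoss()

x = torch.from_numpy(Xtr).float().to(device)         # move the data outside the model

optimizer.zero_grad()
out = model(x)
loss = criterion(out, x)
loss.backward()
optimizer.step()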

I apologize for the late follow-up. Thank you @ptrblck, this worked for me! I appreciate the help.