Expected all tensors to be on the same device, but found at least two devices

HassanAli · April 17, 2022, 4:24am

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument target in method wrapper_nll_loss_forward)

My model and inputs are both already on the GPU, but I am still getting this error.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = testnetwork()
model.to(device)

X_s, y_s = X_s.to(device), y_s.to(device)
X_t = X_t.to(device)

ptrblck · April 17, 2022, 8:49am

Check if you are creating new tensors in the forward method, which might be created on the CPU. If that’s not the case and you cannot isolate the issue, could you post a minimal, executable code snippet to reproduce the issue, please?
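
As a minimal sketch of the pattern described above: a tensor created inside forward lands on the CPU unless a device is passed, so one option is to create it on the input's device. The AddNoise module is hypothetical and not part of the code in this thread.

```python
import torch
import torch.nn as nn

class AddNoise(nn.Module):  # hypothetical module, for illustration only
    def forward(self, x):
        # torch.randn(x.shape) alone would allocate on the CPU by default;
        # passing device=x.device keeps the new tensor next to the input.
        noise = torch.randn(x.shape, device=x.device, dtype=x.dtype)
        return x + 0.1 * noise

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.ones(4, 3, device=device)
out = AddNoise()(x)
print(out.device)  # same device as x
```

Using x.device instead of a global device variable keeps the module working if the model is later moved to another device.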

HassanAli · April 17, 2022, 10:35am

Hi @ptrblck. Thanks for your reply. I am trying to run the code below, which does unsupervised domain adaptation using a gradient reversal layer (GRL).

import numpy as np
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class DACNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.feature_extractor = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=5),
            nn.BatchNorm2d(64), nn.MaxPool2d(2),
            nn.ReLU(True),
            nn.Conv2d(64, 50, kernel_size=5),
            nn.BatchNorm2d(50), nn.Dropout2d(), nn.MaxPool2d(2),
            nn.ReLU(True),
        )
        self.class_classifier = nn.Sequential(
            nn.Linear(50*72*72, 100), nn.BatchNorm1d(100), nn.Dropout2d(),
            nn.ReLU(True),
            nn.Linear(100, 100), nn.BatchNorm1d(100),
            nn.ReLU(True),
            nn.Linear(100, 2),
            nn.LogSoftmax(dim=1),
        )
        self.domain_classifier = nn.Sequential(
            nn.Linear(50*72*72, 100), nn.BatchNorm1d(100),
            nn.ReLU(True),
            nn.Linear(100, 2),
            nn.LogSoftmax(dim=1),
        )
        

    def forward(self, x, grl_lambda=1.0):
        x = x.expand(x.data.shape[0], 3, 300, 300)
        features = self.feature_extractor(x)
        features = features.view(x.size(0), -1)
        reverse_features = GradientReversalFn.apply(features, grl_lambda)
        class_pred = self.class_classifier(features)
        domain_pred = self.domain_classifier(reverse_features)
        return class_pred, domain_pred
   
model = DACNN()
model.to(device)

And I am sending inputs to GPU using this code:

for batch_idx in range(max_batches):
    optimizer.zero_grad()
    p = float(batch_idx + epoch_idx * max_batches) / (n_epochs * max_batches)
    grl_lambda = 2. / (1. + np.exp(-10 * p)) - 1
    X_s, y_s = next(dl_source_iter)
    X_s, y_s = X_s.to(device), y_s.to(device)
    y_s_domain = torch.zeros(BATCH_SIZE, dtype=torch.long)

    X_t, _ = next(dl_target_iter)
    X_t = X_t.to(device)
    y_t_domain = torch.ones(len(X_t), dtype=torch.long)

ptrblck · April 18, 2022, 6:25am

I cannot execute the code as parts are missing (e.g. GradientReversalFn as well as the input shapes).
However, based on your code snippet, you are moving neither y_s_domain nor y_t_domain to the device, which might cause the issue.
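
A minimal sketch of that fix, assuming the domain labels are used as targets for an NLL loss (the error message names the target argument of nll_loss_forward). The random domain_pred here is only a stand-in for the model output, and BATCH_SIZE = 16 is an assumed value.

```python
import torch
import torch.nn.functional as F

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
BATCH_SIZE = 16  # assumed value for this sketch

# Stand-in for the domain classifier output (log-probabilities over 2 domains).
domain_pred = torch.log_softmax(torch.randn(BATCH_SIZE, 2, device=device), dim=1)

# Create the domain labels directly on the target device instead of the CPU.
y_s_domain = torch.zeros(BATCH_SIZE, dtype=torch.long, device=device)
y_t_domain = torch.ones(BATCH_SIZE, dtype=torch.long, device=device)

# Predictions and targets now live on the same device, so the loss runs.
loss_s = F.nll_loss(domain_pred, y_s_domain)
loss_t = F.nll_loss(domain_pred, y_t_domain)
```

Calling .to(device) on the labels after creating them works as well; passing device= at creation just avoids the extra copy.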

HassanAli · April 18, 2022, 9:49am

The code for Gradient reversal layer is:

from torch.autograd import Function

class GradientReversalFn(Function):
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        output = -ctx.alpha * grad_output
        return output, None

Input shapes are:

source domain: torch.Size([16, 3, 300, 300]) torch.Size([16])
target domain: torch.Size([16, 3, 300, 300]) torch.Size([16])
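
A quick sanity check of the gradient reversal function above, as a sketch: the forward pass should leave values unchanged, and the backward pass should multiply the incoming gradient by -alpha. The class is repeated so the snippet runs on its own; alpha = 0.5 is an arbitrary choice.

```python
import torch
from torch.autograd import Function

class GradientReversalFn(Function):
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)  # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the sign and scale the gradient; alpha itself gets no gradient.
        return -ctx.alpha * grad_output, None

x = torch.ones(3, requires_grad=True)
y = GradientReversalFn.apply(x, 0.5)
y.sum().backward()
print(x.grad)  # each entry is -0.5 instead of the usual +1.0
```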

HassanAli · April 18, 2022, 9:52am

Thanks. That was the issue. My code started working after I moved y_s_domain and y_t_domain to the CUDA device.
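
For similar errors where the offending tensor is not obvious, one option is to print the device of every tensor that goes into the failing call. This is only a sketch; report_devices is a hypothetical helper, not a PyTorch function.

```python
import torch

def report_devices(**tensors):
    """Return a {name: device string} mapping for the given tensors."""
    return {name: str(t.device) for name, t in tensors.items()}

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
pred = torch.randn(4, 2, device=device)
target = torch.zeros(4, dtype=torch.long)  # deliberately left on the CPU

devices = report_devices(pred=pred, target=target)
print(devices)  # any entry that differs from the others is the culprit
```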