PyTorch 0.4.0: nn.Module.to(device) does not work

I get a RuntimeError: Expected object of type torch.DoubleTensor but found type torch.cuda.FloatTensor for argument #2 ‘weight’ when I try to train a DCGAN. It tells me that my input’s type is torch.cuda.FloatTensor, but my model only accepts torch.DoubleTensor data, so they do not match. The problem is that I have already moved my model to the specified GPU.
main.py

    ...
    cudnn.benchmark = True
    device = torch.device('cuda:3')
    G = Generator().to(device=device)
    D = Discriminator().to(device=device)
    ...

model.py

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.step = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1),
            nn.BatchNorm2d(16, eps=0.001),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1),
            nn.BatchNorm2d(32, eps=0.002),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=0.003),
            nn.ReLU(), 
            nn.Conv2d(64, 32, 3, padding=1),
            nn.BatchNorm2d(32, eps=0.003),
            nn.ReLU(), 
            nn.Conv2d(32, 16, 3, padding=1),
            nn.BatchNorm2d(16, eps=0.002),
            nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
            nn.Tanh()
        )

    def forward(self, x):
        return self.step(x)

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1),
            nn.BatchNorm2d(16, eps=0.001),
            nn.LeakyReLU(),
            nn.Conv2d(16, 16, 3, padding=1),
            nn.BatchNorm2d(16, eps=0.001),
            nn.LeakyReLU(),

            nn.MaxPool2d(2, 2),

            nn.Conv2d(16, 32, 3, padding=1),
            nn.BatchNorm2d(32, eps=0.001),
            nn.LeakyReLU(),
            nn.Conv2d(32, 32, 3, padding=1),
            nn.BatchNorm2d(32, eps=0.001),
            nn.LeakyReLU(),

            nn.MaxPool2d(2, 2),

            nn.Conv2d(32, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=0.001),
            nn.LeakyReLU()
        )
        self.fcn = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512),
            nn.LeakyReLU(),
            nn.Dropout(p=0.4),
            nn.Linear(512, 32),
            nn.LeakyReLU(),
            nn.Dropout(p=0.4),
            nn.Linear(32, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        x = self.convs(x)
        x = x.view(x.size(0), -1)
        x = self.fcn(x)
        return x

nn.Module.to(device) just does not work for me. Did I make a mistake, or is there some other way to do this in PyTorch 0.4.0?

Maybe you don’t have cuda:3?

I’m pretty sure I have…

Could you print torch.cuda.device_count()?
Also, could you run the following code, so we can narrow down the issue:

model = nn.Sequential(nn.Linear(10, 1)).cuda('cuda:3')
print(model[0].weight.type())

Ok… the result looks like this:

>>> import torch
>>> import torch.nn as nn
>>> torch.cuda.device_count()
4
>>> model = nn.Sequential(nn.Linear(10,1)).cuda('cuda:3')
>>> print(model[0].weight.type())
torch.cuda.FloatTensor

Is it maybe a documentation error and it meant to say the opposite, like your model correctly wants a torch.cuda.FloatTensor but got a torch.DoubleTensor from the training data? It seems that cuda is at least partly working, either for your model or your training data. Maybe you forgot to also cast your input data and this is a typo in the error message? (A long shot, but maybe worth trying if you haven’t yet)

E.g., in general, it would be something like this:

D = Discriminator().to(device=device)

for batch_idx, (features, targets) in enumerate(train_loader):

    features = features.to(device)
    targets = targets.to(device)
    probas = D(features)

EDIT:

Expected object of type torch.DoubleTensor but found type torch.cuda.FloatTensor for argument #2 ‘weight’ when i try to train a DCGAN. It remind me that my input`s type is torch.cuda.FloatTensor,

Oh, wait, couldn’t argument #1 be the input data? As you show in your later comment,

>>> print(model[0].weight.type())
torch.cuda.FloatTensor

the model seems to be correctly using cuda. So argument #1 would be the input data that the weight gets matrix-multiplied with, and that is probably a torch.DoubleTensor (in case you haven’t used to(device) on that one as well).

Yeah, I have cast my input to cuda:
main.py

...
for epoch in range(num_epochs):
        for t, x in enumerate(loader):
            optimizerD.zero_grad()
            optimizerG.zero_grad()
            x.requires_grad_().to(device)
            noise_size = x.shape[0]
            noise = torch.randn(noise_size, 1, 28, 28).requires_grad_().to(device)
...

And when I do this:
main.py

...
device = torch.device('cuda:3')
G = Generator().to(device)
D = Discriminator().to(device)
for param in G.parameters():
    print(type(param.data))
...

I always get <class 'torch.Tensor'>. Shouldn’t it be torch.cuda.*?

Is it maybe a documentation error and it meant to say the opposite, like your model correctly wants a torch.cuda.FloatTensor but got a torch.DoubleTensor from the training data.

This makes sense. I have the same feeling that the documentation may be wrong.

As far as I know, the input data and the model parameters must always be the same type in order to be compatible.

It’s a bit of a gotcha, but type(data) always returns torch.Tensor. You need to use data.type() to really see what’s going on, i.e., to show whether it’s torch.cuda.FloatTensor or torch.FloatTensor etc.
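
A small sketch of the difference, assuming a CPU tensor created with an explicit float64 dtype (the variable name x is only for illustration):

```python
import torch

x = torch.zeros(2, 3, dtype=torch.float64)

# type() gives the Python class, which is torch.Tensor for every dtype and device.
print(type(x))            # <class 'torch.Tensor'>

# .type() gives the legacy type string, which includes the dtype
# (and a "cuda" component when the tensor lives on a GPU).
print(x.type())           # torch.DoubleTensor

# .dtype and .device expose the same information as separate attributes.
print(x.dtype, x.device)  # torch.float64 cpu
```

On a GPU the same tensor would report torch.cuda.DoubleTensor from .type().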

Could you maybe try again checking your input data types before they go into the model?
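
One way to do that check is to compare the batch against the first model parameter right before the forward pass. This is only a sketch: describe is a hypothetical helper, and the nn.Linear model and float64 batch are stand-ins rather than the code from this thread.

```python
import torch
import torch.nn as nn


def describe(model, batch):
    """Return (dtype, device) for the first model parameter and for the batch."""
    param = next(model.parameters())
    return (param.dtype, param.device), (batch.dtype, batch.device)


model = nn.Linear(10, 1)                          # float32 parameters by default
batch = torch.zeros(4, 10, dtype=torch.float64)   # stand-in for a float64 batch

param_info, batch_info = describe(model, batch)
print("model:", param_info)
print("batch:", batch_info)
print("match:", param_info == batch_info)         # match: False
```

If the two pairs differ in either dtype or device, the forward pass can be expected to fail with a type mismatch.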

EDIT:

I don’t know if x.requires_grad_().to(device) is sufficient. Maybe try

x.requires_grad_()
x = x.to(device)

As you said, I double-checked the data type of both the model and the input. The model needs torch.cuda.FloatTensor, but the input is torch.cuda.DoubleTensor. And you are right, x.requires_grad_().to(device) does not work, because x.to(device) isn’t an in-place operation. Thanks for your help!
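
To illustrate why discarding the result of Tensor.to() has no effect, here is a minimal sketch. It uses a dtype conversion instead of a device move so that it runs without a GPU; the behavior with to(device) is analogous. The nn.Linear model is a stand-in.

```python
import torch
import torch.nn as nn

x = torch.zeros(4, 10, dtype=torch.float64)

# Tensor.to() returns a converted tensor; x itself is left unchanged.
x.to(torch.float32)
print(x.dtype)    # torch.float64

# Reassigning keeps the converted result.
x = x.to(torch.float32)
print(x.dtype)    # torch.float32

# nn.Module.to() behaves differently: it converts the module's parameters
# in place and returns the same module object.
model = nn.Linear(10, 1)
same = model.to(torch.float64)
print(same is model, model.weight.dtype)   # True torch.float64
```

This is why G = Generator().to(device) works without any surprise, while a bare x.to(device) silently does nothing to x.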

Oh nice, so the problem is basically solved then if you do a

x = x.float().to(device)

?
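
A self-contained sketch of the mismatch and the fix, under a few assumptions: a single Conv2d layer stands in for the real model, the batch is a float64 tensor of zeros, and the device falls back to the CPU when no GPU is available. The exact wording of the error message differs between PyTorch versions.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Conv2d(1, 16, 3, padding=1).to(device)    # float32 weights by default
x = torch.zeros(8, 1, 28, 28, dtype=torch.float64)   # stand-in for a float64 batch

try:
    model(x.to(device))   # right device, wrong dtype
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("RuntimeError:", err)

# Cast to float32, move to the device, and keep the result.
x = x.float().to(device)
out = model(x)
print(out.shape)          # torch.Size([8, 16, 28, 28])
```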

By the way, I think I was initially wrong when I said that it was a typo in the error message. I think it’s indeed correct regarding what’s called “argument #1” and “argument #2”, but yeah, it can easily be confusing.

Yeah, it worked! The error message is very confusing, though: we tend to think of a model as needing some input, rather than an input needing a model, so this is a little counterintuitive. Thanks again for your help!
