[Resolved] RuntimeError: expected device cpu and dtype Float but got device cuda:0 and dtype Float

111137 · August 30, 2019, 4:27pm

Hi,

My code works fine on Colaboratory with CUDA.
Now I am trying to run it in my local environment.
My environment is:

CUDA: 10.1
cuDNN: 7

Today I installed PyTorch following the pip3 instructions.
After setting it up I get this error:

RuntimeError: expected device cpu and dtype Float but got device cuda:0 and dtype Float

Actually, my code sets device(“cuda”); I want it to run on the NVIDIA GPU.
I cannot understand why the script tries to run on the CPU, while Colab gives no error (it does indeed run on the CPU).
In addition, I replaced the operators with torch’s operators, but I still get the error.
Could you tell me the reason and the solution?

spanev · August 30, 2019, 5:05pm

Hi,
Can you share the relevant part of the code with us?

111137 · August 30, 2019, 5:09pm

Hi,

Thank you for your reply.
The error message is:

Traceback (most recent call last):
  File "QNet.py", line 203, in <module>
    out = model()
  File "/home/syouyu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "QNet.py", line 164, in forward
    fw_prop(self)
  File "QNet.py", line 70, in fw_prop
    q_layer(self)
  File "QNet.py", line 60, in q_layer
    q_sel(self)
  File "QNet.py", line 53, in q_sel
    self.sel[index] = torch.sigmoid(torch.add(torch.mul(self.w_x[index], self.x[index]), torch.mul(self.w_h[index], self.h[index])))
RuntimeError: expected device cpu and dtype Float but got device cuda:0 and dtype Float

The tensors are defined as:

class Model(nn.Module):
    def __init__(self):
        self.w_x   = [torch.randn((NUM_INPUT), requires_grad=True) for _ in range(NUM_HIDDEN)]
        self.w_h   = [torch.randn((NUM_INPUT), requires_grad=True) for _ in range(NUM_HIDDEN)]
        self.sel   = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]
        self.x     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]
        self.h     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

spanev · August 30, 2019, 5:53pm

When you convert your model with model.cuda(), the plain tensors you store on it will not be converted: only registered parameters, buffers and child Modules are converted automatically.

You will have to manually specify the device of your tensors when you create them:

class Model(nn.Module):
    def __init__(self, device):
        super().__init__()
        self.w_x   = [torch.randn((NUM_INPUT), requires_grad=True, device=device) for _ in range(NUM_HIDDEN)]
        self.w_h   = [torch.randn((NUM_INPUT), requires_grad=True, device=device) for _ in range(NUM_HIDDEN)]
        self.sel   = [torch.zeros(NUM_INPUT, device=device) for _ in range(NUM_HIDDEN)]
        self.x     = [torch.zeros(NUM_INPUT, device=device) for _ in range(NUM_HIDDEN)]
        self.h     = [torch.zeros(NUM_INPUT, device=device) for _ in range(NUM_HIDDEN)]

so you pass it to your model as:

model = Model(device='cuda:0')
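
To see which attributes a module conversion actually touches, here is a minimal sketch. It uses a dtype conversion (`.to(torch.float64)`) as a stand-in for `.cuda()` so that it also runs on a machine without a GPU; both conversions only visit registered parameters, buffers and child modules. The class name `Demo` and its attributes are made up for illustration and are not part of the original script.

```python
import torch
from torch import nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(4))     # registered parameter
        self.register_buffer("state", torch.zeros(4))  # registered buffer
        self.plain = [torch.zeros(4)]                  # plain list: not visited by .to()/.cuda()

demo = Demo().to(torch.float64)
print(demo.weight.dtype)    # torch.float64
print(demo.state.dtype)     # torch.float64
print(demo.plain[0].dtype)  # torch.float32, left untouched
```

The same split applies to device moves: after `.cuda()`, `weight` and `state` would be on the GPU, while the tensors in `plain` would stay where they were created.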

111137 · August 30, 2019, 6:05pm

@spanev -san,

I do the following:

device = device_("cuda" if cuda.is_available() else "cpu")
model = Model().to(device)

And the Model does not take a device argument.

spanev · August 30, 2019, 6:10pm

What is device_ here?

Can’t you change the model’s __init__ and add a device argument, as done in my snippet above?

Also, about the Colab you are running on: the default Tensor device might have been set there with:

torch.set_default_tensor_type(torch.cuda.FloatTensor)

but I wouldn’t recommend relying on this.

111137 · August 30, 2019, 6:26pm

It is:

from torch import device as device_

By “the Model does not take a device argument” I mean: when the device is passed as an argument, how do I use it in __init__?

ptrblck · August 30, 2019, 6:47pm

Since you are initializing the tensors with requires_grad=True, you should wrap them into nn.Parameter, so that they will be properly registered in the state_dict and will be automatically pushed to the device, if you call model.to(device).

Also, since you are storing these parameters in a list, use ParameterList, as a plain Python list won’t register the parameters properly.

This should work:

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.w_x   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.sel   = nn.ParameterList([nn.Parameter(torch.zeros(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.x     = nn.ParameterList([nn.Parameter(torch.zeros(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.h     = nn.ParameterList([nn.Parameter(torch.zeros(NUM_INPUT)) for _ in range(NUM_HIDDEN)])

model = Model()
model.cuda()
print(model.parameters())
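
As a rough illustration of why the plain Python list matters, the sketch below counts the parameters a module reports in both cases. The class names `PlainList` and `Registered` are hypothetical; the constants use the same values as the asker's script. Note that `model.parameters()` returns a generator, so it has to be wrapped in `list(...)` to see its contents.

```python
import torch
from torch import nn

NUM_INPUT, NUM_HIDDEN = 28, 28  # same values as in the asker's script

class PlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list hides the parameters from the module.
        self.w_x = [nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)]

class Registered(nn.Module):
    def __init__(self):
        super().__init__()
        # ParameterList registers every entry with the module.
        self.w_x = nn.ParameterList(
            [nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)]
        )

n_plain = len(list(PlainList().parameters()))
n_registered = len(list(Registered().parameters()))
print(n_plain, n_registered)  # 0 28
```

Only the registered variant would be seen by an optimizer, by `state_dict()`, and by `model.to(device)`.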

111137 · August 30, 2019, 7:12pm

@ptrblck -san,

I updated the code following your advice, and I got another error:

  File "QNet.py", line 62, in q_sel
    self.sel[index] = torch.sigmoid(torch.add(torch.mul(self.w_x[index], self.x[index]), torch.mul(self.w_h[index], self.h[index])))
RuntimeError: expected device cuda:0 and dtype Float but got device cpu and dtype Float

Does this mean that the setup on my local PC is incorrect?

ptrblck · August 30, 2019, 7:15pm

I’m not sure what went wrong, but try checking the device of all parameters after calling model.to('cuda'):

for name, param in model.named_parameters():
    if param.device.type != 'cuda':
        print('param {}, not on GPU'.format(name))

This should print the name of any parameter that is not on the GPU.
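
The loop above only inspects registered parameters; tensors kept as plain attributes or inside Python lists are not covered by it. A possible extension is sketched below. The helper `tensors_off_device` and the class `Toy` are hypothetical names introduced for this example, and the helper only looks at direct attributes of the module it is given, not at those of child modules.

```python
import torch
from torch import nn

def tensors_off_device(module, device_type):
    """Return names of tensors held by `module` that are not on `device_type`."""
    off = []
    # Registered tensors: parameters and buffers (child modules included).
    for name, t in list(module.named_parameters()) + list(module.named_buffers()):
        if t.device.type != device_type:
            off.append(name)
    # Unregistered tensors: direct attributes, or lists/tuples of tensors.
    for name, value in vars(module).items():
        if name.startswith("_"):
            continue  # skip nn.Module's internal bookkeeping dicts
        is_seq = isinstance(value, (list, tuple))
        for i, item in enumerate(value if is_seq else [value]):
            if torch.is_tensor(item) and item.device.type != device_type:
                off.append("{}[{}]".format(name, i) if is_seq else name)
    return off

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.randn(3))
        self.x = [torch.zeros(3), torch.zeros(3)]

print(tensors_off_device(Toy(), "cpu"))   # []
print(tensors_off_device(Toy(), "cuda"))  # ['w', 'x[0]', 'x[1]']
```

On a GPU machine, calling this after `model.cuda()` would be expected to list only the unregistered entries such as `x[0]`, since those are the ones the conversion does not move.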

111137 · August 30, 2019, 7:18pm

It didn’t print any parameters, so they are probably on the GPU.

Ah, simply calling

print(model.parameters())

also did not work. So the error is indeed raised inside the function “q_sel”, before the model runs.

ptrblck · August 30, 2019, 7:24pm

I’m not sure I understand the sentence correctly, but is it working now?

111137 · August 30, 2019, 7:26pm

It does not work.

I use Jupyter Notebook. Running the cell with class Model() calls the “q_sel” function at:

File "QNet.py", line 63, in q_sel
    self.sel[index] = torch.sigmoid(self.w_x[index] * self.x[index] + self.w_h[index] * self.h[index])

Then the error is raised.

ptrblck · August 30, 2019, 7:28pm

Is this function called when you initialize the model?
Could you post a reproducible code snippet so that we can have a look?

111137 · August 30, 2019, 8:19pm

This version does not reproduce the error:

#!/usr/bin/env python
# coding: utf-8

from torchvision import datasets, transforms, models
import torch 
from torch import nn, optim, utils, device as device_, cuda
import numpy as np

# Architecture Hyper-Parameters
NUM_INPUT      = 28
TIME_STEPS     = 28
NUM_CLASS      = 10
NUM_HIDDEN     = 28

def q_sel (self):
  for index in range(NUM_HIDDEN):
    self.sel[index] = torch.sigmoid(self.w_x[index] * self.x[index] + self.w_h[index] * self.h[index])

def q_layer (self):
  q_sel(self)

def fw_prop (self):
  q_layer(self)

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        
        # Gate-Weight
        self.w_x   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
    
        # Gate-Selector
        self.sel   = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]
        
        # Input Vector
        self.x     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

        # Output Vector
        self.h     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

    def forward(self):
      fw_prop(self)

model = Model()
model.cuda()

print(model.parameters())
for name, param in model.named_parameters():
    if param.device.type != 'cuda':
        print('param {}, not on GPU'.format(name))

The output is:

<generator object Module.parameters at 0x7fedc4fd9b48>

So this works properly.

I located where the error is raised by adding print() calls to the original code.
Calling model() after model.train() also calls this function:

def q_sel (self):
  for index in range(NUM_HIDDEN):
    print("2")
    self.sel[index] = torch.sigmoid(self.x[index] * self.w_x[index] + self.h[index] * self.w_h[index])
    print("3")

The error is raised after the first print("2"). So I think the code could not send the data of self.x, self.w_x, self.h, and/or self.w_h to the GPU.

111137 · August 30, 2019, 8:56pm

@ptrblck -san
I reproduced it:

#!/usr/bin/env python
# coding: utf-8

from torchvision import datasets, transforms, models
import torch 
from torch import nn, optim, utils, device as device_, cuda
import numpy as np

# Architecture Hyper-Parameters
NUM_INPUT      = 28
TIME_STEPS     = 28
NUM_CLASS      = 10
NUM_HIDDEN     = 28

BATCH_SIZE     = 1024
EPOCH          = 64

def one_hot_embedding (y, length):
  out = torch.zeros(length)
  out[y] = 1.0
  return out

def q_sel (self):
  for index in range(NUM_HIDDEN):
    self.sel[index] = torch.sigmoid(self.w_x[index] * self.x[index] + self.w_h[index] * self.h[index])

def q_layer (self):
  q_sel(self)

def fw_prop (self):
  q_layer(self)

dataset_train = datasets.MNIST(
    '~/mnist', 
    train=True, 
    download=True, 
    transform=transforms.ToTensor())

dataloader_train = utils.data.DataLoader(dataset_train,
                                          batch_size=BATCH_SIZE,
                                          shuffle=True,
                                          num_workers=4)

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        
        # Gate-Weight
        self.w_x   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
    
        # Gate-Selector
        self.sel   = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]
        
        # Input Vector
        self.x     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

        # Output Vector
        self.h     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

    def forward(self):
      fw_prop(self)

model = Model()
model.cuda()

print(model.parameters())
for name, param in model.named_parameters():
    if param.device.type != 'cuda':
        print('param {}, not on GPU'.format(name))

for epoch in range(EPOCH):
  for x, t in dataloader_train:
    y = one_hot_embedding(t, NUM_CLASS)
    for time in range(TIME_STEPS):
     
      model.x[0] = x[0][0][time] 
      model.h[0] = torch.zeros(NUM_INPUT)
      model.zero_grad()
      out = model()

This produces the error:

  File "test.py", line 24, in q_sel
    self.sel[index] = torch.sigmoid(self.w_x[index] * self.x[index] + self.w_h[index] * self.h[index])
RuntimeError: expected device cuda:0 and dtype Float but got device cpu and dtype Float

So the error is indeed caused by data not being sent to the GPU.

ptrblck · August 30, 2019, 9:18pm

sel, x and h are not wrapped in nn.Parameter and nn.ParameterList.
Have a look at my code snippet.

111137 · August 30, 2019, 9:56pm

The code does:

      model.x[0] = x[0][0][time] 
      model.h[0] = torch.zeros(NUM_INPUT)

self.sel, self.x and self.h are not parameters.
So I think wrapping them as parameters does not allow assigning plain tensors to them.
It produces this error:

in register_parameter
    .format(torch.typename(param), name))
TypeError: cannot assign 'torch.FloatTensor' object to parameter '0' (torch.nn.Parameter or None required)

I tried:

        self.sel   = [torch.zeros(NUM_INPUT, device='cuda') for _ in range(NUM_HIDDEN)]
        self.x     = [torch.zeros(NUM_INPUT, device='cuda') for _ in range(NUM_HIDDEN)]
        self.h     = [torch.zeros(NUM_INPUT, device='cuda') for _ in range(NUM_HIDDEN)]

In the training loop:

      model.x[0] = torch.cuda.FloatTensor(x_)
      model.h[0] = torch.cuda.FloatTensor(torch.zeros(NUM_INPUT))

This causes a segmentation fault.

ptrblck · August 30, 2019, 10:45pm

Thanks for the information. I missed that these tensors should not require gradients.
In that case, you could register them as buffers using:

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.w_x   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.register_buffer('sel', torch.stack([torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]))
        self.register_buffer('x', torch.stack([torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]))
        self.register_buffer('h', torch.stack([torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]))
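
A minimal sketch of how such buffers could be filled during training is shown below, assuming the per-step input is written in place with `copy_` rather than by constructing a `torch.cuda.FloatTensor`. The `forward` here returns the gate values instead of storing them in `sel`, and `row` is a random stand-in for one image row; both are choices made for this example, not the asker's code. The sketch makes no claim about the cause of the segmentation fault.

```python
import torch
from torch import nn

NUM_INPUT, NUM_HIDDEN = 28, 28  # same values as in the asker's script

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.w_x = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        # Buffers follow the module on .to(device) / .cuda() but get no gradients.
        self.register_buffer("x", torch.zeros(NUM_HIDDEN, NUM_INPUT))
        self.register_buffer("h", torch.zeros(NUM_HIDDEN, NUM_INPUT))

    def forward(self):
        # Return the gate values instead of storing them on the module.
        return torch.stack([
            torch.sigmoid(self.w_x[i] * self.x[i] + self.w_h[i] * self.h[i])
            for i in range(NUM_HIDDEN)
        ])

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Model().to(device)

row = torch.rand(NUM_INPUT)  # stand-in for one image row, created on the CPU
model.x[0].copy_(row)        # in-place write; copy_ also handles a CPU-to-GPU transfer
model.h[0].zero_()           # reset in place, no new tensor needed

out = model()
print(out.shape)  # torch.Size([28, 28])
```

Because the writes happen in place on the registered buffers, `model.x` and `model.h` stay on whatever device the model was moved to.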

111137 · August 31, 2019, 6:03am

I did that and wrote:

model.x[0] = torch.cuda.FloatTensor(s)

A segmentation fault then occurs on that line.