[Resolved] RuntimeError: expected device cpu and dtype Float but got device cuda:0 and dtype Float


My code works fine on Colaboratory with CUDA.
Now I am trying to run it in my local environment.
My environment is:

CUDA: 10.1
cuDNN: 7

Today I installed PyTorch following the pip3 instructions.
After setting up, I get this error:

RuntimeError: expected device cpu and dtype Float but got device cuda:0 and dtype Float

Actually, my code sets device("cuda"); I want it to run on the NVIDIA GPU.
I cannot understand why the script tries to run on the CPU, while Colab gives no error (indeed, it runs on the CPU there).
In addition, I replaced the operators with torch's operators, but I still get the error.
Could you tell me the reason and the solution?

Can you share the concerned part of the code with us?


Thank you for your reply.
The error message is:

Traceback (most recent call last):
  File "QNet.py", line 203, in <module>
    out = model()
  File "/home/syouyu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "QNet.py", line 164, in forward
  File "QNet.py", line 70, in fw_prop
  File "QNet.py", line 60, in q_layer
  File "QNet.py", line 53, in q_sel
    self.sel[index] = torch.sigmoid(torch.add(torch.mul(self.w_x[index], self.x[index]), torch.mul(self.w_h[index], self.h[index])))
RuntimeError: expected device cpu and dtype Float but got device cuda:0 and dtype Float

The tensors are defined as:

class Model(nn.Module):
    def __init__(self):
        self.w_x   = [torch.randn((NUM_INPUT), requires_grad=True) for _ in range(NUM_HIDDEN)]
        self.w_h   = [torch.randn((NUM_INPUT), requires_grad=True) for _ in range(NUM_HIDDEN)]
        self.sel   = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]
        self.x     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]
        self.h     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

When you convert your model with model.cuda(), plain tensors stored as attributes are not converted: only registered parameters, buffers, and child Modules are moved automatically.

You will have to specify the device manually when creating your tensors:

class Model(nn.Module):
    def __init__(self, device):
        super(Model, self).__init__()
        self.w_x   = [torch.randn((NUM_INPUT), requires_grad=True, device=device) for _ in range(NUM_HIDDEN)]
        self.w_h   = [torch.randn((NUM_INPUT), requires_grad=True, device=device) for _ in range(NUM_HIDDEN)]
        self.sel   = [torch.zeros(NUM_INPUT, device=device) for _ in range(NUM_HIDDEN)]
        self.x     = [torch.zeros(NUM_INPUT, device=device) for _ in range(NUM_HIDDEN)]
        self.h     = [torch.zeros(NUM_INPUT, device=device) for _ in range(NUM_HIDDEN)]

Then pass the device to your model:

model = Model(device='cuda:0')
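The point about unregistered tensors can be checked with a small sketch. The class name PlainListModel and the sizes below are hypothetical, chosen only for illustration; the script falls back to the CPU when no GPU is available.

```python
import torch
from torch import nn


class PlainListModel(nn.Module):
    """Hypothetical model that keeps tensors in a plain Python list."""

    def __init__(self):
        super().__init__()
        # Not registered: nn.Module does not track tensors inside a plain list.
        self.w = [torch.randn(4, requires_grad=True) for _ in range(3)]
        # Registered: child modules are tracked automatically.
        self.fc = nn.Linear(4, 2)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = PlainListModel().to(device)

# Only the Linear layer's weight and bias show up as parameters.
registered = [name for name, _ in model.named_parameters()]
print(registered)

# The list entries were created on the CPU and .to(device) did not touch them.
print(model.w[0].device)
print(model.fc.weight.device)
```

On a machine with a GPU, the last two lines should report different devices, which is the kind of mismatch the RuntimeError complains about.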

@spanev -san,

I do the following:

device = device_("cuda" if cuda.is_available() else "cpu")
model = Model().to(device)

And Model does not take a device argument.

What is device_ here?

Can't you change the model's __init__ and add a device argument? As done here:


Also, regarding the Colab you are running on, the default tensor device might have been set with:


but I wouldn’t recommend relying on this.

It is:

from torch import device as device_
And Model does not take a device argument.

What I mean is: when the device is passed as an argument, how do I use it in __init__?

Since you are initializing the tensors with requires_grad=True, you should wrap them into nn.Parameter, so that they will be properly registered in the state_dict and will be automatically pushed to the device, if you call model.to(device).

Also, since you are storing these parameters in a list, use ParameterList, as a plain Python list won’t register the parameters properly.

This should work:

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.w_x   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.sel   = nn.ParameterList([nn.Parameter(torch.zeros(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.x     = nn.ParameterList([nn.Parameter(torch.zeros(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.h     = nn.ParameterList([nn.Parameter(torch.zeros(NUM_INPUT)) for _ in range(NUM_HIDDEN)])

model = Model()
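As a quick check of the registration behaviour described above, the following sketch lists the parameters of a small model built the same way. TinyModel and the reduced sizes are assumptions made for illustration, not the thread's actual values.

```python
import torch
from torch import nn

NUM_INPUT, NUM_HIDDEN = 4, 3  # small illustrative sizes, not the thread's values


class TinyModel(nn.Module):
    """Hypothetical model holding its weights in an nn.ParameterList."""

    def __init__(self):
        super().__init__()
        self.w_x = nn.ParameterList(
            [nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)]
        )


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinyModel().to(device)

# Each list entry is registered under its index.
names = [name for name, _ in model.named_parameters()]
print(names)

# All registered parameters follow the model to the chosen device.
on_device = all(p.device.type == device.type for p in model.parameters())
print(on_device)
```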

@ptrblck -san,

I updated the code following your advice, and I got another error:

  File "QNet.py", line 62, in q_sel
    self.sel[index] = torch.sigmoid(torch.add(torch.mul(self.w_x[index], self.x[index]), torch.mul(self.w_h[index], self.h[index])))
RuntimeError: expected device cuda:0 and dtype Float but got device cpu and dtype Float

Does this mean that the setup on my local PC is incorrect?

I'm not sure what went wrong, but try checking the device of all parameters after calling model.to('cuda'):

for name, param in model.named_parameters():
    if param.device.type != 'cuda':
        print('param {}, not on GPU'.format(name))

This should print a message for every parameter that is not on the GPU.


It didn't print any parameters, so they are probably on the GPU.

Ah, simply


also did not work. So the error is indeed raised when the function "q_sel" is evaluated, before the model is run.

I'm not sure if I understand the sentence correctly, but is it working now?

It does not work.

I use Jupyter Notebook. When I run the cell containing class Model(), it calls the "q_sel" function at:

File "QNet.py", line 63, in q_sel
    self.sel[index] = torch.sigmoid(self.w_x[index] * self.x[index] + self.w_h[index] * self.h[index])

This then raises the error.

Is this function called when you initialize the model?
Could you post a reproducible code snippet so that we can have a look?

Not reproducible;

#!/usr/bin/env python
# coding: utf-8

from torchvision import datasets, transforms, models
import torch 
from torch import nn, optim, utils, device as device_, cuda
import numpy as np

# Architecture Hyper-Parameters
NUM_INPUT      = 28
TIME_STEPS     = 28
NUM_CLASS      = 10
NUM_HIDDEN     = 28

def q_sel (self):
  for index in range(NUM_HIDDEN):
    self.sel[index] = torch.sigmoid(self.w_x[index] * self.x[index] + self.w_h[index] * self.h[index])

def q_layer (self):

def fw_prop (self):

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Gate-Weight
        self.w_x   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        # Gate-Selector
        self.sel   = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]
        # Input Vector
        self.x     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

        # Output Vector
        self.h     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

    def forward(self):

model = Model()

for name, param in model.named_parameters():
    if param.device.type != 'cuda':
        print('param {}, not on GPU'.format(name))

The output is:

<generator object Module.parameters at 0x7fedc4fd9b48>

So it works properly.

I identified where the error occurs by adding print() calls to the original code.
When model() is called after model.train(), it also calls this function:

def q_sel (self):
  for index in range(NUM_HIDDEN):
    self.sel[index] = torch.sigmoid(self.x[index] * self.w_x[index] + self.h[index] * self.w_h[index])

The error comes after the first print("2"). So I think the code could not send the data of self.x, self.w_x, self.h, and/or self.w_h.

@ptrblck -san
I did it;

#!/usr/bin/env python
# coding: utf-8

from torchvision import datasets, transforms, models
import torch 
from torch import nn, optim, utils, device as device_, cuda
import numpy as np

# Architecture Hyper-Parameters
NUM_INPUT      = 28
TIME_STEPS     = 28
NUM_CLASS      = 10
NUM_HIDDEN     = 28

BATCH_SIZE     = 1024
EPOCH          = 64

def one_hot_embedding (y, length):
  out = torch.zeros(length)
  out[y] = 1.0
  return out

def q_sel (self):
  for index in range(NUM_HIDDEN):
    self.sel[index] = torch.sigmoid(self.w_x[index] * self.x[index] + self.w_h[index] * self.h[index])

def q_layer (self):

def fw_prop (self):

dataset_train = datasets.MNIST(

dataloader_train = utils.data.DataLoader(dataset_train,

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Gate-Weight
        self.w_x   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        # Gate-Selector
        self.sel   = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]
        # Input Vector
        self.x     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

        # Output Vector
        self.h     = [torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]

    def forward(self):

model = Model()

for name, param in model.named_parameters():
    if param.device.type != 'cuda':
        print('param {}, not on GPU'.format(name))

for epoch in range(EPOCH):
  for x, t in dataloader_train:
    y = one_hot_embedding(t, NUM_CLASS)
    for time in range(TIME_STEPS):
      model.x[0] = x[0][0][time] 
      model.h[0] = torch.zeros(NUM_INPUT)
      out = model()

This raises the error:

  File "test.py", line 24, in q_sel
    self.sel[index] = torch.sigmoid(self.w_x[index] * self.x[index] + self.w_h[index] * self.h[index])
RuntimeError: expected device cuda:0 and dtype Float but got device cpu and dtype Float

So the error is indeed caused by data not being sent.
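One possible reading of this error is that the tensors assigned inside the training loop are created on the CPU while the weights live on the GPU. The sketch below shows the general pattern of moving inputs to the weights' device before combining them. The names w_x, w_h, x_cpu and the size 4 are stand-ins chosen for illustration, not taken from the full script.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-ins for weight vectors that already live on the model's device.
w_x = torch.randn(4, device=device)
w_h = torch.randn(4, device=device)

# Stand-in for one row of a batch, created on the CPU
# (as a DataLoader would typically deliver it).
x_cpu = torch.randn(4)

# Move the input to the same device as the weights, and create the
# hidden state directly on that device.
x = x_cpu.to(device)
h = torch.zeros(4, device=device)

sel = torch.sigmoid(w_x * x + w_h * h)
print(sel.device)
```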

sel, x and h are not wrapped in nn.Parameter and nn.ParameterList.
Have a look at my code snippet.

The code does:

      model.x[0] = x[0][0][time] 
      model.h[0] = torch.zeros(NUM_INPUT)

self.sel, self.x, and self.h are not parameters.
So I think wrapping them as parameters does not allow assignment.
It raises this error:

in register_parameter
    .format(torch.typename(param), name))
TypeError: cannot assign 'torch.FloatTensor' object to parameter '0' (torch.nn.Parameter or None required)

I tried:

        self.sel   = [torch.zeros(NUM_INPUT, device='cuda') for _ in range(NUM_HIDDEN)]
        self.x     = [torch.zeros(NUM_INPUT, device='cuda') for _ in range(NUM_HIDDEN)]
        self.h     = [torch.zeros(NUM_INPUT, device='cuda') for _ in range(NUM_HIDDEN)]

In the training loop:

      model.x[0] = torch.cuda.FloatTensor(x_)
      model.h[0] = torch.cuda.FloatTensor(torch.zeros(NUM_INPUT))

This causes a segmentation fault.

Thanks for the information. I missed that these tensors should not require gradients.
In that case, you could register them as buffers using:

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.w_x   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.w_h   = nn.ParameterList([nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)])
        self.register_buffer('sel', torch.stack([torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]))
        self.register_buffer('x', torch.stack([torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]))
        self.register_buffer('h', torch.stack([torch.zeros(NUM_INPUT) for _ in range(NUM_HIDDEN)]))
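A minimal sketch of how such a buffer could be used, assuming the goal is to overwrite one row of x per time step. BufferModel, the reduced sizes, and the in-place copy_ update are illustrative assumptions, not the asker's code.

```python
import torch
from torch import nn

NUM_INPUT, NUM_HIDDEN = 4, 3  # small illustrative sizes, not the thread's values


class BufferModel(nn.Module):
    """Hypothetical model with weights as parameters and state as a buffer."""

    def __init__(self):
        super().__init__()
        self.w_x = nn.ParameterList(
            [nn.Parameter(torch.randn(NUM_INPUT)) for _ in range(NUM_HIDDEN)]
        )
        # A buffer is saved in the state_dict and moved by .to(device),
        # but it is not a parameter and does not require gradients.
        self.register_buffer("x", torch.zeros(NUM_HIDDEN, NUM_INPUT))


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = BufferModel().to(device)

# Stand-in for one input row, created on the CPU.
row = torch.ones(NUM_INPUT)

# Copy the values into the existing buffer row in place;
# copy_ also handles the transfer between devices.
model.x[0].copy_(row)

print(model.x.device)
print("x" in dict(model.named_buffers()))
print("x" in dict(model.named_parameters()))
```

Writing into the existing buffer keeps the tensor on whatever device the model was moved to, so the training loop does not need to construct device-specific tensors by hand.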

I did that, and then ran:

model.x[0] = torch.cuda.FloatTensor(s)

A segmentation fault then occurs on that line.
