RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

I am getting the above error (mentioned in the title). I was wondering what this issue even means. My guess is that I am mixing the CPU and GPU, which is causing the error. But how can the problem lie only in the weight data? Does this mean I have to convert all of my tensors with tensor.cuda()?


It seems that your input tensors are already on the GPU, but your model isn’t. Did you push your model to the GPU as well?
If you did not, this is likely causing the error. If you did, you should probably post the code, since without it we can only guess.
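To illustrate with a minimal sketch (a toy nn.Linear model, not the code from this thread): the error appears as soon as the input tensor and the model’s parameters live on different devices, and moving the model fixes it.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)        # parameters start out as torch.FloatTensor (CPU)
x = torch.randn(1, 4)

if torch.cuda.is_available():
    x = x.cuda()               # input becomes torch.cuda.FloatTensor...
    # calling model(x) now would raise the RuntimeError above,
    # because the weights are still on the CPU
    model = model.cuda()       # ...so the model has to follow

y = model(x)                   # devices match again
print(y.shape)                 # torch.Size([1, 2])
```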


Editing… I need to format this correctly, sorry.

Here is the code. I feel like the error might be caused when I write layers += [RowLSTM()], since this has not been pushed to the GPU yet. Sorry for the length, but the error could be anywhere, so here it all is.

import math
import torch
import torch.nn as nn
import torch.nn.init as init

__all__ = [
  'VGG', 'vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn', 'vgg16', 'vgg16_bn',
  'vgg19_bn', 'vgg19',
]

class VGG(nn.Module):
  """VGG model"""
  def __init__(self, features): # features represents the layers array
      super(VGG, self).__init__()
      self.features = features
      self.classifier = nn.Sequential(
          nn.Linear(512, 512),
          nn.Linear(512, 10),
      )
      # Initialize weights
      for m in self.modules():
          if isinstance(m, nn.Conv2d):
              n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
    , math.sqrt(2. / n))

  def forward(self, x): # x is the image, we run x through the layers
      x = self.features(x) # runs through all features, where each feature is a function
      x = x.view(x.size(0), -1) 
      # after running through features, does sequential steps to finally classify
      x = self.classifier(x)
      # print(x)
      return x

def make_layers(cfg, batch_norm=False):
  # print("Making layers!")
  layers = []
  in_channels = 3
  for v in cfg:
      if v == 'M':
          layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
      else:
          conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
          if batch_norm:
              layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
          else:
              layers += [conv2d, nn.ReLU(inplace=True)]
          in_channels = v

  return nn.Sequential(*layers)

class RLSTM(nn.Module):
  def __init__(self):
      super(RLSTM, self).__init__()

  def forward(self, image):
      print("going in rowlstm")
      global current
      global _layer
      global isgates
      size = image.size()
      b = size[0]
      indvs = list(image.split(1, 0)) # split up the batch into individual images
      tensor_array = []
      for i in range(b):
          current = 0
          _layer = []
          isgates = []
          tensor_array.append(self.RowLSTM(indvs[i])) # run each image through the row LSTM

      trans =, 0) # re-assemble the batch
      return trans.cuda() # trying to make the FloatTensor error go away
  def RowLSTM(self, image):
      global current
      global _layer
      global isgates

      # input-to-state (K_is * x_i): 1x3 convolution. generates a 4h x n x n tensor
      # that contains all input-to-state info

      # the input-to-state convolution should only be computed one time
      if current == 0:
          n = image.size()[2]
          ch = image.size()[1]
          input_to_state = torch.nn.Conv2d(ch, 4*ch, kernel_size=(1,3), padding=(0,1))
          isgates = self.splitIS(input_to_state(image)) # convolve, then split into gates (4 per row)
          # now have dummy, learnable variables for first row
      else:
          Cell_prev = _layer[current-1] # access previous row
          hidPrev = Cell_prev.getHiddenState()
          ch = image.size()[1]
      #   print("about to apply conv1d")
          state_to_state = torch.nn.Conv2d(ch, 4*ch, kernel_size=(1,3), padding=(0,1)) # error is here: hidPrev is an array - not a valid number of input channels
      #   print("applied conv1d")
          ssgates = self.splitSS(state_to_state(hidPrev.unsqueeze(0))) # need to unsqueeze (e.g. currently 16x5, need 1x16x5)
          gates = self.addGates(isgates, ssgates, current)
          # split gates
          ig, og, fg, gg = gates[0], gates[1], gates[2], gates[3] # into four; sigmoid is applied in addGates
          cell = RowLSTMCell(Cell_prev, ig, og, fg, gg, 0, 0)
          _layer.append(cell) # keep the row so the next recursion can access it
      # attempting to eliminate requirement of getting size
      try:
          current += 1
          return self.RowLSTM(image) # recurse to process the next row
      except Exception as error:
          tensor = torch.stack([cell.getHiddenState() for cell in _layer]) # collect the rows back into one tensor
          return tensor

  def splitIS(self, tensor): # always going to be splitting into 4 pieces, so no need to add extra parameters
      size = tensor.size() # 1 x 4h x n x n
      out_ft = size[1] # get 4h for the nxnx4h tensor
      num = size[2] # get n for the nxn image
      hh = out_ft // 4 # we want to split the tensor into 4, for the gates
      tensor = torch.squeeze(tensor) # 4h x n x n

      # First, split by row: creates n tensors of 4h x n x 1
      rows = list(tensor.split(1, 2))

      inputStateGates = {}
      for i in range(num):
          # Each row is a tensor of 4h x n x 1; split it into 4 of h x n x 1
          inputStateGates[i] = list(rows[i].split(hh, 0))
      return inputStateGates

  def splitSS(self, tensor): # 1 x 4h x n x 1, create 4 of 1 x h x n x 1
      size = tensor.size()
      out_ft = size[1] # get 4h for the 1x4hxn tensor
      num = size[2] # get n for the 1xhxn row
      hh = out_ft // 4 # we want to split the tensor into 4, for the gates
      tensor = tensor.squeeze(0) # 4h x n x 1
      splitted = list(tensor.split(hh, 0))
      return splitted

  def addGates(self, i2s, s2s, key):
      """ these dictionaries are of form {key : [[i], [o], [f], [g]]}
          we want to add pairwise elements """

      # i2s is of form key: [[i], [o], [f], [g]] where each gate is hxn
      # s2s is of form [[h,n], [h,n], [h,n], [h,n]]
      gateSum = []
      for i in range(4): # always of length 4, representing the gates
          gateSum.append(torch.sigmoid(i2s[key][i] + s2s[i]))

      return gateSum
cfg = {
  'A': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
  'B': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
  'D': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
  'E': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M',
        512, 512, 512, 512, 'M'],
}
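One thing worth flagging in the code above: the Conv2d layers inside RowLSTM are constructed during the forward pass, so they are fresh, unregistered CPU modules on every call, and .cuda() on the parent model can never move them. A minimal sketch of the difference (class names and sizes are made up for illustration):

```python
import torch
import torch.nn as nn

class Broken(nn.Module):
    def forward(self, x):
        ch = x.size(1)
        # created on the fly: this conv stays on the CPU even after Broken().cuda(),
        # and its parameters are invisible to .parameters() / .cuda()
        conv = nn.Conv2d(ch, 4 * ch, kernel_size=(1, 3), padding=(0, 1))
        return conv(x)

class Fixed(nn.Module):
    def __init__(self, ch):
        super().__init__()
        # registered as a submodule: moved along with the model
        self.input_to_state = nn.Conv2d(ch, 4 * ch, kernel_size=(1, 3), padding=(0, 1))

    def forward(self, x):
        return self.input_to_state(x)

print(len(list(Broken().parameters())))   # 0 -- nothing to move or train
print(len(list(Fixed(3).parameters())))   # 2 -- weight and bias
```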

I have faced a similar error as well, and would be thankful for any help and advice.

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same
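Note that the direction here is the opposite of the original post: the weights are already torch.cuda.FloatTensor, so it is the input that still needs moving. A minimal sketch (toy model, guarded so it also runs on CPU-only machines):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
if torch.cuda.is_available():
    model = model.cuda()       # weights become torch.cuda.FloatTensor

x = torch.randn(1, 4)          # still a CPU tensor
if torch.cuda.is_available():
    # calling model(x) here would raise the error quoted above
    x = x.cuda()               # move the input to match the weights

print(model(x).shape)          # torch.Size([1, 2])
```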


can you post a minimal example to reproduce it? Typically something hasn’t been pushed to the correct device (usually the GPU).

You need to convert your input to cuda before feeding it to the model.
Let’s say:

model = VGG16()

for inp in dataset:
    x = inp.cuda()
    y = model(x)

You also need to move your model to the GPU:

model = model.cuda()

Thanks very much! My problem was fixed because of your idea!

I got the same problem, and later found that nn.ModuleList should be used to properly implement a list of modules in PyTorch; otherwise, the submodules can’t be recognized (or moved by .cuda()). Hope this helps.
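A minimal sketch of that fix (the layer sizes are invented): submodules stored in a plain Python list are invisible to .parameters() and to .cuda()/.to(), while nn.ModuleList registers them properly.

```python
import torch.nn as nn

class PlainList(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(4, 4) for _ in range(3)]   # not registered!

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(4, 4) for _ in range(3))

print(len(list(PlainList().parameters())))       # 0 -- .cuda() would skip these layers
print(len(list(WithModuleList().parameters())))  # 6 -- 3 weights + 3 biases, moved with the model
```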


Thanks. Fixed my problem.

I have the same issue. Please, I need your help.

The error is raised if either your data or your model is not pushed to the device while the other one is.
You should make sure that both are on the GPU using:

model ='cuda')
data ='cuda')

as explained in this topic.
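When it is not obvious which side is misplaced, printing the devices before the forward pass narrows it down quickly (a small sketch with a made-up model; it falls back to the CPU when no GPU is present):

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = nn.Linear(8, 2).to(device)
data = torch.randn(5, 8).to(device)

# both must report the same device, otherwise the RuntimeError follows
print(next(model.parameters()).device)
print(data.device)
```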


Thanks a lot! Worked.

Thanks for this suggestion, this was my issue.

model.cuda() # not a good approach

is not what I would tell you to do…
You may use:

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

then push the model to the device, not to cuda directly:

model = # this is the correct approach
# otherwise it will not be reproducible on CPU-only machines
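Putting that advice together, a minimal end-to-end sketch of the device-agnostic pattern (the model itself is a made-up stand-in):

```python
import torch
import torch.nn as nn

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2)).to(device)
x = torch.randn(4, 8).to(device)   # inputs must follow the model to the same device
out = model(x)                     # no device mismatch, on GPU or CPU alike
print(out.shape)                   # torch.Size([4, 2])
```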

Thank you so much. Great.