DataLoader overrides tensor type?

Hi all,
I am just getting started with PyTorch, and I am running into the following issue (which ultimately results in a “TypeError: torch.addmm received an invalid combination of arguments” when I run it with a network).

The problem is that although I cast the data to float when converting the numpy arrays to torch tensors, the DataLoader seems to reset the target’s type to double in the mini-batches, i.e. the mini-batch types do not match the original types (see the example below). What is the correct way to make the types the same for mini-batch data and labels?

import sklearn.datasets
import torch
import torch.utils.data

boston = sklearn.datasets.load_boston()

# Convert the numpy float64 arrays to float32 tensors.
x = torch.from_numpy(boston.data).float()
y = torch.from_numpy(boston.target).float()
print(type(x))
print(type(y))

dataset = torch.utils.data.TensorDataset(x, y)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=5, shuffle=True)

# Inspect the types of the first mini-batch.
for x_mini, y_mini in dataloader:
    print(type(x_mini))
    print(type(y_mini))
    break

This outputs:

    <class 'torch.FloatTensor'>
    <class 'torch.FloatTensor'>
    <class 'torch.FloatTensor'>
    <class 'torch.DoubleTensor'>

(I would expect the same output, <class 'torch.FloatTensor'>, all four times.)

Try reshaping your target:

y = torch.from_numpy(boston.target.reshape(-1, 1)).float()

Because your target is 1-dimensional, indexing the dataset returns each target as a plain Python float, and the DataLoader’s default collate function packs those numbers into a DoubleTensor. With the extra dimension, each target is itself a FloatTensor, so the mini-batch keeps the float type.
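For reference, here is a minimal sketch of the fix applied to the full snippet above, assuming an older PyTorch version where indexing a 1-d tensor returns a Python number (as the output above indicates):

import sklearn.datasets
import torch
import torch.utils.data

boston = sklearn.datasets.load_boston()

x = torch.from_numpy(boston.data).float()
# Reshape the 1-d target to (N, 1) so each sample's target is itself a
# FloatTensor rather than a Python float, and collation preserves the type.
y = torch.from_numpy(boston.target.reshape(-1, 1)).float()

dataset = torch.utils.data.TensorDataset(x, y)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=5, shuffle=True)

for x_mini, y_mini in dataloader:
    print(type(x_mini))  # expected: <class 'torch.FloatTensor'>
    print(type(y_mini))  # expected: <class 'torch.FloatTensor'>
    break

Alternatively, you could cast inside the loop with y_mini = y_mini.float(), but reshaping the target once keeps the dataset itself consistent.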
