Getting AssertionError while converting ndarray to tensor

I have made train and validation splits of my data using sklearn. The splits are returned as NumPy ndarrays, so I convert them to tensors before building the data loaders, but I am getting an AssertionError:

import torch
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader


x_tr = torch.tensor(x_tr, dtype=torch.long)
y_tr = torch.tensor(y_tr, dtype=torch.float32)
Train = TensorDataset(x_tr, y_tr)
Trainloader = DataLoader(Train, batch_size=128)

x_valid2 = torch.tensor(x_valid2, dtype=torch.long)
y_valid2 = torch.tensor(y_valid2, dtype=torch.float32)
valid2 = TensorDataset(x_valid2, y_valid2)
validloader2 = DataLoader(valid2, batch_size=128)
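Since the actual array shapes aren't posted, here is a minimal sanity check (with made-up shapes and the same variable names as the snippet) that prints the first dimension of each split before wrapping it in a TensorDataset:

```python
import numpy as np

# Hypothetical stand-ins for the arrays returned by the sklearn split.
x_tr = np.random.randint(0, 100, size=(800, 20))
y_tr = np.random.rand(800).astype(np.float32)

# TensorDataset requires the first dimension (sample count) to match.
print(x_tr.shape[0], y_tr.shape[0])
assert x_tr.shape[0] == y_tr.shape[0], "sample counts differ"
```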

Error is as follows:
AssertionError Traceback (most recent call last)
in ()
32 x_tr = torch.tensor(x_tr, dtype=torch.long)
33 y_tr = torch.tensor(y_tr, dtype=torch.float32)
---> 34 Train = TensorDataset(x_tr, y_tr)
35 Trainloader = DataLoader(Train, batch_size=128)
36

/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataset.py in __init__(self, *tensors)
    156 
    157     def __init__(self, *tensors):
--> 158         assert all(tensors[0].size(0) == tensor.size(0) for tensor in tensors)
    159         self.tensors = tensors
    160 

AssertionError: 

I also tried converting the ndarrays to tensors using torch.numpy() to mitigate the issue, but the same error at TensorDataset, before the DataLoader, persists.
Any help is appreciated.

Hi,

You need to use torch.from_numpy() to convert a NumPy array to a tensor properly.

The error you see is because TensorDataset expects all of its input tensors to have the same size along the first dimension (the number of samples). In particular, since the traceback points at the Train dataset, I expect x_tr.size(0) != y_tr.size(0).
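As a minimal sketch (with made-up shapes, since the real ones aren't posted), this is how a first-dimension mismatch triggers exactly that assertion, and how torch.from_numpy works once the sizes agree:

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset

# Hypothetical mismatch: 100 input rows but only 90 labels.
x = torch.from_numpy(np.zeros((100, 10), dtype=np.int64))
y = torch.from_numpy(np.zeros((90,), dtype=np.float32))

# TensorDataset asserts tensors[0].size(0) == tensor.size(0) for every tensor.
try:
    TensorDataset(x, y)
except AssertionError:
    print("size mismatch:", x.size(0), "vs", y.size(0))

# With matching first dimensions the dataset builds fine.
y_fixed = torch.from_numpy(np.zeros((100,), dtype=np.float32))
dataset = TensorDataset(x, y_fixed)
print(len(dataset))  # 100
```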
