This is my data:
| | id | label | tweet |
|---|---|---|---|
| 0 | 1 | 0 | @user when a father is dysfunctional and is so selfish he drags his kids into his dysfunction. #run |
The data is in text format. I have pre-processed it, and now I want to fit a PyTorch LSTM model on it.
To fit the model I have to split the dataset into train, validation, and test sets, and since PyTorch provides the DataLoader utility for loading datasets, I want to use it. But as soon as I do this:
train_data = TensorDataset(torch.from_numpy(np.array(train_x)), torch.from_numpy(np.array(train_y)))
valid_data = TensorDataset(torch.from_numpy(np.array(valid_x)), torch.from_numpy(np.array(valid_y)))
test_data = TensorDataset(torch.from_numpy(np.array(test_x)), torch.from_numpy(np.array(test_y)))
it throws this error:
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, int64, int32, int16, int8, and uint8.
on this line:
----> 4 train_data = TensorDataset(torch.from_numpy(np.array(train_x)),torch.from_numpy(np.array(train_y)))
I have also printed the shape and dtype of the split datasets:
print('shape of training set: {}'.format(train_x.shape))
print('shape of valid set: {}'.format(valid_x.shape))
print('shape of test set: {}'.format(test_x.shape))
shape of training set: (32979,)
shape of valid set: (2910,)
shape of test set: (2910,)
print(train_x.dtype)
object
How can I solve this error? I have tried converting the object type to string or float, but neither worked. I have not found a solution. Any help is appreciated.
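A one-dimensional shape such as `(32979,)` together with dtype `object` is what NumPy produces when each element is itself a Python sequence and the sequences differ in length. The sketch below assumes that is the case here, that is, that each pre-processed tweet is a list of integer token ids (the pre-processing code is not shown, so this is a guess). The helper `pad_features`, the sample ids, and `seq_length=4` are all hypothetical names and values chosen for illustration. It pads or truncates every sequence to one length, which yields a regular `int64` array, one of the dtypes listed in the error message as supported by `torch.from_numpy`.

```python
import numpy as np


def pad_features(sequences, seq_length):
    """Left-pad with zeros (or truncate) each token-id list to seq_length."""
    features = np.zeros((len(sequences), seq_length), dtype=np.int64)
    for i, seq in enumerate(sequences):
        seq = list(seq)[:seq_length]          # truncate long sequences
        if seq:                               # skip empty sequences (row stays all zeros)
            features[i, -len(seq):] = seq     # right-align the ids, zeros on the left
    return features


# Hypothetical encoded tweets: lists of token ids with different lengths.
ragged = np.array([[4, 8, 15], [16, 23], [42, 7, 9, 1, 3]], dtype=object)
print(ragged.shape, ragged.dtype)   # (3,) object

padded = pad_features(ragged, seq_length=4)
print(padded.shape, padded.dtype)   # (3, 4) int64
```

If the assumption holds, passing `pad_features(train_x, seq_length)` instead of `np.array(train_x)` to `torch.from_numpy` should avoid the `TypeError`. If the elements of `train_x` are still raw strings rather than token ids, they would first need to be encoded as integers, since strings cannot be converted to a tensor.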