How to prepare a .tsv file for a PyTorch trainloader

Kasper · May 6, 2020, 7:50am

Hi - new to the forum and new to PyTorch

I have a .tsv file with 120 rows and 800 columns; each row is a sample and each column is a feature. I want to prepare it for a neural network. I have read in the data using pandas:
dataset = pd.read_csv("filepath", sep='\t')

Next, I want to use a trainloader,
torch.utils.data.DataLoader(dataset = dataset)

But I don’t know how to tell the trainloader which column holds my y values (the 1st column) and which are my input features (the rest of the columns).

I have also been looking into the torchtext.data.TabularDataset class, but can’t get it to work either.

Any input is much appreciated.

sathvik_udupa · May 6, 2020, 8:33am

Hi,
Have you gone through the custom dataset PyTorch docs?
They use a dataset-dataloader pipeline in which the dataset is a custom class that loads from a csv file; you can modify it as needed for your .tsv file.
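
For reference, a minimal sketch of that approach might look like the following. It is not taken from the docs: the class name TsvDataset and the small demo file are made up for illustration, and it assumes the file has a header row, the first column holds the target, and every other column is a numeric feature.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader


class TsvDataset(Dataset):
    """Hypothetical dataset: column 0 is the target, the rest are features."""

    def __init__(self, path):
        # Assumption: tab-separated file with a header row.
        frame = pd.read_csv(path, sep="\t")
        # Assumption: first column is y, remaining columns are numeric inputs.
        self.y = torch.tensor(frame.iloc[:, 0].to_numpy(dtype=np.float32))
        self.x = torch.tensor(frame.iloc[:, 1:].to_numpy(dtype=np.float32))

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        # Each item is a (features, target) pair.
        return self.x[idx], self.y[idx]


# Write a small stand-in file (6 samples, 1 target + 4 features)
# so the sketch runs without the real data.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.random((6, 5)), columns=["y", "f1", "f2", "f3", "f4"])
tmp_path = os.path.join(tempfile.mkdtemp(), "demo.tsv")
demo.to_csv(tmp_path, sep="\t", index=False)

dataset = TsvDataset(tmp_path)
trainloader = DataLoader(dataset, batch_size=2, shuffle=True)

for features, targets in trainloader:
    print(features.shape, targets.shape)
```

With the real file you would pass its path instead of tmp_path. If the targets are class labels rather than continuous values, the y tensor would probably need an integer dtype (for example torch.long) to work with a loss such as CrossEntropyLoss.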

Kasper · May 6, 2020, 9:17am

Great, thanks a lot sathvik, that did the trick =)