[Solved] How to use Dataset and DataLoader to process multiple inputs from NumPy arrays?

Alex_Hex · July 19, 2017, 6:49am

With one input x1 and one label y, we can do it this way:

import numpy as np
import torch
import torch.utils.data as Data

total_nb = 100

x1_np = np.random.randn(total_nb, 20)
x2_np = np.random.randn(total_nb, 30)
y_np = np.random.randn(total_nb, 10)

x1 = torch.from_numpy(x1_np)
x2 = torch.from_numpy(x2_np)
y = torch.from_numpy(y_np)

dataset = Data.TensorDataset(data_tensor=x1, target_tensor=y)
data_loader = Data.DataLoader(dataset, batch_size=10, shuffle=True)

for i, j in data_loader:
    print(i.size(), j.size())

But how can this be done with two inputs, x1 and x2, and one label y? I tried:

dataset = Data.TensorDataset(data_tensor=(x1, x2), target_tensor=y)

This does not work…

matliu · July 19, 2017, 7:46am

Would it help to use

np.hstack((x1_np, x2_np))

before converting to a tensor?
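
A minimal sketch of that idea, assuming the shapes from the question (20 and 30 features per sample): stack the two arrays column-wise, convert once, and slice the columns apart again wherever the two inputs are needed separately. The names x_np, x1_part, and x2_part are only illustrative.

```python
import numpy as np
import torch

total_nb = 100
x1_np = np.random.randn(total_nb, 20)
x2_np = np.random.randn(total_nb, 30)

# Concatenate along the feature axis: (100, 20) and (100, 30) -> (100, 50)
x_np = np.hstack((x1_np, x2_np))
x = torch.from_numpy(x_np)

# Recover the two inputs by column slicing, e.g. on each batch in the loop
x1_part = x[:, :20]
x2_part = x[:, 20:]
```

This only works when both inputs have the same number of rows and can share a dtype; inputs with different ranks (say, an image and a vector) would need another approach.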

Otherwise, you can write your own Dataset subclass.
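
A rough sketch of what such a class could look like, assuming the tensors from the question. TwoInputDataset is a hypothetical name, not something provided by PyTorch; the only requirements are __len__ and __getitem__.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class TwoInputDataset(Dataset):
    """Hypothetical dataset that returns (x1, x2, y) for each sample."""

    def __init__(self, x1, x2, y):
        # All tensors must describe the same number of samples
        assert len(x1) == len(x2) == len(y)
        self.x1, self.x2, self.y = x1, x2, y

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        return self.x1[idx], self.x2[idx], self.y[idx]


total_nb = 100
x1 = torch.from_numpy(np.random.randn(total_nb, 20))
x2 = torch.from_numpy(np.random.randn(total_nb, 30))
y = torch.from_numpy(np.random.randn(total_nb, 10))

dataset = TwoInputDataset(x1, x2, y)
data_loader = DataLoader(dataset, batch_size=10, shuffle=True)

# The default collate function batches each element of the tuple separately
batch_shapes = []
for a, b, c in data_loader:
    batch_shapes.append((tuple(a.size()), tuple(b.size()), tuple(c.size())))
print(batch_shapes[0])
```

Note that in more recent PyTorch releases TensorDataset accepts any number of tensors positionally, so TensorDataset(x1, x2, y) may already be enough there; the custom class is mainly useful on older versions or when per-sample processing is needed.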

chsasank · July 19, 2017, 8:25am

You can have a look at either the torchvision source code for the ImageFolder dataset or the data loading tutorial: http://pytorch.org/tutorials/beginner/data_loading_tutorial.html