How to create batch training data for a matrix?

Currently I have a training set with inputs X consisting of N d-dimensional data points, and outputs Y being an N-by-N matrix (which measures some pairwise distances among these points). I am trying to create batch training data for a neural network, and I know that for the input X I can do the following to generate batch inputs of size n:

batch = DataLoader(X, batch_size=n, shuffle=True, drop_last=True)

Does anyone know how to generate the batch output matrix of size n-by-n for output Y while keeping the indices for X and Y consistent?

Thank you!

You could create a Dataset and write the code to return a single sample.
Wrapping the Dataset into a DataLoader will automatically batch your data.
Have a look at the Data Loading Tutorial for more information.

Basically, you have to define how a single sample is extracted in the __getitem__ method.
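
As a minimal sketch of that idea (not necessarily what the tutorial does): one option is to have __getitem__ return the sample's index along with the sample, and then use the batched indices to pick the matching rows and columns of Y. The class name IndexedDataset, the toy shapes, and the use of torch.cdist to build an example Y are all assumptions for illustration only.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class IndexedDataset(Dataset):
    """Hypothetical helper: returns each sample together with its row index."""

    def __init__(self, X):
        self.X = X

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, i):
        # Return the index too, so the matching entries of Y can be looked up later.
        return self.X[i], i


# Toy data (assumed shapes): N points of dimension d, batch size n.
N, d, n = 10, 3, 4
X = torch.randn(N, d)
Y = torch.cdist(X, X)  # stand-in for the real N-by-N pairwise matrix

loader = DataLoader(IndexedDataset(X), batch_size=n, shuffle=True, drop_last=True)

for x_batch, idx in loader:
    # idx is an integer tensor of shape (n,), collated from the returned indices.
    # Select the same rows and columns of Y to get the n-by-n target block.
    y_batch = Y[idx][:, idx]
    # x_batch has shape (n, d); y_batch has shape (n, n)
```

Because the same idx selects both x_batch and the rows and columns of y_batch, entry (a, b) of y_batch corresponds to the pair (x_batch[a], x_batch[b]), even with shuffle=True.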
