I have CIFAR-10 input data of shape x_train: (50000, 3072) and y_train: (50000,). I want to pack x_train and y_train into a trainloader with a batch size of 100, so that when I iterate over it as follows:
for batch_idx, (inputs, targets) in enumerate(trainloader):
    print(inputs.shape)
    print(targets.shape)
Output:
(100,32,32,3)
(100,)
I have spent a lot of time on this without success. Can someone help me with it?
You could write a basic custom Dataset and use it with a DataLoader.
class DS(torch.utils.data.Dataset):
    def __init__(self, X=None, y=None, mode="train"):
        self.mode = mode
        self.X = X.reshape(-1, 32, 32, 3)  # reshape flat (N, 3072) rows into (N, 32, 32, 3) images
        if mode == "train":
            self.y = y

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, idx):
        if self.mode == "train":
            # .long() for class labels; use .float() instead for regression targets
            return torch.FloatTensor(self.X[idx]), torch.tensor(self.y[idx]).long()
        else:
            return torch.FloatTensor(self.X[idx])

tr_data_setup = DS(X_train, y_train)
trainloader = torch.utils.data.DataLoader(tr_data_setup, batch_size=100, ......)
You could also expand this to perform augmentations on the image if necessary.
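As a self-contained check, here is a minimal runnable sketch of the same approach, using random arrays with the question's shapes in place of the real CIFAR-10 data (only 500 rows to keep it quick):

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class DS(Dataset):
    def __init__(self, X=None, y=None, mode="train"):
        self.mode = mode
        # reshape flat (N, 3072) rows into (N, 32, 32, 3) images
        self.X = X.reshape(-1, 32, 32, 3)
        if mode == "train":
            self.y = y

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, idx):
        if self.mode == "train":
            # scalar label -> 0-dim long tensor, so batches collate to shape (B,)
            return torch.FloatTensor(self.X[idx]), torch.tensor(self.y[idx]).long()
        return torch.FloatTensor(self.X[idx])

# stand-ins for the CIFAR-10 arrays from the question
X_train = np.random.rand(500, 3072).astype(np.float32)
y_train = np.random.randint(0, 10, size=(500,))

trainloader = DataLoader(DS(X_train, y_train), batch_size=100, shuffle=True)
inputs, targets = next(iter(trainloader))
print(inputs.shape)   # torch.Size([100, 32, 32, 3])
print(targets.shape)  # torch.Size([100])
```

Note the label handling: wrapping a Python/NumPy scalar in `torch.LongTensor(...)` creates an *uninitialized* tensor of that size, which is why the conversion goes through `torch.tensor(...).long()` instead.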
Thank you, that solved it.