Get the whole dataset from a DataLoader

Greetings!

In order to print the initial loss of my network (before training), I want to feed the entire data set into my network like this:

    loss_train = np.zeros(epochs+1)
    accu_train = np.zeros(epochs+1)
    
    # initial loss 
    output = model(batches_train.dataset.data)   # batches_train is dataloader
    targets = batches_train.dataset.targets
    loss_train[0] = criterion(output, targets).detach()

The problem is that this returns the raw dataset, without the transformations I specified when creating the dataset and dataloader. For example, if I specified transforms.ToTensor(), my code complains that model() does not accept np.arrays as input.

Is there a cheap way to fix this? I could easily just iterate over all the batches, but I was wondering whether one can do without a for-loop.

Regards!

One way would be to set the batch_size to the entire dataset length and create the transformed batch via:

    input = next(iter(loader))
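
A minimal sketch of this idea, assuming a map-style dataset that applies its transform in __getitem__. ArrayDataset, the random data, and the linear model are hypothetical stand-ins, not taken from the original code. Because the batch is drawn through the loader, the transform is applied to every sample:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ArrayDataset(Dataset):
    """Hypothetical dataset: raw NumPy arrays plus an optional transform."""

    def __init__(self, data, targets, transform=None):
        self.data = data
        self.targets = targets
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        x = self.data[idx]
        if self.transform is not None:
            x = self.transform(x)  # applied per sample, on access
        return x, int(self.targets[idx])


# Synthetic stand-in data: 100 samples, 8 features, 3 classes.
raw = np.random.rand(100, 8).astype(np.float32)
labels = np.random.randint(0, 3, size=100)
dataset = ArrayDataset(raw, labels, transform=torch.from_numpy)

# One batch that covers the whole dataset.
full_loader = DataLoader(dataset, batch_size=len(dataset), shuffle=False)
inputs, targets = next(iter(full_loader))  # tensors, already transformed

model = torch.nn.Linear(8, 3)  # placeholder model
criterion = torch.nn.CrossEntropyLoss()

with torch.no_grad():  # no gradients needed for a one-off evaluation
    initial_loss = criterion(model(inputs), targets).item()
```

Note that dataset.data is still a NumPy array here; only the samples that pass through the loader are transformed. This approach also assumes the whole dataset fits in memory as a single batch.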

Right, and then switch the batch_size back to normal for training? What is the usual way to do this? Do people just iterate over the batches first? I think knowing the initial loss is important.

Edit: trying to assign a new batch size via batches_train.batch_size runs into this error:

    ValueError: batch_size attribute should not be set after DataLoader is initialized

Creating a new DataLoader is cheap if you are loading the data lazily, so you might want to create a separate one with this setup.
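
As a rough sketch of that setup, assuming the criterion uses its default mean reduction: a second loader handles the one-off evaluation while the training loader keeps its batch size. The helper dataset_loss, the synthetic data, and the linear model are hypothetical. Weighting each batch loss by its size makes the result match the full-dataset mean even when the last batch is smaller, and it avoids holding the whole dataset in memory at once.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def dataset_loss(model, loader, criterion):
    """Average per-sample loss over everything the loader yields."""
    was_training = model.training
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for inputs, targets in loader:
            batch = targets.size(0)
            # criterion returns the batch mean, so scale by the batch size
            total += criterion(model(inputs), targets).item() * batch
            count += batch
    model.train(was_training)  # restore the previous mode
    return total / count


# Synthetic stand-in data: 100 samples, 8 features, 3 classes.
dataset = TensorDataset(torch.randn(100, 8), torch.randint(0, 3, (100,)))

train_loader = DataLoader(dataset, batch_size=16, shuffle=True)   # for training
eval_loader = DataLoader(dataset, batch_size=64, shuffle=False)   # for evaluation

model = torch.nn.Linear(8, 3)  # placeholder model
criterion = torch.nn.CrossEntropyLoss()

initial_loss = dataset_loss(model, eval_loader, criterion)
```

Setting the evaluation batch_size to len(dataset) reduces the loop to a single iteration, which is the same as the full-batch approach suggested earlier in the thread.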
