I have been dabbling in PyTorch for a while and noticed some weird behaviour, likely caused by a seeding/instantiation issue.
```python
# Model, cost function and optimizer instancing
model = models.CustomNeuralNet().to(device)

if is_forward_pass:
    # Forward Sanity Check
    _forward_X, _forward_y = models.forward_pass(
        loader=train_loader, model=model
    )

my_trainer: trainer.Trainer = trainer.Trainer(
    params=TRAIN_PARAMS,
    model=model,
    device=device,
    wandb_run=wandb_run,
)

curr_fold_best_checkpoint = my_trainer.fit(
    train_loader, valid_loader, fold
)
```
I won’t be able to paste the whole pipeline as it is pretty long, but I would like to understand why this happens.
For example, if I set `is_forward_pass` to False, the trainer just starts training and reports, say, a loss of 0.1 for epoch 1. As a cautious move, I always check whether this particular loss changes when I modify the code (I value reproducibility a lot, even though the cause here is a black box to me). When I switched `is_forward_pass` to True, the loss changed; once I turned it off again, the loss went back to the same value, so I am sure it is this flag. My hunch is that the `DataLoader` or the `Model` is somehow being called an extra time. Has anyone faced this issue?
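For what it's worth, here is a minimal, self-contained sketch (not your pipeline; the model and shapes are made up) of the most common cause of this symptom: any extra forward pass or data draw before training consumes numbers from PyTorch's global RNG (e.g. via dropout or random batch generation), so everything sampled afterwards sees a shifted RNG state and the first-epoch loss differs:

```python
import torch

def run(is_forward_pass: bool) -> float:
    # Same seed and same model init in both runs
    torch.manual_seed(42)
    model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Dropout(0.5))
    if is_forward_pass:
        # Sanity-check forward pass: the random input and the dropout
        # mask both draw from (and advance) the global RNG here.
        model(torch.randn(2, 4))
    # "Training" step: this batch and dropout mask now come from a
    # different RNG state if the sanity check ran first.
    batch = torch.randn(2, 4)
    return model(batch).sum().item()

a = run(is_forward_pass=False)
b = run(is_forward_pass=True)
print(a == b)  # False: the extra forward pass advanced the RNG state
```

If that matches your setup, re-seeding (or capturing and restoring the RNG state with `torch.get_rng_state()` / `torch.set_rng_state()`) right before `my_trainer.fit(...)` should make the loss identical regardless of the flag.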