Inconsistency in Network Training

I noticed a weird behavior in PyTorch: a network defined after the DataLoader ends up with a different loss value than the same network defined before the DataLoader. I have fixed the random seed to the same number in both cases by adding the lines below.
import numpy as np
import torch

np.random.seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed(0)
What can be the reason for this?

If you’ve added these lines at the beginning of the script, the order of calls to the pseudo-random number generator might still differ between your two use cases.
E.g. drawing the random shuffle indices in the DataLoader advances the PRNG state, so the following call that initializes the parameters of your model produces different values.
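Here is a minimal sketch of the effect; the torch.randperm call is just a stand-in for whatever the DataLoader (or any other code) draws from the global PRNG between seeding and model creation:

import torch
import torch.nn as nn

torch.manual_seed(0)
model_a = nn.Linear(10, 10)           # model created right after seeding

torch.manual_seed(0)
_ = torch.randperm(100)               # stand-in for shuffle indices drawn before the model
model_b = nn.Linear(10, 10)           # same seed, but the PRNG state has already advanced

print(torch.equal(model_a.weight, model_b.weight))  # False: different initialization

If you want the initialization to be independent of such calls, one option is to call torch.manual_seed again directly before creating the model, or to pass a dedicated torch.Generator to the DataLoader so that the shuffling doesn’t consume the global PRNG.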