For GAN training, people seem to do it like this:
import torch

trainset = GanLoader(imgs)
sampler = torch.utils.data.RandomSampler(trainset, replacement=True, num_samples=config['num_samples'])
train_loader = torch.utils.data.DataLoader(trainset, num_workers=config['num_workers'],
                                           sampler=sampler)
For example, say I saved a checkpoint at step 10k and exited the program. When resuming the training loop, can I set the DataLoader to step 10k so that it yields exactly the same samples as it would have without interrupting execution?
No, you most likely wouldn't get the same DataLoader state if you haven't used e.g. epoch seeds before. In your current code snippet you are using a RandomSampler without passing a generator, so a new one will be initialized, as seen here.
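A minimal sketch of the alternative, passing an explicitly seeded generator to RandomSampler so its draws are repeatable (the toy dataset and seed value here are assumptions, not from the original snippet):

```python
import torch
from torch.utils.data import RandomSampler, TensorDataset

# Hypothetical toy stand-in for the GAN image dataset.
dataset = TensorDataset(torch.arange(10).float())

# Pass an explicitly seeded generator instead of letting the
# sampler initialize a fresh one on its own.
g = torch.Generator()
g.manual_seed(0)
sampler = RandomSampler(dataset, replacement=True, num_samples=20, generator=g)

run1 = list(sampler)   # draws 20 indices, advancing g

g.manual_seed(0)       # re-seed the same generator object
run2 = list(sampler)   # yields the identical index sequence

assert run1 == run2
```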
Well, this seed should depend on the global seed. Can I skip the first n steps in the generator?
Maybe I should save the generator/dataloader at the checkpoint?
If you’ve seeded your code before the beginning of the training, this might work.
If not, your new process would use a new random seed and I don’t think your approach would work.
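One way to make the checkpointing idea concrete is to snapshot the sampler's generator with `torch.Generator.get_state()` and restore it with `set_state()` at resume time. A sketch at an epoch boundary, with a toy dataset standing in for the real one (names and sizes are assumptions):

```python
import torch
from torch.utils.data import RandomSampler, TensorDataset

dataset = TensorDataset(torch.arange(10).float())  # toy stand-in

g = torch.Generator()
g.manual_seed(0)
sampler = RandomSampler(dataset, replacement=True, num_samples=20, generator=g)

epoch1 = list(sampler)   # advances g
state = g.get_state()    # snapshot; store this tensor in the checkpoint
epoch2 = list(sampler)   # what an uninterrupted run would sample next

# "Restart": restore the snapshot, and the next epoch's indices match exactly.
g.set_state(state)
epoch2_resumed = list(sampler)
assert epoch2_resumed == epoch2
```

Note this resumes sampling from an epoch boundary; it does not by itself fast-forward into the middle of an epoch.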
So, I tested it for a bit:
from torch.utils.data import DataLoader, Dataset, RandomSampler

class DSet(Dataset):
    def __init__(self, num=10):
        self.data = range(num)
    def __len__(self):
        return len(self.data)
    def __getitem__(self, index):
        return self.data[index]

trainset = DSet()
sampler = RandomSampler(trainset, replacement=True, num_samples=20)
train_loader = DataLoader(trainset, num_workers=0, sampler=sampler, batch_size=2)

for i in train_loader:
    print(i)
Setting a manual seed before creating the DataLoader makes it reproducible, but how can I resume training from the exact step?
Using continue inside the loop would still read all the data from disk.
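One way to avoid re-reading the data is to skip only the sampler *indices*, not the samples themselves: materialize the seeded index sequence (which is cheap and loads nothing), slice off the batches already trained, and pass the remainder to the DataLoader, which accepts a plain iterable of indices as its sampler argument. A sketch under those assumptions; `resume_step` and the toy dataset are hypothetical:

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

dataset = TensorDataset(torch.arange(10).float())  # toy stand-in

g = torch.Generator()
g.manual_seed(0)
# Materialize the epoch's full index sequence -- no data is loaded here.
all_indices = list(RandomSampler(dataset, replacement=True,
                                 num_samples=20, generator=g))

batch_size = 2
resume_step = 3  # hypothetical: batches already consumed before the restart

# Drop only the already-used indices, then feed the rest to the loader.
remaining = all_indices[resume_step * batch_size:]
loader = DataLoader(dataset, sampler=remaining, batch_size=batch_size)

resumed = [b[0].tolist() for b in loader]
```

The resumed batches are then identical to what an uninterrupted run would have produced from that step onward, without touching the first `resume_step` batches on disk.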