Hi, I have a custom map-style Dataset for my application.
class data(object):
    def __init__(self, train):
        self.train = train
        <some other init>

    def __len__(self):
        if self.train:
            return 640
        else:
            return 160

    def __getitem__(self, batchIdx):
        <do something>
        return something
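In case it helps to reproduce, a map-style dataset only needs `__len__` and `__getitem__`, so here is a minimal stand-in with the same dummy lengths (`DummyData` and its fake samples are placeholders I made up for this post, not my real code):

```python
class DummyData:
    """Hypothetical stand-in for my real `data` class (no real samples)."""

    def __init__(self, train):
        self.train = train

    def __len__(self):
        # same dummy lengths as my real dataset
        return 640 if self.train else 160

    def __getitem__(self, idx):
        # dummy sample: an 8-dim feature vector and a label
        return [float(idx)] * 8, 0

print(len(DummyData(train=True)))   # 640
print(len(DummyData(train=False)))  # 160
```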
As you can see, the values returned from __len__ are dummy numbers I picked for my own experimentation. I am using a batch size of 32.
Here is how I build the training and validation dataloaders inside my LightningModule class:
def prepare_data(self):
    self.train_dset = data(train=True)
    self.val_dset = data(train=False)

def _build_dataloader(self, dset, mode):
    return DataLoader(
        dset,
        batch_size=32,
        drop_last=mode == "train",
        collate_fn=collate_fn,
        pin_memory=True,
        num_workers=8,
        prefetch_factor=1,
    )

def train_dataloader(self):
    return self._build_dataloader(self.train_dset, mode="train")

def val_dataloader(self):
    return self._build_dataloader(self.val_dset, mode="val")
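To sanity-check my expectations, here is the quick throwaway helper I used to count batches (`expected_batches` is just my own function, not a library API):

```python
import math

def expected_batches(dataset_len, batch_size, drop_last):
    # drop_last=True discards the final partial batch; otherwise it is kept
    if drop_last:
        return dataset_len // batch_size
    return math.ceil(dataset_len / batch_size)

print(expected_batches(640, 32, drop_last=True))   # training: 20
print(expected_batches(160, 32, drop_last=False))  # validation: 5
```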
I am expecting 20 batches for training (a __len__ of 640 with batch size 32) and 5 for validation (a __len__ of 160 with batch size 32). But during training, it prints:
Epoch 0: 100%|██████████████| 25/25
Validating: 100%|██████████████| 5/5
The validation count is correct, but I can't see where the 25 training batches come from.
Am I missing something?
Thank you!!