dataset_a: 60,000 images
dataset_b: 100,000 images
If I use drop_last=True in the dataloader, can I use the whole of dataset_b?
for step, (data_a, data_b) in enumerate(zip(data_loader_a, data_loader_b)):
    ...
Hello,
The drop_last=True parameter drops the last batch when the number of examples in your dataset is not divisible by your batch_size, while drop_last=False keeps that last batch and makes it smaller than your batch_size (see docs). This is unrelated to whether or not you see the whole of dataset_b.
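To make the effect of drop_last concrete, here is a small sketch of the batch-count arithmetic. The helper num_batches is a hypothetical name of mine, not a PyTorch function, and the batch size of 128 is an assumption, since the question does not give one.

```python
def num_batches(num_samples, batch_size, drop_last):
    """Number of batches one pass over the data yields (hypothetical helper)."""
    if drop_last:
        # The incomplete final batch is discarded.
        return num_samples // batch_size
    # The incomplete final batch is kept, so round up.
    return (num_samples + batch_size - 1) // batch_size


batch_size = 128  # assumed value, not taken from the question

print(num_batches(60_000, batch_size, drop_last=True))    # 468
print(num_batches(60_000, batch_size, drop_last=False))   # 469
print(num_batches(100_000, batch_size, drop_last=True))   # 781
print(num_batches(100_000, batch_size, drop_last=False))  # 782
```

Either way, drop_last changes the count by at most one batch per dataloader.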
In your case, you will not use the whole of dataset_b, because your for loop stops as soon as the shortest dataloader in your zip call is exhausted, i.e. data_loader_a. In other words, you will have n iterations (steps), where n = 60000 / batch_size (rounded down with drop_last=True), which means that roughly 40000 examples of dataset_b will not be seen.
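Here is a minimal sketch of that truncation in plain Python. The batches generator is a hypothetical stand-in for a non-shuffled dataloader (it is not the PyTorch DataLoader itself), and the batch size of 128 is again an assumption.

```python
def batches(num_samples, batch_size, drop_last):
    """Yield lists of sample indices, one list per batch (hypothetical stand-in)."""
    for start in range(0, num_samples, batch_size):
        batch = list(range(start, min(start + batch_size, num_samples)))
        if drop_last and len(batch) < batch_size:
            return  # discard the incomplete final batch
        yield batch


def count_seen(len_a, len_b, batch_size, drop_last=True):
    """Zip two batch streams and count how many samples of each are consumed."""
    steps = seen_a = seen_b = 0
    loader_a = batches(len_a, batch_size, drop_last)
    loader_b = batches(len_b, batch_size, drop_last)
    for batch_a, batch_b in zip(loader_a, loader_b):  # stops with the shorter stream
        steps += 1
        seen_a += len(batch_a)
        seen_b += len(batch_b)
    return steps, seen_a, seen_b


steps, seen_a, seen_b = count_seen(60_000, 100_000, batch_size=128)
print(steps, seen_a, seen_b, 100_000 - seen_b)  # 468 59904 59904 40096
```

With these assumed numbers the loop runs 468 steps and leaves 40,096 examples of dataset_b unseen, which matches the "about 40000" estimate.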
Hope it clarifies a bit!