I am using a custom dataset class and a DataLoader to load my data.
train_loader = torch.utils.data.DataLoader(
    dataset_train,
    sampler=train_sampler,
    batch_size=args.batch_size_per_gpu,
    num_workers=args.num_workers,
    pin_memory=True,
)
print(f"Data loaded with {len(dataset_train)} train imgs.")
The length of the dataset is 85 images.
print("for training model,size of train loader", len(train_loader.dataset))
But when I iterate through the data loader, it only produces 2 batches, of 16 and 6 images (batch size 16), so only 22 images in total. I don't know what happens to the rest of the images. I am using the eval_linear script from the DINO repository (https://github.com/facebookresearch/dino/blob/main/eval_linear.py).
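One thing that may be worth checking here is the sampler. The unmodified eval_linear.py appears to build its sampler with torch.utils.data.distributed.DistributedSampler, and to my understanding that sampler hands each process only ceil(dataset_len / world_size) indices when drop_last is False. Below is a minimal plain-Python sketch of that arithmetic. The helper names and the process counts tried are assumptions for illustration; the post does not say how many processes are running.

```python
import math


def samples_per_rank(dataset_len, world_size):
    # Assumed DistributedSampler behaviour with drop_last=False:
    # every rank receives ceil(dataset_len / world_size) indices.
    return math.ceil(dataset_len / world_size)


def batch_sizes(num_samples, batch_size):
    # Sizes of the batches a loader yields when the last partial batch is kept.
    full, rest = divmod(num_samples, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


dataset_len = 85
batch_size = 16
for world_size in (1, 2, 4):  # hypothetical process counts
    n = samples_per_rank(dataset_len, world_size)
    print(world_size, n, batch_sizes(n, batch_size))
# 1 85 [16, 16, 16, 16, 16, 5]
# 2 43 [16, 16, 11]
# 4 22 [16, 6]
```

Under these assumptions, a world size of 4 would give exactly the 16 + 6 = 22 images seen per process, so printing len(train_sampler) and the world size in the real script might confirm or rule this out.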
I also had to modify that code for my custom dataset, so it now looks like this:
def train(model, linear_classifier, optimizer, train_loader, epoch, n, avgpool):
    linear_classifier.train()
    metric_logger = utils.MetricLogger(delimiter=" ")
    metric_logger.add_meter('lr', utils.SmoothedValue(window_size=1, fmt='{value:.6f}'))
    header = 'Epoch: [{}]'.format(epoch)
    all_labels = []
    for it, sample in enumerate(metric_logger.log_every(train_loader, 10)):
        # print('type of sample is ...', sample)
        images = sample['image']
        # extract label
        index = sample['label']
        all_labels.append(index)
        print('label is ...', index)
        # move to gpu
        images = images.cuda(non_blocking=True)
        # index is the target/label value
        index = index.cuda(non_blocking=True)
Interestingly, the code works perfectly fine with torch 1.7.0 and torchmetrics 0.8.2, but because of a GPU incompatibility I had to upgrade both packages, and that is when this problem appeared.
I even tried a batch size of 85, but it still only iterates through 22 images. The length of the dataset in the data loader just before iterating is 85 images, so I don't know why the rest of the images are not used for training and why the batches are not divided accordingly. There should be 5 batches of 16 images and a last batch of 5 images, so 6 batches in total.
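A quick way to see where the images go missing is to compare the dataset length, the sampler length, and the number of batches side by side. For a map-style dataset, len(train_loader) is derived from the sampler length, not from the dataset length. The sketch below assumes the loader exposes .dataset, .sampler and .batch_size, as torch.utils.data.DataLoader does. FakeLoader is a hypothetical stand-in so the example runs without torch, and the 22-element sampler simply mirrors the numbers observed above.

```python
import math


def describe_loader(loader):
    # Collect the three lengths that determine how many images are iterated.
    return {
        "dataset": len(loader.dataset),
        "sampler": len(loader.sampler),
        "batch_size": loader.batch_size,
        "batches": len(loader),
    }


class FakeLoader:
    """Hypothetical stand-in for a DataLoader, used only for this demo."""

    def __init__(self, dataset, sampler, batch_size):
        self.dataset = dataset
        self.sampler = sampler
        self.batch_size = batch_size

    def __len__(self):
        # Batch count follows the sampler length (last partial batch kept).
        return math.ceil(len(self.sampler) / self.batch_size)


# 85 items in the dataset, but a sampler that only yields 22 indices.
info = describe_loader(FakeLoader(range(85), range(22), 16))
print(info)
# {'dataset': 85, 'sampler': 22, 'batch_size': 16, 'batches': 2}
```

In the real script the equivalent call would be describe_loader(train_loader). If the sampler entry is 22 while the dataset entry is 85, the sampler rather than the batching is limiting what gets iterated.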
I also tried changing the print_freq of the metric logger, but that did not help either.
Can anyone provide any suggestion?