I have large images that I have chopped into 512x512 patches, and I have the following folders for fine-tuning purposes.
The number of original larger images and the number of 512x512 patches are shown for each folder:
-fold 1
--train
---pos_label 42, 32882
---neg_label 225, 134889
--val
---pos_label 14, 8799
---neg_label 75, 42553
--test
---pos_label 14, 11051
---neg_label 74, 45230
When I print the length of the dataloaders, e.g., train, I get 328, which is in line with (134889 + 32882) / 512 = 327.67.
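In other words, the dataloader length is the number of batches, i.e. ceil(total_patches / batch_size), assuming batch_size = 512:

```python
import math

# Number of batches the train loader should report with batch_size = 512
n_patches = 134889 + 32882   # neg + pos training patches
batch_size = 512
n_batches = math.ceil(n_patches / batch_size)
print(n_batches)  # 328
```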
However, my problem is that after I train this model, I want to access each filename along with its prediction, and I am not sure exactly how to do that. I need to do offline majority voting on the test predictions over all the patches that belong to each larger image. The reason is that during training, I used the larger image's label as the label for all of its patches.
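For concreteness, once I have (filename, predicted_label) pairs, the offline majority voting I have in mind would look roughly like this (a sketch only; it assumes my patch naming convention, where everything before the first `_` identifies the larger image):

```python
import os
from collections import Counter, defaultdict

def majority_vote(patch_preds):
    """patch_preds: iterable of (patch_path, predicted_label) pairs.
    Groups patches by the larger image they came from and returns
    {large_image_name: majority-voted label}."""
    votes = defaultdict(list)
    for path, pred in patch_preds:
        patch_name = os.path.basename(path)          # e.g. 'img17_003.png'
        large_image_name = patch_name.split('_')[0]  # e.g. 'img17'
        votes[large_image_name].append(pred)
    return {img: Counter(preds).most_common(1)[0][0]
            for img, preds in votes.items()}
```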
Could you please guide me on how to access the filenames along with their predictions and labels?
I think I need something like sample_fname, label = dataloaders_dict['test'].dataset.samples[i]; however, this doesn't give me all 45230 + 11051 = 56281 filenames. The other main problem is that test_outputs = saved_model_ft(test_inputs) runs on a whole batch, not on a single file. How do you suggest I get all the filenames in the test set along with their predictions from the saved trained model? Here is my current test loop:
with torch.no_grad():
    test_running_loss = 0.0
    test_running_corrects = 0
    for i, (inputs, labels) in enumerate(dataloaders_dict['test']):
        print('i: ', i)
        sample_fname, label = dataloaders_dict['test'].dataset.samples[i]
        print(len(sample_fname))
        print(sample_fname)
        patch_name = sample_fname.split('/')[-1]
        large_image_name = patch_name.split('_')[0]
        print('file name {} and label {}'.format(sample_fname, label))
        test_inputs = inputs.to(device)
        test_labels = labels.to(device)
        test_outputs = saved_model_ft(test_inputs)
        _, test_preds = torch.max(test_outputs, 1)
        test_running_corrects += torch.sum(test_preds == test_labels.data)
    test_acc = test_running_corrects / len(dataloaders_dict['test'].dataset)
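My current guess about what is going wrong: dataset.samples is indexed per sample, while i in my loop is a batch index. If the test loader were created with shuffle=False (mine currently uses shuffle=True, so it would have to be recreated), batch i would correspond to the sample slice [i * batch_size, i * batch_size + len(inputs)), and the paths could be recovered with something like this (a sketch, not verified against my full pipeline):

```python
def batch_paths(samples, batch_index, batch_size, batch_len):
    """Return the file paths for batch `batch_index`, assuming the
    DataLoader iterates the dataset in order (shuffle=False).
    `samples` is dataset.samples: a list of (path, class_index) tuples."""
    start = batch_index * batch_size
    return [path for path, _ in samples[start:start + batch_len]]

# Inside the test loop this would be used as, e.g.:
# paths = batch_paths(dataloaders_dict['test'].dataset.samples, i,
#                     batch_size, len(inputs))
```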
Here’s how I create the datasets and dataloaders:
# Create training and validation datasets
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val', 'test']}
# Create training and validation dataloaders
print('batch size: ', batch_size)
dataloaders_dict = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True, num_workers=4) for x in ['train', 'val', 'test']}
P.S.: I do have access to all of the filenames and their labels; the challenge is getting their corresponding predictions from the trained model.
sample_fnames_labels = dataloaders_dict['test'].dataset.samples