I’m fairly new to PyTorch and I’m trying to extract the features of my training data using a ResNet101 and save them to a dictionary. I will then intend to use this dictionary as a lookup table for another model.
However, I keep running into memory problems after only processing 8 images on a 12GB GPU machine. The for loop is:
splits = ['train2014', 'val2014', 'test2015']
for split in splits:
RES_FILE = os.path.join(D_ROOT, 'BaselineTraining', split, 'baseline_{}_cnn_features.pth'.format(split))
IMAGE_DIR = os.path.join(D_ROOT, 'Images', 'mscoco', split)
images = os.listdir(IMAGE_DIR)
images_path = [os.path.join(IMAGE_DIR, file) for file in images]
res = torch.load(RES_FILE) if os.path.exists(RES_FILE) else {}
for i in progressbar.progressbar(range(len(images))):
#Checkpoint every 2,500 images
if i % 2500 == 0 and i > 0:
torch.save(res, RES_FILE)
image_name, image = images[i], cv2.imread(images_path[i])
if image_name in res:
continue
image = process_image(image)
image = np.expand_dims(image, axis=0)
image = torch.from_numpy(image)
_, H, W, C = list(image.size())
image = image.view(1, C, H, W)
image = image.float()
image = image.cuda()
out = resnet101.forward(image)
#res[image_name] = out.cpu()
print('Successfully processed {}.'.format(split))
print('Successfully processed all images.')
Also, I have noticed that the problem only persists when I uncomment the commented line. Without it, extraction goes smoothly. What is happening here?