CUDA out of memory when I store values on CPU

Hi all

I’m new to this and really can’t make sense of what I’m seeing. I’m trying to run inference on a set of images (on the GPU), and I want to aggregate the results for later processing. Even though I move the results to the CPU, appending them to a list seems to cause a CUDA out-of-memory error. What’s going on here?

```
batch_size = 1
dataset = utils.ImageDataset(PATH_TO_DATA, utils.get_transform(train=False))
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=batch_size, shuffle=False, num_workers=2)
reader = utils.DatasetReader(data_loader)

# Set up model

num_classes = 2
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
cpu_device = torch.device("cpu")

model = utils.get_instance_BB_model(num_classes)
model.to(device)
model.load_state_dict(torch.load(PATH_TO_MODEL))
model.eval()
torch.no_grad()
torch.set_num_threads(1)

# Infer on images

all_out = []
for images, paths in reader.serve_images():
    images = list(image.to(device) for image in images)  # Move images to GPU
    outputs = model(images)  # This is the inference
    outputs_cpu = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs]  # Move results back to CPU

    # If I include this line, I get RuntimeError: CUDA out of memory
    all_out.append(outputs_cpu)
```

Hmm, this is weird - the code seems fine to me. Have you tried debugging it, to see exactly what gets moved to the GPU and back to the CPU, and when?
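
One way to do that kind of check is sketched below. It is only a minimal illustration, not a diagnosis of the code above: the `nn.Linear` model is a hypothetical stand-in for the detection model, and `report_cuda_memory` is a helper name made up for this example. One thing that may be worth looking at is whether the stored outputs still carry autograd history. `torch.no_grad()` called as a bare statement only creates a context manager object and does not switch gradient tracking off, and a tensor copied to the CPU with `.to()` stays attached to the graph that produced it, which can keep GPU tensors alive.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the detection model in the question.
model = nn.Linear(4, 2)
model.eval()  # changes layer behaviour (dropout, batch norm); does not disable autograd

x = torch.randn(1, 4)

# A bare call builds a context manager and throws it away,
# so autograd is still recording in the lines that follow.
torch.no_grad()
tracked = model(x).to("cpu")
print(tracked.requires_grad)        # True
print(tracked.grad_fn is not None)  # True: the output still references its graph

# Used as a context manager, no graph is recorded for the forward pass.
with torch.no_grad():
    untracked = model(x).to("cpu")
print(untracked.requires_grad)      # False
print(untracked.grad_fn)            # None


def report_cuda_memory(tag):
    """Print currently allocated CUDA memory in MiB; does nothing without a GPU."""
    if torch.cuda.is_available():
        mib = torch.cuda.memory_allocated() / 2**20
        print(f"{tag}: {mib:.1f} MiB allocated")
```

Calling `report_cuda_memory(...)` before and after the `append` in the loop, and printing `v.requires_grad` for the values in `outputs_cpu`, should show whether memory grows with every iteration and whether the stored tensors are still tied to a graph. If they are, wrapping the loop in `with torch.no_grad():` or calling `.detach()` before `.to(cpu_device)` would be the usual things to try.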