I am building a Mask R-CNN model to detect and mask different sections of dried plants in images. The images are quite large. The model trains without running out of memory, but it runs out of memory during evaluation, specifically at the outputs = model(images) inference step. Training and evaluation are in separate functions, and the evaluation function has the torch.no_grad() decorator. The batch size is 1 for both training and evaluation.
I’m not sure why my model would be able to train without running out of memory but fail during evaluation.
I have generally followed the steps here, using the same structure, engine and such.
  File "at025_main.py", line 307, in <module>
    main()
  File "at025_main.py", line 238, in main
    evaluate(model, data_loader_test, device=device)
  File "/home/a.kia5/.conda/envs/at025/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/lustrehome/home/a.kia5/at025/mask_rcnn_pytorch/engine.py", line 93, in evaluate
    outputs = model(images)
  File "/home/a.kia5/.conda/envs/at025/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/a.kia5/.conda/envs/at025/lib/python3.7/site-packages/torchvision/models/detection/generalized_rcnn.py", line 99, in forward
    detections = self.transform.postprocess(detections, images.image_sizes, original_image_sizes)
  File "/home/a.kia5/.conda/envs/at025/lib/python3.7/site-packages/torchvision/models/detection/transform.py", line 233, in postprocess
    masks = paste_masks_in_image(masks, boxes, o_im_s)
  File "/home/a.kia5/.conda/envs/at025/lib/python3.7/site-packages/torchvision/models/detection/roi_heads.py", line 479, in paste_masks_in_image
    ret = torch.stack(res, dim=0)[:, None]
RuntimeError: CUDA out of memory. Tried to allocate 6.84 GiB (GPU 0; 15.78 GiB total capacity; 7.63 GiB already allocated; 6.59 GiB free; 8.00 GiB reserved in total by PyTorch)
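For a sense of scale, the failing call is the torch.stack in paste_masks_in_image, which combines one full-resolution mask per detection into a single tensor. Below is a rough sketch of how large such a tensor could be, assuming float32 masks (4 bytes per element). The detection count and image size are hypothetical, since neither is stated in this post. The helper names are made up for illustration and are not part of torchvision.

```python
def stacked_mask_bytes(num_detections, height, width, bytes_per_element=4):
    """Bytes needed for a dense (num_detections, 1, height, width) tensor.

    bytes_per_element=4 assumes float32 masks.
    """
    return num_detections * height * width * bytes_per_element


def to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


# Hypothetical inputs: 100 detections (torchvision's default
# box_detections_per_img cap) on an assumed 4000 x 4500 pixel image.
size = stacked_mask_bytes(num_detections=100, height=4000, width=4500)
print(f"{to_gib(size):.2f} GiB for the stacked masks")
```

With these assumed values the estimate is in the same range as the 6.84 GiB allocation in the traceback. The individual masks in the list being stacked would also take about the same amount of memory again, which may be where most of the 7.63 GiB already allocated comes from, though I have not verified this.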
Any help would be appreciated.