I tried to run the CamVid notebook from the FastAI course (https://course.fast.ai/videos/?lesson=3, https://nbviewer.jupyter.org/github/fastai/course-v3/blob/master/nbs/dl1/lesson3-camvid.ipynb) on my PC.
(specs: Windows 10 Education 64 bit, processor Intel® Core™ i7-7700K CPU @ 4.20GHz, 4200 MHz,
GPU: Nvidia GeForce GTX 1080 Ti 11GB )
It's my first time running any kind of CNN/U-Net, and the error message doesn't really make sense to me: it tries to allocate 538 MiB and fails, although 8.53 GiB is free.
Is it because PyTorch has only reserved 242 MiB?
I checked the other related forum topics, but all of them have “Tried to allocate X GiB and <X GiB is free”.
Reducing the batch size doesn't solve the problem for me either, unlike in the other forum topics.
Has anyone had the same problem?
What was your initial batch size and what is the current one?
Also, did you make sure that your GPU is empty and no other processes use device memory?
My initial batch size was 8, but I tried 4 and 2 and had the same issue with both.
Yes, before running the code I clear the GPU memory with torch.cuda.empty_cache().
Could it be because I'm running it on Windows?
torch.cuda.empty_cache() will only clear the PyTorch memory cache on the device.
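To see the difference between memory held by tensors, memory cached by PyTorch, and memory free on the device, something like the following could help. It is only a rough sketch: the helper names `cuda_memory_report` and `format_mib` are made up for this example, and it assumes a PyTorch version recent enough to provide `torch.cuda.mem_get_info`.

```python
import torch


def format_mib(num_bytes):
    """Format a byte count as MiB with one decimal place."""
    return f"{num_bytes / 1024**2:.1f} MiB"


def cuda_memory_report(device=0):
    """Return memory numbers in bytes for one CUDA device, or None without CUDA."""
    if not torch.cuda.is_available():
        return None
    # Free and total memory as seen by the driver (covers all processes).
    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    return {
        # Memory currently occupied by live tensors of this process.
        "allocated": torch.cuda.memory_allocated(device),
        # Memory held by PyTorch's caching allocator (allocated + cached).
        "reserved": torch.cuda.memory_reserved(device),
        "free": free_bytes,
        "total": total_bytes,
    }


if __name__ == "__main__":
    report = cuda_memory_report()
    if report is None:
        print("No CUDA device available.")
    else:
        for key, value in report.items():
            print(f"{key:>9}: {format_mib(value)}")
        # Releases cached blocks only; tensors that are still alive stay allocated.
        torch.cuda.empty_cache()
        print("reserved after empty_cache():",
              format_mib(torch.cuda.memory_reserved()))
```

If "reserved" drops after `empty_cache()` but "free" stays low, the memory is most likely held by live tensors or by another process.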
I meant you should check via nvidia-smi whether other processes are using the GPU.
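If you would rather do this check from Python than read the nvidia-smi table by hand, a small wrapper along these lines might work. It is a sketch under a few assumptions: nvidia-smi is on the PATH, the function names `parse_compute_apps` and `list_gpu_processes` are invented for this example, and per-process memory may be reported as unavailable on some setups, so non-numeric values are mapped to None.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV row per compute process: pid, memory in MiB, name.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,used_memory,process_name",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse the CSV rows produced by QUERY into a list of dicts."""
    rows = []
    for line in csv_text.splitlines():
        # Split at most twice so commas inside the process name are preserved.
        parts = [part.strip() for part in line.split(",", 2)]
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # skip blank or unexpected lines
        pid, used, name = parts
        rows.append({
            "pid": int(pid),
            # Memory may be non-numeric if the driver does not report it.
            "used_mib": int(used) if used.isdigit() else None,
            "name": name,
        })
    return rows


def list_gpu_processes():
    """Return compute processes on the GPU, or None if nvidia-smi cannot be run."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return None
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    processes = list_gpu_processes()
    if processes is None:
        print("nvidia-smi is not available.")
    else:
        for proc in processes:
            print(proc)
```

An empty list means no compute process is registered on the GPU; anything else besides your own Python process is worth a closer look.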
Also, if a batch size of 1 doesn’t fit on the GPU, you might need to use torch.utils.checkpoint to trade compute for memory.
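As a rough illustration of what that looks like, here is a minimal sketch with a toy convolutional block. This is plain PyTorch and not the fastai U-Net from the notebook; the class name `CheckpointedBlock` and the layer sizes are made up, and the `use_reentrant=False` argument assumes a reasonably recent PyTorch release.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedBlock(nn.Module):
    """Toy conv block whose intermediate activations are recomputed in backward."""

    def __init__(self, channels=8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        # Activations inside self.body are not kept after the forward pass;
        # they are recomputed during backward, costing extra compute.
        return checkpoint(self.body, x, use_reentrant=False)


if __name__ == "__main__":
    block = CheckpointedBlock()
    x = torch.randn(2, 8, 16, 16)
    out = block(x)
    out.sum().backward()  # gradients still reach the parameters
    print(out.shape)
```

The output is the same as calling `block.body(x)` directly; only the memory/compute trade-off during training changes. In a real model you would wrap the larger stages rather than a single small block.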
Although it would be surprising if the FastAI lecture code needed a bigger GPU.
Thanks for the help!
I guess there was a problem with my Anaconda version and the FastAI library together with Windows.
I solved it by buying a new SSD and installing Ubuntu 20.04 on it; it worked on the first try with a batch size of 4.