Hi there,
I’m training a model for a style transfer app.
I’ve used this training code for 2 years now and it works perfectly on a GTX 1080 Ti 12 GB, but unfortunately it does not work on an RTX 3090.
Here is the error:
File "train-monet_giverny.py", line 198, in <module>
train()
File "train-monet_giverny.py", line 112, in train
content_features = VGG(content_batch.add(imagenet_neg_mean))
File "C:\Users\smartTour\tableaux\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\smartTour\Documents\Train-Relook\vgg.py", line 46, in forward
x = layer(x)
File "C:\Users\smartTour\tableaux\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\smartTour\tableaux\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\smartTour\tableaux\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 198.00 MiB (GPU 0; 24.00 GiB total capacity; 3.82 GiB already allocated; 17.67 GiB free; 24.00 GiB allowed; 4.00 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
The batch size is 1 and the image size is 800x600. I managed to train on the GTX 1080 Ti with larger image sizes.
The curious thing is that training uses 8 GB on the GTX 1080 Ti, but PyTorch won’t allocate more than 4 GiB on the RTX 3090.
I’ve tried many versions of Python and CUDA, but it changes nothing. I’m using the latest driver version.
Is there a way to “force” an allocation of 10 GiB?
Any ideas?
“CUDA out of memory” means that your GPU is running out of memory during training. This could be due to the RTX 3090’s architecture differences or the way memory is being managed. You can try the following:
Set the environment variable PYTORCH_CUDA_ALLOC_CONF to control PyTorch’s memory allocation behavior. To reduce fragmentation, you can set the max_split_size_mb configuration option:
import os

# Cap the allocator's split size at 4096 MB
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb=4096'
These lines should be placed at the beginning of your script, before importing PyTorch. This sets the maximum split size to 4096 MB (4 GB); you can adjust this value as needed.
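Putting it together, a minimal sketch of the ordering described above (the 4096 value is just the example, not a recommendation):

import os

# Set the allocator config first, before PyTorch is imported
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb=4096'

import torch

print(os.environ['PYTORCH_CUDA_ALLOC_CONF'])  # confirm the variable is set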
Thanks, AbdusalamBande.
I’ve tried 4096, 1024, 2048, 8192, … and I always get the same error:
Traceback (most recent call last):
File "train-monet_giverny.py", line 200, in <module>
train()
File "train-monet_giverny.py", line 117, in train
generated_features = VGG(generated_batch.add(imagenet_neg_mean))
File "C:\Users\smartTour\tableaux\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\smartTour\Documents\Train-Relook\vgg.py", line 46, in forward
x = layer(x)
File "C:\Users\smartTour\tableaux\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\smartTour\tableaux\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\smartTour\tableaux\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 200.00 MiB (GPU 0; 24.00 GiB total capacity; 3.99 GiB already allocated; 17.27 GiB free; 4.40 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
It seems you are limiting the device memory via torch.cuda.set_per_process_memory_fraction, since the "allowed" stat is shown. It points to 24 GB, but could you remove it nevertheless?
I’m not using torch.cuda.set_per_process_memory_fraction.
Your code does not work. Always the same error: memory limited to 4 GiB… Here is the error:
0.0
4096.0
Traceback (most recent call last):
File "test_memory.py", line 10, in <module>
x = torch.randn(2 * 1024**3, device="cuda")
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB (GPU 0; 24.00 GiB total capacity; 4.00 GiB already allocated; 18.77 GiB free; 4.00 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Your code is using set_per_process_memory_fraction, since the "allowed" keyword wouldn’t be shown otherwise:
x = torch.randn(1024**4, device="cuda")
# OutOfMemoryError: CUDA out of memory. Tried to allocate 4096.00 GiB (GPU 0; 23.69 GiB total capacity; 0 bytes already allocated; 22.01 GiB free; 0 bytes reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
torch.cuda.set_per_process_memory_fraction(1.0)
x = torch.randn(1024**4, device="cuda")
# OutOfMemoryError: CUDA out of memory. Tried to allocate 4096.00 GiB (GPU 0; 23.69 GiB total capacity; 0 bytes already allocated; 22.00 GiB free; 23.69 GiB allowed; 0 bytes reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Of course, you’re right.
I tried adding set_per_process_memory_fraction two days ago, but since it was not working I had commented it out.
I trashed the content of the __pycache__ folder and the "allowed" keyword disappeared, but not the error.
This is really weird. Does it mean you had set it to a specific limit in the past, saw the issue, removed it, and were still unable to allocate more than 4 GB?
After removing the __pycache__ (could you describe what exactly you deleted?), are you now able to allocate more than 4 GB?
Actually, I had tried to play with this limit, but I placed it after some other torch settings and it was not working. I think this setting came too late in my code.
Nevertheless, my training code hits the OOM error without it; it was unable to allocate more than 4 GiB. I had tried torch.cuda.set_per_process_memory_fraction, but not in the right place, so I gave up and commented it out. But when you wrote about it this morning, I put it just after the import, as the first setting, and it works.
I never encountered this error with the GTX 1080 Ti; no need to set torch.cuda.set_per_process_memory_fraction with that card.
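For reference, a minimal sketch of the placement that worked (the fraction value of 1.0 and device index 0 are assumptions for illustration):

import torch

# Set the per-process limit immediately after the import,
# before any other torch.cuda call or setting
torch.cuda.set_per_process_memory_fraction(1.0, device=0)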
In my code I import three Python files with classes and functions:
import vgg
import experimental
import utils
I deleted the three related files: experimental.cpython-38.pyc, vgg.cpython-38.pyc and utils.cpython-38.pyc.
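For anyone wanting to do the same, a small sketch (a hypothetical helper, not from my actual script) that removes every __pycache__ directory under the project so the bytecode is rebuilt on the next run:

import pathlib
import shutil

# Delete all compiled bytecode caches below the current directory
for cache_dir in pathlib.Path(".").rglob("__pycache__"):
    shutil.rmtree(cache_dir)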
Actually my code is crashing again for another reason… I will investigate.
OK, let me know if you are still seeing issues preventing you from allocating more than 4 GB of memory, as this setting should not be "sticky" between runs and I would consider it a bug.
I was also experimenting with it in my setup, but wasn’t able to reproduce the issue.
As I said in a previous post, I had another error, related to a CPU allocator memory error. When I searched for this error, I found that increasing the pagefile size can solve it.
And I discovered that the pagefile size was set to 0 on my computer. I don’t remember why I did that, but letting Windows manage it automatically solved my CPU allocator memory error.
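If someone wants to check this from Python, here is a sketch I would expect to work on Windows (the ctypes structure follows the documented Win32 MEMORYSTATUSEX layout; treat it as an illustration, it is not part of my training code):

import ctypes

class MEMORYSTATUSEX(ctypes.Structure):
    # Field layout per the Win32 MEMORYSTATUSEX documentation
    _fields_ = [
        ("dwLength", ctypes.c_ulong),
        ("dwMemoryLoad", ctypes.c_ulong),
        ("ullTotalPhys", ctypes.c_ulonglong),
        ("ullAvailPhys", ctypes.c_ulonglong),
        ("ullTotalPageFile", ctypes.c_ulonglong),
        ("ullAvailPageFile", ctypes.c_ulonglong),
        ("ullTotalVirtual", ctypes.c_ulonglong),
        ("ullAvailVirtual", ctypes.c_ulonglong),
        ("ullAvailExtendedVirtual", ctypes.c_ulonglong),
    ]

status = MEMORYSTATUSEX()
status.dwLength = ctypes.sizeof(MEMORYSTATUSEX)
ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status))
# ullTotalPageFile is the system commit limit (RAM + pagefile);
# it shrinks noticeably when the pagefile is disabled
print(status.ullTotalPageFile / 1024**3, "GiB commit limit")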
And today I also tried to run my code without the torch.cuda.set_per_process_memory_fraction workaround, and it works now. No 4 GiB limitation, no OOM error.
It seems that I need a pagefile size greater than 0 to avoid this error on the VRAM. I don’t know if there is already a warning about this for PyTorch users, but there could be…
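As a final sanity check (mirroring the allocation test from earlier in the thread), this allocates 8 GiB and prints the allocated amount; it failed while the 4 GiB cap was in effect:

import torch

# 2 * 1024**3 float32 elements = 8 GiB on the GPU
x = torch.randn(2 * 1024**3, device="cuda")
print(torch.cuda.memory_allocated() / 1024**3, "GiB allocated")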