CUDA OOM message that doesn't make sense

The following message says that, on a GPU with 23.69 GiB of total capacity, PyTorch has reserved only about 610 MiB, yet allocating another 20 MiB fails with OOM. The nvidia-smi log further down, on the other hand, shows the real situation on the device.

PyTorch message from a 24 GB GPU:

OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 23.69 
GiB total capacity; 595.94 MiB already allocated; 2.06 MiB free; 610.00 MiB 
reserved in total by PyTorch) If reserved memory is >> allocated memory try 
setting max_split_size_mb to avoid fragmentation.  See documentation for Memory 
Management and PYTORCH_CUDA_ALLOC_CONF

A similar message from a 12 GB GPU:

OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 10.75 GiB total capacity; 810.93 MiB already 
allocated; 3.62 MiB free; 828.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting 
max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
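
To see where those figures come from, it can help to print PyTorch's allocator statistics next to the device-wide numbers right before the failing allocation. A minimal sketch, assuming a reasonably recent PyTorch (one that provides torch.cuda.mem_get_info); report_gpu_memory is just an illustrative helper name:

    import torch

    def report_gpu_memory(device: int = 0) -> None:
        """Illustrative helper: compare PyTorch's allocator view with the device-wide numbers."""
        mib = 1024 ** 2
        # Memory currently held by live tensors ("already allocated" in the message).
        allocated = torch.cuda.memory_allocated(device)
        # Memory held by PyTorch's caching allocator ("reserved in total by PyTorch").
        reserved = torch.cuda.memory_reserved(device)
        # Device-wide free/total from the CUDA driver, which also counts the CUDA
        # context and any other process using the GPU.
        free, total = torch.cuda.mem_get_info(device)
        print(f"allocated by PyTorch: {allocated / mib:9.1f} MiB")
        print(f"reserved by PyTorch : {reserved / mib:9.1f} MiB")
        print(f"free on device      : {free / mib:9.1f} MiB")
        print(f"total on device     : {total / mib:9.1f} MiB")

    report_gpu_memory(0)

As far as I can tell, "already allocated" and "reserved" describe only this process's caching allocator, while "total capacity" and "free" are device-wide, so memory taken by the CUDA context or by anything else on the card only shows up as a shrinking "free" value.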

Logging with nvidia-smi confirms that GPU memory did indeed fill up (the fb column climbs to roughly 10,991 MB, close to the 10.75 GiB capacity reported above):

#Date       Time        gpu     fb   bar1     sm    mem    enc    dec 
#YYYYMMDD   HH:MM:SS    Idx     MB     MB      %      %      %      % 
 20230511   12:28:15      0    447      6      0      0      0      0 
 20230511   12:28:17      0    477      6      0      0      0      0 
 20230511   12:28:19      0    507      6      0      0      0      0 
 20230511   12:28:21      0    555      6      0      0      0      0 
 20230511   12:28:23      0    625      6      0      0      0      0 
 20230511   12:28:25      0    861      6      0      0      0      0 
 20230511   12:28:27      0   9961      6      2      0      0      0 
 20230511   12:28:29      0   9961      6      0      0      0      0 
 20230511   12:28:31      0   9961      6      0      0      0      0 
 20230511   12:28:33      0   9961      6      0      0      0      0 
 20230511   12:28:35      0   9961      6      0      0      0      0 
 20230511   12:28:37      0   9961      6      0      0      0      0 
 20230511   12:28:39      0   9961      6      0      0      0      0 
 20230511   12:28:41      0   9961      6      0      0      0      0 
 20230511   12:28:43      0   9961      6      0      0      0      0 
 20230511   12:28:45      0   9961      6      0      0      0      0 
 20230511   12:28:47      0   9961      6      0      0      0      0 
 20230511   12:28:49      0   9961      6      0      0      0      0 
 20230511   12:28:51      0   9961      6      0      0      0      0 
 20230511   12:28:53      0   9961      6      0      0      0      0 
 20230511   12:28:55      0   9961      6      0      0      0      0 
 20230511   12:28:57      0   9961      6      0      0      0      0 
 20230511   12:28:59      0   9961      6      0      0      0      0 
 20230511   12:29:01      0   9961      6      0      0      0      0 
 20230511   12:29:03      0   9961      6      0      0      0      0 
 20230511   12:29:05      0   9961      6      0      0      0      0 
 20230511   12:29:07      0  10541      6      3      0      0      0 
 20230511   12:29:09      0  10673      6      0      0      0      0 
 20230511   12:29:11      0  10771      6      0      0      0      0 
 20230511   12:29:13      0  10991      6      1      0      0      0 
 20230511   12:29:15      0  10991      6      0      0      0      0 
 20230511   12:29:17      0  10649      6      1      0      0      0 
 20230511   12:29:19      0  10649      6      0      0      0      0

Does nvidia-smi show other running processes that are using memory (e.g., display drivers, other instances of PyTorch)?

I run it on a SLURM cluster, and the training script is the only process that can be using GPU memory on that node.
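
For completeness, the per-process view can also be pulled programmatically via NVML, which reports the same framebuffer usage as nvidia-smi. A rough sketch, assuming the nvidia-ml-py package (imported as pynvml) is available on the node:

    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)

    # Device-wide framebuffer usage, the same quantity as the "fb" column above.
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"used {mem.used / 1024**2:.0f} MiB of {mem.total / 1024**2:.0f} MiB")

    # All compute processes currently holding memory on this GPU.
    for proc in pynvml.nvmlDeviceGetComputeRunningProcesses(handle):
        used = proc.usedGpuMemory  # may be None if the driver withholds it
        used_str = "n/a" if used is None else f"{used / 1024**2:.0f} MiB"
        print(f"pid {proc.pid}: {used_str}")

    pynvml.nvmlShutdown()

If the training script really is alone on the card, its pid should be the only entry, and its usedGpuMemory should roughly match the fb figure in the nvidia-smi log above.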