RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 7.43 GiB total capacity; 6.71 GiB already allocated; 6.81 MiB free; 6.72 GiB reserved in total by PyTorch)

Unfortunately, it’s hard to tell what the issue is without knowing what happens inside the train() function. Could you give some more details about where exactly the OOM is raised (e.g., is it inside the training loop?) and whether it fails on a specific allocation (e.g., copying a batch to the device, moving the model to the device, etc.)?
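In the meantime, one way to narrow it down is to log the allocated GPU memory around each suspect step in train(). Here's a minimal sketch; log_mem is just a hypothetical helper I'm naming for illustration, and the commented-out call sites are assumptions about what your train() might look like:

```python
import torch

def log_mem(tag: str) -> int:
    """Print and return the bytes currently allocated on GPU 0 (0 if CUDA is unavailable)."""
    allocated = torch.cuda.memory_allocated(0) if torch.cuda.is_available() else 0
    print(f"{tag}: {allocated / 1024**2:.1f} MiB allocated")
    return allocated

# Sprinkle calls like these around the steps you suspect, e.g.:
#   log_mem("before model.to(device)")
#   model.to(device)
#   log_mem("after model.to(device)")
#   log_mem("before forward")
#   output = model(data)
#   log_mem("after forward")
```

The tag whose printout jumps right before the crash points at the allocation that pushes you over the limit. torch.cuda.memory_summary() also gives a more detailed breakdown if you want to inspect the cached vs. allocated split from the error message.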