CUDA out of memory error when loading ImageNet dataset

I have a problem loading ImageNet data. I want to extract features from the images and save them in a matrix of shape (number of images, feature size). My GPU is an NVIDIA Quadro RTX 6000 with 24 GB of memory. I get a “CUDA out of memory” error when I run my code.
I tried to fix it by doing these things:

  1. lowering the batch size (from 256 to 4) and the number of workers (from 8 to 2)
  2. setting max_split_size_mb=2
  3. clearing the cache and deleting unused variables

None of them worked. How can I solve this problem? Did I make any mistakes?

How many features do you want to save, and how large is each one? Storing the features alone could already use a large portion of the GPU memory, on top of the model parameters etc.
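To get a feel for the size of the feature matrix, a quick back-of-the-envelope calculation helps. The numbers below are assumptions, since the post does not give them: the ImageNet-1k training split (1,281,167 images), a feature size of 2048, and float32 values. `feature_matrix_bytes` is just a helper name made up for this sketch.

```python
def feature_matrix_bytes(num_images: int, feature_dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for a dense (num_images, feature_dim) matrix."""
    return num_images * feature_dim * bytes_per_value


# Assumed values, not taken from the original post.
num_images = 1_281_167   # ImageNet-1k training split
feature_dim = 2048       # hypothetical feature size
total = feature_matrix_bytes(num_images, feature_dim)  # float32 = 4 bytes per value

print(f"{total / 1024**3:.2f} GiB for the feature matrix alone")
```

Under these assumptions the matrix alone takes a sizeable share of a 24 GB card, so keeping it on the GPU next to the model and the activations can be enough to run out of memory.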
Also, did you .detach() each feature before storing it, or run the forward pass inside a with torch.no_grad() guard, to avoid storing the intermediates?
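A minimal sketch of what I mean, assuming a standard PyTorch DataLoader that yields (images, labels) batches. `extract_features`, the toy model, and the random data are placeholders made up for illustration, not your actual code. The forward pass runs under torch.no_grad(), and each batch of features is moved to the CPU before it is stored, so the full matrix never lives in GPU memory.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def extract_features(model: nn.Module, loader: DataLoader, device: torch.device) -> torch.Tensor:
    """Return a (num_images, feature_size) tensor stored in CPU memory."""
    model.eval().to(device)
    chunks = []
    with torch.no_grad():  # no autograd graph is built, so intermediates are not kept
        for images, _ in loader:
            feats = model(images.to(device))
            # Move each batch off the GPU before storing it.
            chunks.append(feats.flatten(start_dim=1).cpu())
    return torch.cat(chunks, dim=0)


# Stand-ins for illustration only: a tiny model and random "images".
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16))
toy_data = TensorDataset(torch.randn(10, 3, 8, 8), torch.zeros(10, dtype=torch.long))

features = extract_features(toy_model, DataLoader(toy_data, batch_size=4), device)
print(features.shape)  # torch.Size([10, 16])
```

If the features were instead appended to a list while still attached to the graph and still on the GPU, each stored batch would keep its intermediates alive, and lowering the batch size would not help.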