Allocated memory grows with every batch during training

Hello! I am using a custom DataLoader (see image below for reference) designed to access the dataset in a “sliding window” manner. The maximum batch size for training is set to 32.

My custom DataLoader with a sliding window:
[image: custom sliding-window class]

The forum thread that inspired my DataLoader’s design is linked here: DataLoader for a LSTM Model with a Sliding Window

During training, I print the amount of allocated memory for each batch. I noticed that the allocated memory grows gradually with every batch (e.g. batch 1 has ~132 MB allocated, batch 2 ~137 MB, batch 3 ~142 MB, and so on). The growth continues even after the maximum batch size is reached, though at a slower rate (e.g. batch 32 has ~304 MB allocated, batch 33 ~306 MB, batch 34 ~308 MB, and so on).

Note that I am using an NVIDIA GeForce RTX 3090 GPU with 24 GB of memory. Any advice or help is appreciated.
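For context, the per-batch numbers above come from logging the allocated memory inside the training loop, roughly like this (a simplified sketch; model, criterion, optimizer, and dataloader stand in for my actual objects):

import torch

# Simplified sketch of the per-batch memory logging described above;
# model, criterion, optimizer, and dataloader are placeholders.
for i, (img, msk) in enumerate(dataloader):
    img, msk = img.cuda(), msk.cuda()
    optimizer.zero_grad()
    loss = criterion(model(img), msk)
    loss.backward()
    optimizer.step()
    # memory_allocated() returns bytes; report MB to match the numbers above
    print(f"Batch {i}: {torch.cuda.memory_allocated() / 1024**2:.0f} MB allocated")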

Could you post a minimal and executable code snippet reproducing the issue, please?

Apologies for the late reply; I was having trouble recreating the issue. I realized that the custom class I shared in my original post is actually a Dataset, not a DataLoader. My training results were therefore obtained using only my Dataset class and no DataLoader. I’m assuming the memory issue was caused by not using a DataLoader, so I am now trying to get my custom Dataset to work with the PyTorch DataLoader.

Below is a simple script for testing the Dataset with the PyTorch DataLoader. When I tried it with my custom Dataset, it wouldn’t execute, but with the LSTM sliding window Dataset (see the link in my original post) it runs fine.
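For reference, a Dataset compatible with this test script could be shaped roughly like this (a minimal sketch of the expected class, not my actual custom code):

from torch.utils.data import Dataset as TorchDataset

# Hypothetical minimal Dataset wrapping the numpy arrays; a sketch of
# the expected shape, not the original custom class.
class Dataset(TorchDataset):
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx].astype('float32')   # HWC numpy array
        msk = self.labels[idx].astype('float32')
        if self.transform is not None:
            # albumentations returns a dict; ToTensorV2 converts the image
            # to a CHW tensor and the mask to a tensor
            out = self.transform(image=img, mask=msk)
            img, msk = out['image'], out['mask']
        return img, msk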

Simple Execution Code for Testing (Dataset is the custom Dataset class)

from torch.utils.data import DataLoader
import albumentations as A
from albumentations.pytorch import ToTensorV2     # needed for ToTensorV2 below
import numpy as np

B, H, W    = 100, 32, 32                          # Num samples, Height, Width
imgset     = np.random.rand(B,H,W,3)              # Dummy numpy image set
labset     = np.random.rand(B,H,W,1)              # Dummy numpy label set

transform  = A.Compose([ToTensorV2()])            # Data transformation

trainset   = Dataset(imgset, labset, transform)   # Process data into trainset   
dataloader = DataLoader(trainset, batch_size=1, shuffle=False, pin_memory=True)  # Dataloader

# Print each batch
for i, (img, msk) in enumerate(dataloader):
    print(f"Batch {i}  \tImages: {img.shape}\t  Masks: {msk.shape}")

However, the LSTM sliding window Dataset doesn’t build batches the way I want. Image A below shows the batch outputs produced by the LSTM sliding window Dataset. Image B below shows the batch outputs I am aiming for: similar to Image A, but the sliding window starts with a single image rather than with the first batch-size number of images. In both images, each row is a batch, each element is the index of one image, the maximum batch size is 4, and the dataset size is 10.

Image A: Type of batches made by the LSTM sliding window Dataset:
[0, 1, 2, 3]
[1, 2, 3, 4]
[2, 3, 4, 5]
[3, 4, 5, 6]
[4, 5, 6, 7]
[5, 6, 7, 8]
[6, 7, 8, 9]

Image B: Type of batches I want to produce:
[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[1, 2, 3, 4]
[2, 3, 4, 5]
[3, 4, 5, 6]
[4, 5, 6, 7]
[5, 6, 7, 8]
[6, 7, 8, 9]

I’m currently looking deeper into the PyTorch DataLoader to see whether I can get the sliding window to work as in Image B (I am experimenting with a custom batch sampler, with no success so far). If you or anyone else has any ideas or advice on achieving this, I would greatly appreciate it! Thank you!
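For concreteness, the kind of batch sampler I have in mind would look roughly like this (a sketch of the Image B pattern with illustrative names, not code I have verified against my setup):

from torch.utils.data import DataLoader, Sampler

# Sketch of a batch sampler for the Image B pattern: the window grows
# from one index up to max_batch_size, then slides forward by one.
class GrowingWindowBatchSampler(Sampler):
    def __init__(self, dataset_len, max_batch_size):
        self.dataset_len = dataset_len
        self.max_batch_size = max_batch_size

    def __iter__(self):
        # Growing phase: [0], [0, 1], ..., up to max_batch_size - 1 indices
        for end in range(1, self.max_batch_size):
            yield list(range(end))
        # Sliding phase: full windows of max_batch_size indices, stride 1
        for start in range(self.dataset_len - self.max_batch_size + 1):
            yield list(range(start, start + self.max_batch_size))

    def __len__(self):
        return (self.max_batch_size - 1) + (self.dataset_len - self.max_batch_size + 1)

# batch_sampler is mutually exclusive with batch_size and shuffle
dataloader = DataLoader(trainset, batch_sampler=GrowingWindowBatchSampler(len(trainset), 4), pin_memory=True)

With a dataset size of 10 and a maximum batch size of 4, this would yield exactly the rows shown in Image B.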

A simple Dataset example using a sliding window can be found here, but note that the sequence length would be static and would not grow (up to 4) as in your example Image B.
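Something roughly like this (a minimal sketch with a fixed sequence length; names are illustrative):

from torch.utils.data import Dataset

# Sketch of a fixed-length sliding-window Dataset (stride 1):
# sample i returns the window data[i : i + seq_len].
class SlidingWindowDataset(Dataset):
    def __init__(self, data, seq_len):
        self.data = data
        self.seq_len = seq_len

    def __len__(self):
        return len(self.data) - self.seq_len + 1

    def __getitem__(self, idx):
        return self.data[idx : idx + self.seq_len]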