Increase in RAM when using torchvision's FasterRCNN model, possibly due to a memory leak

Hi Everyone!
I have been using a torchvision detection model and I see that RAM keeps increasing over time until I eventually get the error below:

CUDA error: an illegal memory access was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions

The model is hosted behind an endpoint and loaded onto the GPU only during init() of the FastAPI service, with the number of workers set to 5. During prediction I am using the code below (a rough sketch of the overall service wiring follows after it):

import json
import logging
import os

import torch

logger = logging.getLogger(__name__)


# classmethod on the model-serving class (class definition omitted here)
def load_model(cls):
    # Pick the device once at service startup
    if torch.cuda.is_available():
        logger.info("CUDA Available, Using GPU")
        device = torch.device("cuda:0")
    else:
        logger.info("CUDA NOT Available, Using CPU")
        device = torch.device("cpu")

    try:
        # Load the JSON file and get the model type
        with open(os.path.join(MODEL_DIRECTORY_PATH, MODEL_CONFIG)) as f:
            data = json.load(f)
            model_type = data.get('model_type')

        if model_type not in MODEL_TYPE_DICT:
            logger.error(f"Model type '{model_type}' is not supported.")
            raise ValueError(f"Model type '{model_type}' is not supported.")

        model_ctor = MODEL_TYPE_DICT[model_type]
        labels = data['labels']

        logger.info(f"Loading Faster R-CNN Model, type: {model_type}")
        model = model_ctor(pretrained=False, num_classes=len(labels))

        # Load the trained weights and move the model to the chosen device
        model.load_state_dict(torch.load(os.path.join(MODEL_DIRECTORY_PATH, MODEL_NAME), map_location=device))

        model = model.to(device)
        model.eval()
        logger.info("Loaded Faster R-CNN Model")
        # Return what the prediction path needs (the real code may store these on cls instead)
        return model, device, labels
    except Exception:
        logger.exception("Failed to load Faster R-CNN Model")
        raise

from PIL import Image


# transform_image and process_prediction are helpers defined elsewhere in the service
def perform_prediction(img: Image, model, device, input_labels):
    logger.info("Started performing prediction")
    width, height = img.size
    img_t = transform_image(img, device)
    with torch.no_grad():
        prediction = model(img_t)[0]
    boxes, labels, scores = process_prediction(prediction)
    return boxes, labels, scores
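
In case it helps, the overall service wiring looks roughly like the sketch below. It is simplified: ModelService stands in for the class that owns load_model, the endpoint path and request handling differ, and the response is shown as if process_prediction returned plain Python lists.

import io

from fastapi import FastAPI, UploadFile
from PIL import Image

app = FastAPI()


@app.on_event("startup")
def init():
    # Each of the 5 worker processes loads its own copy of the model onto the GPU once
    app.state.model, app.state.device, app.state.labels = ModelService.load_model()


@app.post("/predict")
async def predict(file: UploadFile):
    # Decode the uploaded image and run it through the model loaded at startup
    img = Image.open(io.BytesIO(await file.read())).convert("RGB")
    boxes, labels, scores = perform_prediction(
        img, app.state.model, app.state.device, app.state.labels
    )
    return {"boxes": boxes, "labels": labels, "scores": scores}

After startup the model is only ever touched through perform_prediction.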

Pretty much the standard way of doing things.
I'm using torch = "2.0.1"
torchvision = "0.15.2"

I have seen people report this earlier as well, but I didn't get a clear picture of how to go about solving the problem. Can someone who has faced this before help me out with their approach?