Running two PyTorch models on a single GPU?

Hi, I have successfully modified pix2pixHD to run in near real time (8 FPS). I am now trying to load two models and run inference on both at the same time, but running both results in a CUDA out-of-memory error. Is there any way to tell PyTorch to run them in parallel (ideally within the same loop)?
I am running on an RTX 2080 Ti.

B

For inference, I usually set requires_grad = False on every parameter. I have not measured whether that reduces the memory requirement, but you can try it and see if anything changes. I assume it may need less memory, since PyTorch then does not have to keep track of gradients for each parameter, and so does not need to keep the intermediate results that backpropagation would use.

for param in model.parameters():
    param.requires_grad = False
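
A related option, not part of the suggestion above, is to wrap the forward passes in torch.no_grad(), which stops autograd from recording the operations at all. The sketch below is only an illustration: model_a and model_b are tiny hypothetical stand-ins, not the pix2pixHD generators, and the frame size is arbitrary. It shows both models being called inside the same function, as they would be inside one loop iteration.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def make_toy_model():
    # Hypothetical placeholder network; swap in the real generator here.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv2d(8, 3, kernel_size=3, padding=1),
    )

model_a = make_toy_model().to(device).eval()
model_b = make_toy_model().to(device).eval()

def run_both(frame):
    """Run both models on one frame without building an autograd graph."""
    with torch.no_grad():
        out_a = model_a(frame)
        out_b = model_b(frame)
    return out_a, out_b

# One fake 64x64 RGB frame with batch size 1.
frame = torch.rand(1, 3, 64, 64, device=device)
out_a, out_b = run_both(frame)
```

Whether this is enough to fit two full-size models on one card depends on how much of the memory is taken by the weights themselves, which no_grad does not shrink.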

In addition, you can reduce the batch size, especially since you are only running inference.
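
If several inputs are being processed at once, one way to reduce the effective batch size without changing the results is to split the batch and run the pieces one after another. This is a rough sketch under assumptions: the linear layer is a hypothetical stand-in for the real model, and infer_in_chunks and chunk_size are names made up for this example.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real model.
model = nn.Linear(16, 4).eval()

def infer_in_chunks(model, inputs, chunk_size):
    """Run inference on `inputs` in slices of at most `chunk_size` rows."""
    outputs = []
    with torch.no_grad():
        # torch.split yields consecutive slices along dim 0; the last
        # slice may be smaller than chunk_size.
        for chunk in torch.split(inputs, chunk_size, dim=0):
            outputs.append(model(chunk))
    return torch.cat(outputs, dim=0)

inputs = torch.rand(10, 16)
result = infer_in_chunks(model, inputs, chunk_size=3)
```

Smaller chunks lower the peak activation memory at the cost of more forward passes, so for a real-time loop that already feeds one frame at a time this will not help.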