Model Runs Slowly Unless I Restart Kernel

I’m pretty new to PyTorch, so please excuse my lack of knowledge on this subject. I’m running a PyTorch model inside a Jupyter kernel. For some reason, if I try to run the model a second time, it is extremely slow. However, if I restart the kernel, it runs quickly again. Why does this happen? Also, is there a way to get the model to run quickly without having to restart the kernel after every run?

I’ve never seen such behavior and would recommend profiling the workload, e.g. with the native PyTorch profiler or Nsight Systems, to check where the slowdown is coming from.
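As a minimal sketch of using the built-in profiler (here `run_workload` is a placeholder for your actual model call, not part of your code):

```python
import torch
from torch.profiler import profile, ProfilerActivity

def run_workload():
    # Placeholder: substitute your real model/sampling call here.
    x = torch.randn(256, 256)
    return x @ x

# Profile CUDA kernels too if a GPU is available.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities) as prof:
    run_workload()

# Print the ops sorted by total CPU time to see where time is spent.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```

Comparing this table between the first (fast) and second (slow) run should show which operators account for the difference.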

Thanks for the suggestion. I took a look at the PyTorch profiler, but I have no idea how to get it working with the model I’m using. I’m working with OpenAI’s Shap-E. Here is the code I’m using to run the model:

import torch

from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

xm = load_model('transmitter', device=device)
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))

batch_size = 4
guidance_scale = 15.0
prompt = "a shark"

latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=guidance_scale,
    model_kwargs=dict(texts=[prompt] * batch_size),
)

The latents = sample_latents(...) call is the line that executes extremely slowly.

Would you mind helping me profile this?

Try separating out all the lines before batch_size and do not run them more than once. I assume the heavy models are being loaded into memory a second time, and sampling starts before Python's garbage collector frees the previously loaded models.
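As a sketch of that idea (release_gpu_memory is a hypothetical helper, not part of Shap-E or PyTorch): drop the old references and return cached GPU memory to the allocator before sampling again in the same kernel.

```python
import gc
import torch

def release_gpu_memory(*names):
    """Delete the given global variables (if present), run the garbage
    collector, and release cached GPU memory back to the allocator."""
    g = globals()
    for name in names:
        g.pop(name, None)  # drop the reference if it exists
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

# Before a second run in the same kernel, e.g.:
# release_gpu_memory('latents')
```

Keep the model-loading lines in a cell you run only once; between sampling runs, only the outputs (such as latents) need to be released.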