CPU inference - RAM increases every iteration

The model I’m running causes memory to increase with every iteration.
To load it I do the following:

def _load_model(model_path):
    model = ModelDef(num_classes=35)
    model.load_state_dict(torch.load(model_path, map_location="cpu"), strict=False)
    return model

To run it I do:

    with torch.no_grad():
        input_1_torch = torch.from_numpy(np.float32(input_1)).transpose(1, 2)
        res = _model(input_1_torch)
        output = res.transpose(1, 2).data.numpy() 

If I remove .eval() things get worse. If I remove torch.no_grad() things get worse.
I also used:

    for name, param in model.named_parameters():
        param.requires_grad = False

for extra safety.

I even tried to move to ONNX, but I got an error that ONNX Caffe2 requires an interactive interpreter, which I can’t use.

Any help would be very much appreciated.
Thanks, Dan


The .data should not be necessary as you’re in a no_grad() block. The same goes for setting the parameters to not require gradients.
How do you measure memory usage?
Can you give a bit more context on your code? Does the model save some states by any chance?
Or do you save some outputs for later processing?

I removed ".data" and requires_grad = False. No change.
I am measuring memory using Ubuntu’s “System Monitor” and glances.

I don’t think my code has any states. It has a couple of static variables other than nn.modules.
The model is mostly sequential convolutions, normalizations, multiplications, sums…etc. Nothing too crazy. It’s not an ensemble or GAN or anything like that.
When training I was using a GPU and it would run iterations including validations for days without issues.

This is an example of the first 5 or so iterations. It continues to go up until the process dies from memory issues.

That is unexpected indeed.
Do you think you can provide a small code sample that reproduces the issue?

I can share with you the code by email if you promise to delete it afterwards and not share it with anyone else.
My email is ~~

It might be better if you can reduce the issue to a smaller code sample before sharing it?

  • Add a print of the memory usage change around your inference loop (using something like https://stackoverflow.com/questions/938733/total-memory-used-by-python-process for example).
  • Make a copy of your code out of your main repo
  • Replace your actual dataset with a bunch of input = torch.rand(some_hardcoded_size) inside the evaluation loop
  • Remove your instrumentation code (evaluating the accuracy / plotting etc)
  • Replace your model with a randomly initialized version (don’t load the state dict)
  • Replace your model with a single Linear layer that performs the mapping you want (with some views maybe)
  • Change the input/output sizes so that tests run quickly

At each step when you remove something, make sure that the memory behavior is still bad.
Let me know how far you can get. I may be able to look at your code but if it is big, it might take some time to reproduce / pinpoint the problem with the above process as I won’t be familiar with the code.
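
The first and last steps of that list could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard-library resource module (Unix only) to read the peak resident set size instead of the approach in the linked answer, and the names peak_rss_mb, measure_growth and make_linear_step, as well as the layer sizes, are made up for the example.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process in MiB (stdlib only, Unix)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure_growth(step, iterations=50, report_every=10):
    """Call step() repeatedly and return the peak RSS recorded after each call."""
    readings = []
    for i in range(iterations):
        step()
        readings.append(peak_rss_mb())
        if (i + 1) % report_every == 0:
            print(f"iter {i + 1:4d}: peak RSS {readings[-1]:.1f} MiB")
    return readings


def make_linear_step(in_features=64, out_features=35, batch=8):
    """Hypothetical stand-in for the real model: one Linear layer on random input."""
    import torch  # imported here so the helpers above work without PyTorch

    model = torch.nn.Linear(in_features, out_features).eval()

    def step():
        # Same pattern as the original inference code: no_grad, then to NumPy.
        with torch.no_grad():
            model(torch.rand(batch, in_features)).numpy()

    return step
```

Calling measure_growth(make_linear_step()) gives a baseline. If the readings flatten out after a few iterations with this stand-in but keep climbing once the real model or data pipeline is swapped into step(), the component that was swapped in is the likely culprit.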

I’ll give those steps a try. I’ll let you know if I got it / if I need more help.

Well, this is embarrassing, but it really seems not to be the fault of PyTorch. If I run random inputs of constant size directly through the net, the RAM doesn’t increase. I still haven’t found the real issue, so I’ll keep this open in case it turns out to be some weird combination of things that includes the net.

Happy to hear that you’re finding new stuff!

So it might be your dataloader? Let me know if there is some code or torch element you’re not sure about.

So after some more debugging I found that if I switch to PyTorch CPU, the RAM stays stable.
So it looks like it was a PyTorch-related bug after all. Nonetheless, I loved your keep-it-simple-stupid debugging methods. I think I’ll hang them up on the wall.



Hey, I’ve installed PyTorch CPU and my RAM keeps increasing on inference. Any clues?

Edit: actually, the problem only exists when I’m in a Jupyter notebook… If I do inference by running a .py file from the terminal, it works fine…