CPU inference - RAM increases every iteration

The model I’m running causes RAM to increase with every inference iteration.
To load it I do the following:

import torch

def _load_model(model_path):
    model = ModelDef(num_classes=35)  # ModelDef is my network class
    # strict=False so missing/unexpected keys in the checkpoint are ignored
    model.load_state_dict(torch.load(model_path, map_location="cpu"), strict=False)
    model.eval()  # inference mode for dropout/batchnorm layers
    return model
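
For context, the load happens once at startup and the result is reused for every call (the path here is just a placeholder):

    # hypothetical call site: load once, reuse across all iterations
    _model = _load_model("model.pt")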

To run it I do:

    gc.collect()
    with torch.no_grad():
        # swap dims 1 and 2 so channels come first for the model
        input_1_torch = torch.from_numpy(np.float32(input_1)).transpose(1, 2)
        res = _model(input_1_torch)
        # .numpy() is enough here; under no_grad there is no graph to detach
        output = res.transpose(1, 2).numpy()
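
To show what I mean, the growth appears in a plain loop like this (a minimal sketch: psutil is only there to print the process RSS, and the input shape is a made-up placeholder):

    import gc

    import numpy as np
    import psutil
    import torch

    proc = psutil.Process()
    for i in range(100):
        input_1 = np.random.rand(1, 1000, 40)  # placeholder shape
        gc.collect()
        with torch.no_grad():
            input_1_torch = torch.from_numpy(np.float32(input_1)).transpose(1, 2)
            output = _model(input_1_torch).transpose(1, 2).numpy()
        # resident set size keeps climbing from one iteration to the next
        print(i, proc.memory_info().rss // (1024 * 1024), "MiB")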

If I remove .eval() the growth gets worse; if I remove torch.no_grad() it also gets worse.
I also used:

    for name, param in model.named_parameters():
        param.requires_grad = False  # make sure no grads are tracked on the weights

for extra safety.
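
If it matters, I believe the loop above is equivalent to the single in-place call on recent PyTorch versions:

    model.requires_grad_(False)  # should disable grad tracking on all parameters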

I even tried moving to ONNX, but I got an error saying that onnx-caffe2 requires an interactive interpreter, which I can’t use in my setup.
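
The export itself was roughly the following (the dummy input shape is a placeholder); it’s the caffe2 backend step afterwards where I hit the interactive-interpreter error:

    # hypothetical export; the dummy input only fixes the traced shape
    dummy = torch.randn(1, 40, 1000)  # placeholder shape
    torch.onnx.export(_model, dummy, "model.onnx")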

Any help would be very much appreciated.
Thanks, Dan