Are you storing tensors attached to the computation graph in e.g. a list?
This would not be a real leak, since the increased memory usage is expected in this case (although this behavior is commonly referred to as a “memory leak”).
Try to narrow down which part of your code is causing the increase in memory usage, as e.g. save_prediction_batch and successes.append(res) might keep the entire computation graph alive if the stored tensors are still attached to it.
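
As a rough illustration of the difference, here is a minimal sketch. It assumes the stored value is a tensor that requires gradients; the model, the loss, and the list names attached and detached are hypothetical stand-ins and not taken from your code.

```python
import torch

# Hypothetical stand-in model; your actual model and loss will differ.
model = torch.nn.Linear(4, 1)

attached, detached = [], []

for _ in range(3):
    x = torch.randn(8, 4)
    loss = model(x).pow(2).mean()

    # Keeps loss.grad_fn, so the graph of every iteration stays in memory.
    attached.append(loss)

    # Stores only the value; the graph can be freed after the iteration.
    detached.append(loss.detach())

print([t.grad_fn is not None for t in attached])
print([t.grad_fn is not None for t in detached])
```

If you only need the number for logging, loss.item() returns a plain Python float and would work as well. You could check res.requires_grad or res.grad_fn before the append to see whether this applies to your code.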