GPU memory leak using the Facebook maskrcnn-benchmark implementation

Hello,
I'm working with the Facebook implementation of Faster R-CNN: https://github.com/facebookresearch/maskrcnn-benchmark.
In this implementation, the classification/regression predictors are defined in https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/maskrcnn_benchmark/modeling/roi_heads/box_head/roi_box_predictors.py
I would like to do multi-label classification (assume we have N classes), so I chose to run N binary classifiers in parallel (N copies of the predictor cited above), like this:

import torch
from torch import nn

# import assumed from maskrcnn-benchmark
from maskrcnn_benchmark.modeling.roi_heads.roi_heads import build_roi_heads


class Roi_head(nn.Module):
    def __init__(self, cfg, args):
        super(Roi_head, self).__init__()
        self.classes_number = N  # N = number of classes, one binary ROI head per class
        for i in range(1, self.classes_number + 1):
            setattr(self, "roi_heads%d" % i, build_roi_heads(cfg))

    def forward(self, features, proposals, targets=None):
        labels = targets[0].get_field("labels")
        device = labels.device
        detector_losses = 0
        x, result = None, None
        for i in range(1, self.classes_number + 1):
            # run head i only if at least one box of class i is present in the targets
            if torch.nonzero(labels == i).squeeze(1).shape[0] != 0:
                # binary target for head i: every box becomes a positive
                label = torch.ones(labels.shape, dtype=torch.float32, device=device)
                targets[0].add_field("labels", label)
                x, result, detector_loss = getattr(self, "roi_heads%d" % i)(features, proposals, targets)
                detector_losses += sum(loss for loss in detector_loss.values())
        # outputs of the last head that ran, plus the losses summed over all heads
        return x, result, detector_losses

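In my training loop, the summed detector loss is then backpropagated together with the RPN losses, roughly like this (a simplified sketch of my adaptation of the standard maskrcnn-benchmark train step; proposal_losses and optimizer are placeholder names for the corresponding objects in that loop):

    # detector_losses: scalar tensor returned by the Roi_head module above
    # proposal_losses: dict of RPN losses (placeholder name)
    total_loss = sum(loss for loss in proposal_losses.values()) + detector_losses
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()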
But when I try to train this, I sometimes see an unexplained GPU memory increase (about 100 MB) during the backward pass (not on every iteration), which eventually leads to an out-of-memory error…
Do you have any ideas that could help me understand what causes this leak during backward?
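For what it's worth, this is roughly how I'm tracking the growth between iterations (a minimal sketch; I call it right after the backward/step of each iteration in the training loop):

    import torch

    def log_gpu_memory(iteration):
        # memory currently held by tensors vs. peak observed so far, in MB
        allocated = torch.cuda.memory_allocated() / 2 ** 20
        peak = torch.cuda.max_memory_allocated() / 2 ** 20
        print("iter %d: allocated=%.1f MB, peak=%.1f MB" % (iteration, allocated, peak))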

Thanks

I think I might be encountering a similar problem. May I ask whether you have solved it?