GPU Memory Leak in for loop

When I run my training code, GPU memory usage increases on every iteration.

for iSentence, sentence in enumerate(shuffledData):

    if iSentence % 100 == 0 and iSentence != 0:
        print("check")

    conll_sentence = [entry for entry in sentence if isinstance(entry, utils.ConllEntry)]
    conll_sentence = conll_sentence[1:] + [conll_sentence[0]]
    self.model.getWordEmbeddings(conll_sentence, True)
    del conll_sentence
    gc.collect()

In the getWordEmbeddings function:

def getWordEmbeddings(self, sentence, train):

    for root in sentence:
        c = float(self.wordsCount.get(root.norm, 0))
        dropFlag = not train or (random.random() < (c / (0.25 + c)))

        root.wordvec = self.wlookup(scalar(int(self.vocab.get(root.norm, 0))) if dropFlag
                                    else scalar(0)).cuda()
        root.posvec = self.plookup(scalar(int(self.pos[root.pos]))) if self.pdims > 0 else None

        root.evec = None
        root.ivec = cat([root.wordvec, root.posvec, root.evec])

    forward  = RNNState(self.surfaceBuilders[0])
    backward = RNNState(self.surfaceBuilders[1])

    for froot, rroot in zip(sentence, reversed(sentence)):
        forward = forward.next(froot.ivec)
        backward = backward.next(rroot.ivec)
        froot.fvec = forward()
        rroot.bvec = backward()

    for root in sentence:
        root.vec = cat([root.fvec, root.bvec])

    bforward  = RNNState(self.bsurfaceBuilders[0])
    bbackward = RNNState(self.bsurfaceBuilders[1])

    for froot, rroot in zip(sentence, reversed(sentence)):
        bforward = bforward.next(froot.vec)
        bbackward = bbackward.next(rroot.vec)
        froot.bfvec = bforward()
        rroot.bbvec = bbackward()

    for root in sentence:
        root.vec = cat([root.bfvec, root.bbvec])

I think the GPU memory leak is caused by conll_sentence on every iteration, because if I put

yield conll_sentence

after the call to self.model.getWordEmbeddings, GPU memory stays steady. I don't know whether the leak comes from the getWordEmbeddings function itself or from the code in the for loop.

I have tried to solve this by deleting objects with del and calling gc.collect(), but neither worked. I would really appreciate any help with this problem.
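One thing I noticed while debugging: del conll_sentence only deletes the local list, not the Entry objects inside it, because shuffledData still references the same objects, and each of them carries the tensors (and their autograd graphs) that getWordEmbeddings attached to it. Here is a minimal pure-Python sketch of that mechanism, where Graph and Entry are just stand-ins for a tensor's autograd graph and for utils.ConllEntry:

```python
import gc
import weakref

class Graph:   # stand-in for a tensor's autograd graph
    pass

class Entry:   # stand-in for a ConllEntry
    pass

shuffled_data = [[Entry(), Entry()]]     # long-lived, like shuffledData
sentence = shuffled_data[0]

conll_sentence = list(sentence)          # new list, but the SAME Entry objects
for root in conll_sentence:
    root.wordvec = Graph()               # like root.wordvec keeping its graph alive

graph_refs = [weakref.ref(r.wordvec) for r in sentence]

del conll_sentence                       # frees only the list wrapper
gc.collect()
# the graphs survive, because shuffled_data -> sentence -> Entry -> wordvec
assert all(ref() is not None for ref in graph_refs)

for root in sentence:                    # clearing the attributes does free them
    root.wordvec = None
gc.collect()
assert all(ref() is None for ref in graph_refs)
```

If this is the cause, then in PyTorch terms the usual remedies would be to clear the per-token attributes (wordvec, ivec, fvec, bvec, vec, ...) once the sentence has been processed, to store detached copies such as root.wordvec.detach() when the graph is no longer needed, or to run evaluation-only passes under torch.no_grad(). I am not sure which of these applies to my setup.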