In-place gradient error

Hi, I’ve noticed that updating part of bij in place, i.e. bij[i] = something, breaks the backward pass. When I change it to a list over i, it works. I read another post confirming this, but is this really the behavior PyTorch exhibits? Is there no way to iterate over part of a tensor and make changes without converting it to a list first? Below is the broken code; the problem line is the bij[i] update near the bottom (marked with a comment).

def route(self, x, n_iter = 3):
    x = x.view(-1, 10, 1152, 16)

    outputs = Variable(torch.FloatTensor(x.size(0), 10, 16))
    bij = self.bij.clone()
    self.bs = []
    for r in range(n_iter):
        #self.bs.append(self.bij.clone())
        cij = F.softmax(bij, dim=1)
        for i in range(0,10):
            v = cij[i].matmul(x[:,i])
            v = self.squash(v)
            outputs[:,i] = v.clone()
            
            for d in range(0, x.size(0)):
                bij[i] = bij[i].clone() + torch.matmul(x.clone()[d,i,:], v.clone()[d])  # <-- this in-place update is what breaks backward
        #return outputs
    return outputs
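
For reference, this is roughly what I mean by the list version that works: instead of assigning into outputs[:, i] and bij[i] in place, I collect the per-capsule results in Python lists and stack them into fresh tensors at the end of each pass. This is just a sketch, assuming self.bij has shape (10, 1152) and the same shapes as above, with the Variable wrapper dropped and the inner batch loop replaced by a batched matmul:

def route(self, x, n_iter=3):
    x = x.view(-1, 10, 1152, 16)                 # (batch, 10, 1152, 16)

    bij = self.bij.clone()                       # assumed shape (10, 1152)
    for r in range(n_iter):
        cij = F.softmax(bij, dim=1)
        outs = []       # per-capsule outputs, stacked instead of outputs[:, i] = ...
        new_rows = []   # per-capsule bij rows, stacked instead of bij[i] = ...
        for i in range(10):
            v = self.squash(cij[i].matmul(x[:, i]))   # (batch, 16)
            outs.append(v)
            # agreement summed over the batch, same as the inner d-loop above
            agreement = torch.matmul(x[:, i], v.unsqueeze(-1)).squeeze(-1).sum(dim=0)  # (1152,)
            new_rows.append(bij[i] + agreement)
        bij = torch.stack(new_rows)               # fresh tensor, no in-place write
        outputs = torch.stack(outs, dim=1)        # (batch, 10, 16)
    return outputs

With this version backward runs without the in-place error, since autograd never sees a write into a slice of a tensor it needs for the graph.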