Hi everyone! I’m new to PyTorch and I’m working on a custom model based on this Medium article, but for some reason my weights aren’t updating. When I print out the gradients of the weights during training, I just see zero tensors. My model and training loop are shown below:
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        weights = torch.zeros([n_vec, 2])  # n_vec is defined elsewhere in my script
        self.weights = nn.Parameter(weights)

    def forward(self):
        # pairwise Euclidean distances between the learned vectors
        return torch.cdist(self.weights, self.weights, p=2)
def training_loop(model, optimizer, n=1000):
    losses = []
    for i in range(n):
        preds = model()
        loss = torch.sum((preds - x) ** 2)  # x is my target distance matrix, defined elsewhere
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        losses.append(loss.item())  # store the scalar so the graph can be freed
    return losses
m = Model()
opt = torch.optim.Adam(m.parameters(), lr=0.005)
losses = training_loop(m, opt)
I can’t figure out why this is happening; the only thing I can think of is that torch.cdist is somehow interfering with autograd. Any help would be greatly appreciated!
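In case it helps, here’s a minimal standalone version of what I’m seeing. I’ve hard-coded n_vec = 4 and a random stand-in for x just for illustration; these aren’t my real values:

```python
import torch

n_vec = 4  # small size, just for this repro
w = torch.nn.Parameter(torch.zeros([n_vec, 2]))
x = torch.rand(n_vec, n_vec)  # stand-in target matrix

d = torch.cdist(w, w, p=2)  # every row of w is identical, so d is all zeros
loss = torch.sum((d - x) ** 2)
loss.backward()

print(d)
print(w.grad)  # comes out as a zero tensor for me, matching what I see in training
```

Running this, w.grad is all zeros even though the loss itself is nonzero, which is exactly the behavior I get in the full training loop.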