Need help with custom forward pass through network

Hello,
I am trying to train a network in a similar way to the one proposed in https://arxiv.org/abs/1703.05175 (prototypical networks).
I think I have either misunderstood the paper or something strange is happening with my gradients.

Each mini-batch in my network contains 10-32 datapoints / classes to classify, plus 3 examples from each class used to compute class ‘centres’.
Let’s say that there are always 10 classes. I then compute the cosine similarity between each of the 10 datapoints and the 10 centres and use a softmax to normalize.

The labels for each mini-batch are always 1-10, and the network is trained by minimizing the NLL loss. The loss computes fine, but when I call backward() on it, I get an error: none of the leaf nodes require gradients.
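To make the intended computation concrete, here is a stripped-down, self-contained sketch of the scoring and loss with toy data (shapes picked to match my setup; note NLLLoss wants 0-based class indices, so the 1-10 labels would be shifted to 0-9):

    import torch
    import torch.nn.functional as F
    from torch.autograd import Variable

    # Toy stand-ins: 10 query embeddings scored against 10 class centres (dim 2048).
    queries = Variable(torch.randn(10, 2048), requires_grad=True)
    centers = Variable(torch.randn(10, 2048))

    # Cosine similarity of every query against every centre -> a (10, 10) score matrix.
    q = queries.unsqueeze(1).expand(-1, centers.size(0), -1)
    c = centers.unsqueeze(0).expand(queries.size(0), -1, -1)
    scores = F.cosine_similarity(q, c, dim=2)

    # Softmax-normalize over the centres and minimize NLL against the true class.
    log_probs = F.log_softmax(scores, dim=1)
    loss = F.nll_loss(log_probs, Variable(torch.arange(10).long()))
    loss.backward()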
To check whether I was doing something really silly, I fed ‘features’ (in the code snippet below) into a linear layer with 10 outputs and passed that to my loss; backward() then works fine. So I guess something is going wrong when I compute the cosine distances. I am doing this within the forward pass:

    features = self.fc(x)

    enroll_embs = features[:enr, :]
    test_embs = features[enr:, :]

    output = self.output(enroll_embs)  # for sanity check

    speaker_centers = torch.Tensor(enroll_embs.size(0), 2048)

    ptr = 0
    for ind in range(enroll_embs.size(0)):
        center = test_embs[ptr:ptr+3, :]
        center = torch.mean(center, 0)
        speaker_centers[ind, :] = center.data
        ptr += 3

    speaker_centers = Variable(speaker_centers.cuda())
    cosine_scores = Variable(torch.Tensor(enroll_embs.size(0), 32).cuda())

    for enr in enroll_embs:
        rep_enr = enr.repeat(enroll_embs.size(0), 1)
        cosine_score = F.cosine_similarity(rep_enr, speaker_centers)
        #normalized_scores = F.log_softmax(cosine_scores)
        cosine_scores[ind, :] = F.log_softmax(cosine_score.data)

    return cosine_scores, output

Any advice would be most appreciated.

Thanks

It would be better to implement it in this way:

    cosine_scores = []
    for enr in enroll_embs:
        cosine_scores.append(.....)

    cosine_scores = torch.stack(cosine_scores)
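For example, reusing the cosine / log-softmax computation from your snippet, but without going through .data anywhere (a rough sketch; speaker_centers would also need to be built without .data so that it stays in the graph):

    cosine_scores = []
    for enr in enroll_embs:
        rep_enr = enr.repeat(speaker_centers.size(0), 1)   # compare against every centre
        row = F.cosine_similarity(rep_enr, speaker_centers)
        cosine_scores.append(F.log_softmax(row, dim=0))    # keep it a Variable, no .data

    cosine_scores = torch.stack(cosine_scores)             # (N, N), still differentiable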


hi @chenyuntc,

Thanks for the reply, but I still get
RuntimeError: there are no graph nodes that require computing gradients
when I pass cosine_scores to my loss function and call backward().

Hi @chenyuntc, I figured it out.
I think I was not being consistent before, i.e. not everything was a ‘Variable’. I can now call backward().
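
In case it helps anyone else, here is a simplified version of what the fixed forward pass amounts to (same variable names as in my original snippet; I have condensed the centre loop into a view/mean and build the scores with the list + torch.stack pattern suggested above — the key change is that nothing goes through .data or a fresh Variable()):

    features = self.fc(x)
    enroll_embs = features[:enr, :]      # embeddings to classify, one per class
    test_embs = features[enr:, :]        # support examples, 3 per class

    # Class centres: mean over each group of 3 support embeddings.
    # Computed on the Variables themselves, so they stay in the graph.
    speaker_centers = test_embs.view(-1, 3, test_embs.size(1)).mean(1)

    scores = []
    for emb in enroll_embs:
        row = F.cosine_similarity(emb.repeat(speaker_centers.size(0), 1), speaker_centers)
        scores.append(F.log_softmax(row, dim=0))
    cosine_scores = torch.stack(scores)  # log-probabilities, one row per datapoint

    return cosine_scores, self.output(enroll_embs)

With everything kept as Variables end to end, backward() runs without the “no graph nodes” error.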

Thanks again