Hello,

I am using torchsort.soft_rank to get ranked indices from the model's output logits, and then compute a loss function for the TSP (Travelling Salesman Problem) as follows. (I had previously tried argsort, but switched to soft_rank because argsort is not differentiable.)

```python
import torch
import torchsort

# assuming pdist is Euclidean pairwise distance; adjust if you use another metric
pdist = torch.nn.PairwiseDistance(p=2)

points = torch.rand(args.funcd, 2).cuda().requires_grad_()

def path_reward(rank):
    # a = points[rank]
    a = points.index_select(0, rank)
    # pair each city with its successor, wrapping around to close the tour
    b = torch.cat((a[1:], a[0].unsqueeze(0)))
    return pdist(a, b).sum()

# soft_rank returns float ranks in [1, n]; cast to 0-based integer indices
rank = torchsort.soft_rank(logits, regularization_strength=0.001).floor().long() - 1
```
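As an aside, the `floor().long()` cast by itself already stops gradient tracking, since integer tensors cannot carry gradients. A quick self-contained check of where the graph is cut (the `r = x * 3.0` line is just a stand-in for any differentiable soft rank):

```python
import torch

x = torch.rand(4, requires_grad=True)
r = x * 3.0                  # stand-in for a differentiable soft rank; still tracked
idx = r.floor().long() - 1   # integer cast: autograd stops tracking from here on
print(r.requires_grad)       # True
print(idx.requires_grad)     # False
```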

However, the network weights are still not being updated by the optimizer. I suspect this may be because of index_select, as I have read that it is non-differentiable with respect to the index (and the floor().long() cast presumably detaches the graph as well, since integer tensors cannot carry gradients). Could you recommend an alternative way of computing a TSP loss from the logit outputs?
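One possible direction, sketched below under assumptions: instead of hard indexing, keep the soft ranks as floats and build a relaxed permutation matrix that softly gathers the cities, so gradients flow all the way back to the logits. The function name `soft_tour_length`, the temperature `tau`, and the Gaussian kernel used to build the matrix are all illustrative choices of mine, not part of torchsort:

```python
import torch

def soft_tour_length(soft_ranks, points, tau=0.1):
    """Differentiable TSP tour length from soft (float) ranks.

    Row i of the relaxed permutation matrix P softly selects the city whose
    soft rank is closest to tour position i; gathering with a matmul instead
    of index_select keeps the operation differentiable in soft_ranks.
    """
    n = points.shape[0]
    positions = torch.arange(1, n + 1, dtype=soft_ranks.dtype,
                             device=soft_ranks.device)
    # P[i, j] ~ 1 when city j's soft rank is close to tour position i
    P = torch.softmax(
        -(soft_ranks.unsqueeze(0) - positions.unsqueeze(1)) ** 2 / tau, dim=1)
    tour = P @ points                       # soft, differentiable "points[rank]"
    nxt = torch.roll(tour, shifts=-1, dims=0)
    return (tour - nxt).norm(dim=1).sum()   # closed-tour length
```

Usage would then keep the output of soft_rank as floats, with no floor/long cast, e.g. `loss = soft_tour_length(torchsort.soft_rank(logits, regularization_strength=0.001).squeeze(0), points)`, so that `loss.backward()` reaches the logits. Whether this relaxation is tight enough for your TSP sizes is something you would need to verify; annealing `tau` toward 0 during training is one common trick.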

Note: the model learns perfectly when I swap in another loss function such as Schwefel. Hence, I have only shared how I compute the loss / reward, since that seems to be where a solution is needed for the optimizer to train / update the model properly.

Happy new year.

Sincerely,

Kamer