TSP Solver Not Learning


I am using torchsort.soft_rank to get ranked indices from the model's output logits, and then calculate a loss function for the TSP (Travelling Salesman Problem) as follows. (I had previously tried argsort, but switched to soft_rank since argsort is not differentiable.)

points = torch.rand(args.funcd, 2).cuda().requires_grad_()

def path_reward(rank):
    # a = points[rank]
    a = points.index_select(0, rank)
    b = torch.cat((a[1:], a[0].unsqueeze(0)))  # shift by one to form the tour edges
    return pdist(a, b).sum()

rank = torchsort.soft_rank(logits, regularization_strength=0.001).floor().long() - 1

However, the network weights are still not being updated by the optimizer. I suspect this may be because of index_select, as I have read that it is non-differentiable with respect to the index argument. Could you recommend an alternative way to compute a TSP loss from the logit outputs?

Note: the model learns perfectly when I replace this with another loss function such as Schwefel. Hence I have only shared how I calculate the loss / reward, as that seems to be where a solution is needed for the optimizer to train / update the model properly.

Happy new year.


Based on your code, you are detaching rank from the computation graph by calling .long() on it.
Integer types are not differentiable, so you would need to stick to floating-point types.
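Here is a small self-contained sketch of both points: the integer cast detaching the graph, and one possible differentiable workaround that turns the soft ranks into a soft permutation matrix and reorders the points by matrix multiplication instead of index_select. The soft_rank values are hard-coded here to stand in for torchsort output, and the temperature 0.1 is an arbitrary choice you would need to tune; this is a sketch of one approach, not the only fix.

```python
import torch
import torch.nn.functional as F

points = torch.rand(5, 2)

# A float tensor of "soft ranks", standing in for torchsort.soft_rank output.
soft_rank = torch.tensor([2.0, 5.0, 1.0, 4.0, 3.0], requires_grad=True)

# Casting to an integer dtype detaches the tensor from the graph:
hard_rank = soft_rank.floor().long() - 1
assert not hard_rank.requires_grad  # no gradient can flow through this

# Differentiable alternative: build a soft permutation matrix from the
# continuous ranks, then reorder the points with a matmul.
positions = torch.arange(1.0, 6.0)
logits = -(positions.unsqueeze(1) - soft_rank.unsqueeze(0)) ** 2 / 0.1
P = torch.softmax(logits, dim=1)  # row i puts most weight on the point ranked near i
a = P @ points                    # soft reordering, stays in the graph

b = torch.cat((a[1:], a[:1]))     # shift by one to form the tour edges
tour_length = F.pairwise_distance(a, b).sum()
tour_length.backward()
assert soft_rank.grad is not None  # gradients reach the soft ranks
```

With a small temperature the rows of P approach one-hot vectors, so the soft tour length approaches the hard tour length while remaining differentiable end to end.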