Wasserstein loss layer/criterion

AjayTalati · March 23, 2017, 11:04am

Hi @tom, I just saw one of Marco Cuturi’s recent papers, and if I understood correctly, it gives a method to calculate the divergence between distributions using something like SGD, rather than Sinkhorn’s algorithm.

That’s much more understandable, but I’m not sure how easy it is to implement. Basically, it would involve constructing a layer which itself runs an SGD loop! Pretty funky.

I know that there’s already a learnable quadratic programming layer that’s been implemented, https://github.com/locuslab/qpth, but this seems more general than that.

Anyway, here’s the link: "Stochastic Optimization for Large-scale Optimal Transport"

https://arxiv.org/abs/1605.08527
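
To make the "SGD loop inside a layer" idea a bit more concrete, here is a rough NumPy sketch of how I read the discrete case. It runs averaged stochastic gradient ascent on the semi-dual of entropy-regularized optimal transport. Treat it as a sketch under assumptions, not the paper's reference implementation: the function names (`semi_dual_value`, `sgd_semi_dual`), the step size `step0 / sqrt(k)`, and the plain KL regularizer (so no extra additive constant in the objective) are my own choices for illustration. It only computes the forward value and is not yet a differentiable layer.

```python
import numpy as np


def semi_dual_value(v, a, b, C, eps):
    """Semi-dual objective (assumed form, plain KL regularizer):
    <v, b> - eps * sum_i a_i * log(sum_j b_j * exp((v_j - C_ij) / eps))."""
    z = np.log(b)[None, :] + (v[None, :] - C) / eps   # shape (n, m)
    zmax = z.max(axis=1, keepdims=True)               # stabilise log-sum-exp
    lse = zmax[:, 0] + np.log(np.exp(z - zmax).sum(axis=1))
    return float(v @ b - eps * (a @ lse))


def sgd_semi_dual(a, b, C, eps=0.1, n_iter=5000, step0=1.0, seed=0):
    """Averaged SGD (ascent) on the semi-dual potential v.

    a, b : source / target weights (each sums to 1)
    C    : cost matrix, C[i, j] = c(x_i, y_j)
    Returns the running average of the iterates.
    """
    rng = np.random.default_rng(seed)
    n, m = C.shape
    v = np.zeros(m)
    v_avg = np.zeros(m)
    for k in range(1, n_iter + 1):
        i = rng.choice(n, p=a)                 # sample one source point x_i ~ a
        z = np.log(b) + (v - C[i]) / eps
        chi = np.exp(z - z.max())
        chi /= chi.sum()                       # softmax over target atoms
        v = v + (step0 / np.sqrt(k)) * (b - chi)   # stochastic gradient is b - chi
        v_avg += (v - v_avg) / k               # running mean of the iterates
    return v_avg


if __name__ == "__main__":
    # Toy problem: two atoms on each side, squared-distance cost.
    a = np.array([0.5, 0.5])
    b = np.array([0.25, 0.75])
    x = np.array([0.0, 1.0])
    y = np.array([0.0, 1.0])
    C = (x[:, None] - y[None, :]) ** 2
    v = sgd_semi_dual(a, b, C, eps=0.1)
    print("semi-dual estimate of regularized OT cost:", semi_dual_value(v, a, b, C, 0.1))
```

Any `v` gives a lower bound on the regularized transport cost, so the loop can be stopped early and still return something meaningful. Turning this into a layer would still need a backward pass with respect to the inputs, which the sketch does not attempt.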