I’m new to optimal transport theory. I want to minimize the domain gap between two data distributions (two datasets) using a metric that is differentiable, since I want to use it as a loss function. The Sinkhorn distance (an entropic approximation of the Wasserstein distance) looks interesting, but I couldn’t find an implementation outside of GANs. I’m already training my networks in an adversarial manner, but it still isn’t efficient. How can I use something from OT as a loss to minimize the gap between the two distributions?
This comes recommended for PyTorch: https://github.com/jeanfeydy/geomloss
A more mature and comprehensive implementation of all things OT-related is https://github.com/rflamary/POT; however, they do not yet have PyTorch support. That is on their to-do list.
Indeed, the GeomLoss package is really efficient for computing entropic variants of optimal transport. You have access both to entropic regularized OT and to the Sinkhorn divergence. They also provide typical ground costs, but you can supply your own. I would suggest using the Sinkhorn divergence instead of entropic regularized OT: the former interpolates between OT and MMD losses, while the latter does not satisfy the separability axiom (i.e. OT_eps(a,a) > 0). You can find more details in: https://arxiv.org/abs/1810.08278
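To make the separability point concrete, here is a minimal plain-NumPy sketch (not the GeomLoss internals, just the textbook iterations) of entropic OT via the Sinkhorn algorithm, and the debiased Sinkhorn divergence built from it. The epsilon value and iteration count are illustrative choices; note that the entropic cost of a point cloud against itself is strictly positive, while the divergence is zero by construction.

```python
import numpy as np

def entropic_ot(x, y, eps=0.5, n_iters=200):
    """Entropic OT cost between uniform point clouds x (n,d) and y (m,d),
    with squared-Euclidean ground cost, via Sinkhorn fixed-point iterations."""
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # (n, m) cost matrix
    K = np.exp(-C / eps)                                # Gibbs kernel
    a = np.full(x.shape[0], 1.0 / x.shape[0])           # uniform source weights
    b = np.full(y.shape[0], 1.0 / y.shape[0])           # uniform target weights
    u = np.ones_like(a)
    for _ in range(n_iters):                            # alternating scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                     # approximate transport plan
    return (P * C).sum()                                # transport cost of the plan

def sinkhorn_divergence(x, y, eps=0.5):
    """Debiased divergence: S(x,y) = OT(x,y) - (OT(x,x) + OT(y,y)) / 2."""
    return entropic_ot(x, y, eps) - 0.5 * (
        entropic_ot(x, x, eps) + entropic_ot(y, y, eps)
    )

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 2))
print(entropic_ot(x, x))         # strictly positive: entropic OT is biased
print(sinkhorn_divergence(x, x)) # zero: the divergence satisfies S(x,x) = 0
```

This is why the Sinkhorn divergence is usually the better training loss: its minimum over the second argument is attained at the first, which entropic OT does not guarantee.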
The library linked below also implements OT via the Sinkhorn algorithm.
I am using the GeomLoss package to compute the Sinkhorn distance. My inputs are two distributions (DATA1 and DATA2) with a batch size of 64 and 99 dimensions. How can I pass them into the loss? Should I flatten them into a single vector of size 64*99?
from geomloss import SamplesLoss

# DATA1, DATA2: tensors of shape (64, 99) -- 64 samples, 99 features each
loss = SamplesLoss(loss="sinkhorn", p=2, blur=.5)
L = loss(DATA1.float(), DATA2.float())
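For the shape question: SamplesLoss treats each input as a point cloud of shape (N, D), so you can pass the two (64, 99) tensors directly, with each row being one 99-dimensional sample; no flattening into a 64*99 vector is needed. A quick NumPy sketch (random stand-ins for DATA1/DATA2, which are assumptions here) shows why: the ground cost the loss works with is a pairwise matrix between the 64 samples of each cloud, not a distance between two flat vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
DATA1 = rng.normal(size=(64, 99))  # 64 samples, 99 features
DATA2 = rng.normal(size=(64, 99))

# Pairwise squared-Euclidean ground cost between the two clouds:
# one entry per (sample-from-DATA1, sample-from-DATA2) pair.
C = ((DATA1[:, None, :] - DATA2[None, :, :]) ** 2).sum(-1)
print(C.shape)  # (64, 64)
```

Flattening to a single 64*99 vector would instead compare the two batches as two points in a 6336-dimensional space, which is not what the Sinkhorn loss is for.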