Autograd for Optimal Transport distance


I’m currently using the optimal transport distance as the loss for an inverse problem, and I call PyTorch’s loss.backward() to compute its gradient. However, the resulting gradient doesn’t produce any visible update to my model. Does PyTorch autograd support the optimal transport distance?

Thank you.

Here are snippets of my code:

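import torch
import torch.nn as nn

# xs, xr, noisy_data, pde_solver, batches, BATCHES and T are defined elsewhere
for b in batches:
    # take every BATCHES-th entry, starting at offset b
    batch_xs = xs[b::BATCHES]
    batch_xr = xr[b::BATCHES]

    data_sim = pde_solver(batch_xs, batch_xr)
    data_true = noisy_data[b::BATCHES, ...]

    # normalize both datasets to densities and their CDFs
    sim, data_sim_cdf = OT_data_normalization(data_sim, 6.0, 1.0)
    true, data_true_cdf = OT_data_normalization(data_true, 6.0, 1.0)

    with torch.cuda.amp.autocast():
        loss = OTloss(data_sim_cdf, data_true_cdf, sim, T)

    loss.backward()
    # clamp each gradient element to [-1e3, 1e3]
    nn.utils.clip_grad_value_(pde_solver.model, clip_value=1e3)
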
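# Data normalization function
def OT_data_normalization(data, b, c):
    # flatten (sources, receivers, time) into (traces, time)
    ns, nr, nt = data.shape
    data = data.reshape(ns * nr, nt)

    # softplus-style map to strictly positive values
    data = torch.log(torch.exp(b * data) + 1)
    # shift by c and normalize each trace so it sums to 1
    data_norm = (data + c) / torch.sum(data + c, dim=-1, keepdim=True)
    # cumulative distribution along the time axis
    data_cdf = torch.cumsum(data_norm, dim=-1)
    return data_norm, data_cdf
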
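# Compute OT loss
def OTloss(cdf_sim, cdf_obs, dsim, t):
    # index of the first observed CDF value >= each simulated CDF value
    idx = torch.searchsorted(cdf_obs, cdf_sim)
    # indices equal to len(t) fall past the end; map them to the last sample
    idx[idx == t.shape[-1]] = -1
    tidx = t[idx]
    # squared time displacement, weighted by the simulated density
    return ((t - tidx) ** 2 * dsim).sum()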

I don’t see any obvious issues in your code, but since it’s not executable I cannot verify its functionality.
You could check whether all expected parameters in your model receive a valid gradient by inspecting their .grad attribute before and after the first backward() call. After that, you could check the gradient values or magnitudes: they might be so small that almost no change is visible.
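
The check described above could look roughly like the following. This is a minimal sketch: the small `model`, the random input `x`, and the placeholder loss are assumptions standing in for the real PDE solver and OT loss, and `grad_report` is a hypothetical helper name.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real model; any nn.Module works the same way.
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))

def grad_report(module):
    """Map each parameter name to its gradient norm, or None if .grad is unset."""
    report = {}
    for name, p in module.named_parameters():
        report[name] = None if p.grad is None else p.grad.norm().item()
    return report

# Before the first backward() no parameter has a gradient yet.
before = grad_report(model)

x = torch.randn(16, 4)
loss = model(x).pow(2).mean()  # placeholder loss, not the OT loss
loss.backward()

# After backward(), a parameter that still reports None is not connected to the loss.
after = grad_report(model)
for name, norm in after.items():
    print(f"{name}: before={before[name]}, after={norm}")
```

A parameter whose entry stays None after backward() is detached from the loss, while a very small norm suggests the update is real but too small to notice at the current learning rate.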

Hi @ptrblck! Many thanks for your suggestions! Sorry I couldn’t provide the complete code here because it is rather large. I followed your suggestions and found that the gradient is not correct and, as you mentioned, its value is very small. I’m not sure how accurate autograd is at approximating the gradient of a complicated loss w.r.t. the PDE parameters. I need to check this out. Thanks a lot!
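
One thing that may be worth checking, offered as an observation about the posted OTloss rather than a confirmed diagnosis: torch.searchsorted returns integer indices, so autograd treats tidx as a constant and the gradient reaches the input only through dsim. The sketch below illustrates this on random 2-D data and also runs torch.autograd.gradcheck on the normalization step in float64. The lower-case function name, the tensor shapes, and the uniform time axis `t` are assumptions made for the example.

```python
import torch

def ot_data_normalization(data, b, c):
    # Same arithmetic as the posted OT_data_normalization, for 2-D (traces, time) input.
    data = torch.log(torch.exp(b * data) + 1)
    data_norm = (data + c) / torch.sum(data + c, dim=-1, keepdim=True)
    return data_norm, torch.cumsum(data_norm, dim=-1)

torch.manual_seed(0)
nt = 50
t = torch.linspace(0.0, 1.0, nt, dtype=torch.float64)  # assumed time axis
data_sim = torch.randn(3, nt, dtype=torch.float64, requires_grad=True)
data_obs = torch.randn(3, nt, dtype=torch.float64)

sim, cdf_sim = ot_data_normalization(data_sim, 6.0, 1.0)
_, cdf_obs = ot_data_normalization(data_obs, 6.0, 1.0)

# The lookup yields integer indices that are not part of the autograd graph.
idx = torch.searchsorted(cdf_obs, cdf_sim)
idx_is_integer = idx.dtype == torch.int64
idx_tracks_grad = idx.requires_grad

idx[idx == t.shape[-1]] = -1
tidx = t[idx]
loss = ((t - tidx) ** 2 * sim).sum()
loss.backward()

# A gradient still arrives, but only via the density weights `sim`.
grad_reaches_input = data_sim.grad is not None

# The smooth normalization step can be compared against finite differences.
probe = torch.randn(2, 8, dtype=torch.float64, requires_grad=True)
smooth_ok = torch.autograd.gradcheck(
    lambda d: ot_data_normalization(d, 6.0, 1.0)[1], (probe,)
)

print(idx_is_integer, idx_tracks_grad, grad_reaches_input, smooth_ok)
```

If the dependence of the transport map on the simulated CDF matters for the inversion, the index lookup would need a differentiable replacement, for example interpolating t between neighbouring CDF samples; that is a design choice the original post does not cover.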