Add one tensor from dataloader to optimiser

To me, what you are doing is a gradient flow. I would suggest you have a look at the very good post on Wasserstein gradient flows here: https://www.kernel-operations.io/geomloss/_auto_examples/comparisons/plot_gradient_flows_2D.html#sphx-glr-auto-examples-comparisons-plot-gradient-flows-2d-py

In the gradient flow function, they basically do:

for i in range(Nsteps):
    # Compute the loss and its gradient with respect to x
    L_αβ = loss(x, y)
    [g] = torch.autograd.grad(L_αβ, [x])
    # Manual descent step; writing to x.data keeps the update out of the autograd graph
    x.data -= learning_rate * len(x) * g
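
If it helps, here is a minimal self-contained sketch of that same loop. Note that the quadratic loss, the point clouds x and y, and the hyperparameter values below are placeholders I made up for illustration, not the Sinkhorn loss from the geomloss example:

import torch

# Placeholder data: x is the point cloud we move, y is the fixed target
# (shapes and values are made up for illustration)
torch.manual_seed(0)
x = torch.randn(100, 2, requires_grad=True)  # leaf tensor, updated by hand
y = torch.randn(100, 2)

def loss(x, y):
    # Simple quadratic stand-in for the Wasserstein/Sinkhorn loss
    return ((x - y) ** 2).sum(dim=1).mean()

learning_rate = 0.05
Nsteps = 50
for i in range(Nsteps):
    L_αβ = loss(x, y)
    [g] = torch.autograd.grad(L_αβ, [x])
    # Same update as above: in-place on x.data, so autograd never sees it
    x.data -= learning_rate * len(x) * g

print(f"final loss: {loss(x, y).item():.6f}")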

This way, you do not need to define an optimizer, since you perform the update step yourself! Hope this helps :slight_smile:
