Suggestions for making this equation autograd-able?

LOGUNIVPM · July 30, 2024, 3:51pm

Hi, I am dealing with a spintronic problem where the loss is a Hamiltonian of the type:

        energy = 0
        for i in range(self.Npbits):
            tmp = 0.5 * torch.dot(self.J[i], pbits) + self.h[i]
            energy -= tmp * pbits[i]
        return energy

J is a square matrix, and J[i] one of its lines. pbits is a vector of the same length.
I doubt this solution will not allow the gradient to be automatically computed because of the “for” loop and because I’m taking the rows of J by indexing. Am I wrong? Is there a definitive guide to writing custom code that can be autograd-able?

Best

ptrblck · July 30, 2024, 4:46pm

The loop and the indexing operation won’t detach the tensor from the computation graph. Did you see any issues using your approach?

KFrank · July 30, 2024, 10:44pm

Hi Leonardo!

As an aside, you can rewrite this without the loop:

>>> import torch
>>> print (torch.__version__)
2.4.0
>>>
>>> _ = torch.manual_seed (2024)
>>>
>>> Npbits = 5
>>> J = torch.randn (Npbits, Npbits)
>>> pbits = torch.randn (Npbits)
>>> h = torch.randn (Npbits)
>>>
>>> energy = 0   # compute with loop
>>> for  i in range (Npbits):
...     tmp = 0.5 * torch.dot (J[i], pbits) + h[i]
...     # print (tmp.shape)
...     energy -= tmp * pbits[i]
...
>>> # loop-free version
>>> energyB = -(0.5 * torch.einsum ('ij, i, j ->', J, pbits, pbits) + torch.einsum ('i, i ->', h, pbits))
>>>
>>> energy
tensor(4.5874)
>>> energyB
tensor(4.5874)

Best.

K. Frank

LOGUNIVPM · July 31, 2024, 4:04pm

Thank you! Indeed I was approaching that kind of solution with matmul e dot, I didn’t know einsum, very useful…
The solution works and the network backpropagates effectively.

Since you’ve been so kind, I’m going to bother you with another similar problem
This time the gradient is harder to compute, since it uses a sign function (which I replaced with a tanh and a strong multiplying factor) and conditionals inside a for loop.

The original code is the following:

    loss = 0
    pbits_variables = self.pbits[self.index_variables]
    for i, clause in enumerate(self.clauses):
        truth_value = 0
        for variable in clause:
            if pbits_variables[np.abs(variable) - 1] * np.sign(variable) > 0:
                truth_value = 1
                break
        self.loss += (1 - truth_value)

clauses are a list of rows of numbers (inside a list), therefore for each number in a clause the if checks whether the > 0 condition is verified, and in that case, it breaks and says that truth_value = 1. I can do the same checking the condition on the array and using clamp, so that the truth_value is 1 if at least one element in the array is > 0.

I have devised this code:

    pbits_variables = pbits[:,self.index_variables]
    cost = torch.zeros(1, requires_grad=True)
    for i, c in enumerate(self.clauses):
        truth_value = torch.zeros(1, requires_grad=True)
        clause = torch.from_numpy(np.array(c))
        v = pbits_variables[:,torch.abs(clause) - 1] * nn.functional.tanh(DENORM_BETA*clause)
        truth_value = torch.clamp(torch.sum(v > 0), max=1)
        cost += (1 - truth_value)

I’ve probably missed something since the last line bumps this error message.

  RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

AlphaBetaGamma96 · July 31, 2024, 4:15pm

You shouldn’t have any in-place operations within your function, so remove any += or *= etc. and see if that resolves the issue.

LOGUNIVPM · August 1, 2024, 8:10am

You are right, just using cost = cost + … instead of += is the way to go.
I am facing convergence issues but that is probably related to the nature of the problem. I can see a nonzero gradient flow through the network, therefore the loss as I coded it should be able to compute the gradient automatically. This seems to me quite sound… (PS: the plot is done according to this old discussion Check gradient flow in network - #7 by RoshanRane)