Hi all,
I’m trying to back-propagate through the following code snippet:
import torch

# Dummy initialization for the sake of the post
batch_size = 200
batch_indices = torch.randint(0, batch_size, (5000,))        # scatter needs an int64 index
node_hiddens = torch.randn((5000, 100), requires_grad=True)  # grads must flow back into these
# Actual code
mol_repr = torch.zeros((batch_size, 100))
mol_repr = mol_repr.scatter_reduce(
    dim=0,
    index=batch_indices.unsqueeze(1),  # (5000, 1)
    src=node_hiddens,                  # (5000, 100)
    reduce='mean'
)
return mol_repr
The idea is that I have a bunch of node_hiddens that I’ve already computed earlier, plus a batch_indices tensor telling me where the i-th row of node_hiddens should go in the final mol_repr; rows with the same index should be averaged together (hence reduce='mean').
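To make the intended behaviour concrete, here is a tiny standalone version of the reduction I’m after (the expand on the index is my guess at how the shapes are supposed to line up, so it may itself be where I’m going wrong):

import torch

# Three nodes, two molecules: rows 0 and 1 both belong to molecule 0.
idx = torch.tensor([0, 0, 1])                       # node -> molecule assignment
src = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])  # per-node hiddens
out = torch.zeros((2, 2)).scatter_reduce(
    dim=0,
    index=idx.unsqueeze(1).expand_as(src),  # (3, 2): one index per element
    src=src,
    reduce='mean',
    include_self=False,  # don't count the initial zeros in the average
)
print(out)  # tensor([[2., 3.], [5., 6.]])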
However, I get:
RuntimeError: Function ScatterReduceBackward0 returned an invalid gradient at index 1 - got [5000, 1] but expected shape compatible with [5000, 100]
As I have very little idea about autograd’s inner workings, I can’t wrap my head around this error. Could anyone help me? Thank you!
Separate question about the same code: the snippet above comes from a def forward(...) function. To my understanding, mol_repr doesn’t need requires_grad=True at initialization, since I don’t want to accumulate gradients on it for a parameter update (i.e. it’s not a parameter!). Is my guess correct?
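My mental model in miniature (hypothetical names, just to check my understanding):

import torch

w = torch.randn(3, requires_grad=True)  # a parameter: I want .grad accumulated here
x = torch.zeros(3)                      # created inside forward, like mol_repr
y = (x + w).sum()                       # x participates, gradient still flows via w
y.backward()
print(w.grad)  # tensor([1., 1., 1.]) -- gradient accumulated on the parameter
print(x.grad)  # None -- nothing accumulated on the buffer, which is what I want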