Problem with scatter_

Hi all,

I want to turn a sequence representation into a node representation of size (batch_size, num_nodes, hidden_size) for a graph model. I use scatter_ to write the sequence embedding of each node id into the corresponding position of the node representation, and the resulting node_embedding is then used for graph-level prediction. However, the model does not seem to learn: the accuracy never changes during training.

Here is the code:

import torch
batch_size = 2
num_nodes = 9
embedding_dim = 4
hidden_size = 4

id_embedding = torch.nn.Embedding(num_embeddings=num_nodes + 1, embedding_dim=embedding_dim, padding_idx=0)  # +1 for padding index 0

# input batch of node-id sequences, padded with 0
batch_sequence = torch.LongTensor([
    [5, 6, 7, 8, 9],  # length 5
    [1, 2, 3, 0, 0]   # length 3
])
lengths = torch.LongTensor([5, 3])

seq_model = torch.nn.GRU(embedding_dim, hidden_size, batch_first=True)

x = id_embedding(batch_sequence)

# run the GRU over the packed variable-length sequences
pack = torch.nn.utils.rnn.pack_padded_sequence(x, lengths, batch_first=True)
out, hn = seq_model(pack)
seq_embedding, _ = torch.nn.utils.rnn.pad_packed_sequence(out, batch_first=True)  # (batch_size, max_len, hidden_size)

# scatter each timestep's embedding into the node slot given by its id;
# padded timesteps (id 0) write zeros into the unused slot 0
node_embedding = torch.zeros(batch_size, num_nodes + 1, hidden_size)
node_embedding.scatter_(1, batch_sequence.unsqueeze(-1).expand(-1, -1, seq_embedding.size(-1)), seq_embedding)

The resulting node_embedding looks like this:

tensor([[[ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.2699,  0.3514,  0.2468,  0.0029],
         [ 0.4198,  0.2088,  0.2488,  0.1672],
         [-0.1704, -0.0910,  0.4044,  0.2269],
         [ 0.2420, -0.2462,  0.7400,  0.3019],
         [ 0.2985, -0.3831,  0.5072,  0.3748]],

        [[ 0.0000,  0.0000,  0.0000,  0.0000],
         [-0.2693,  0.2609, -0.1224,  0.1326],
         [ 0.0555,  0.2357, -0.1478,  0.2386],
         [ 0.4334,  0.4978, -0.2275,  0.1967],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000]]], grad_fn=<ScatterBackward0>)
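
For reference, the placement can be verified directly against seq_embedding on this toy example: each row of node_embedding indexed by a node id should equal the GRU output at the timestep where that id appears, and untouched slots should stay zero.

# node 5 of sample 0 was emitted at timestep 0; node 3 of sample 1 at timestep 2
assert torch.allclose(node_embedding[0, 5], seq_embedding[0, 0])
assert torch.allclose(node_embedding[1, 3], seq_embedding[1, 2])
# slots whose ids never appear in the sequence remain zero
assert torch.all(node_embedding[0, :5] == 0)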

Thanks in advance!

Hi,

This code looks good to me.
The gradient will flow back from node_embedding to seq_embedding and all the way through your model. Do you have specific observations that make you think this is an autograd problem?
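
If you want to double-check, you can backpropagate a dummy loss through node_embedding in your snippet and verify that the upstream parameters receive gradients. A minimal check, reusing the variables from your code (the sum() loss is just a stand-in):

# dummy scalar loss through the scattered tensor
loss = node_embedding.sum()
loss.backward()

# both upstream modules should now hold non-zero gradients
print(id_embedding.weight.grad.abs().sum())     # > 0: gradient reached the embedding table
print(seq_model.weight_ih_l0.grad.abs().sum())  # > 0: gradient reached the GRU

As an aside, the same placement can also be written with advanced indexing, node_embedding[torch.arange(batch_size).unsqueeze(1), batch_sequence] = seq_embedding, which is equivalent for this example and equally differentiable.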

Thanks for your reply!

I found that the accuracy behaves unusually, and I'm not sure what the underlying cause is. The test accuracy stays at exactly the same value for about 30 epochs while the training loss fluctuates, then jumps once and gets stuck again:

train batch loss: 0.6967262625694275: 100%|██████████| 282/282 [00:32<00:00,  8.73it/s]
test acc: 0.54924578527063: 100%|██████████| 71/71 [00:04<00:00, 16.44it/s]
train batch loss: 0.6877152919769287: 100%|██████████| 282/282 [00:30<00:00,  9.12it/s]
test acc: 0.54924578527063: 100%|██████████| 71/71 [00:04<00:00, 17.53it/s]
train batch loss: 0.6955962181091309: 100%|██████████| 282/282 [00:33<00:00,  8.53it/s]
test acc: 0.54924578527063: 100%|██████████| 71/71 [00:04<00:00, 17.29it/s]
[... 26 more epochs, test acc unchanged at 0.54924578527063 ...]
train batch loss: 0.6623131632804871: 100%|██████████| 282/282 [00:31<00:00,  9.09it/s]
test acc: 0.54924578527063: 100%|██████████| 71/71 [00:04<00:00, 17.44it/s]
train batch loss: 0.7028350234031677: 100%|██████████| 282/282 [00:31<00:00,  9.07it/s]
test acc: 0.4010647737355812: 100%|██████████| 71/71 [00:04<00:00, 17.61it/s]
[... 4 more epochs, test acc unchanged at 0.4010647737355812 ...]

It might also be that your model is simply not expressive enough to push the accuracy up.
But from a pure autograd point of view, I don't think there is any issue with using scatter_.
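
If the accuracy is truly frozen, I would first rule out a data or optimizer issue rather than autograd: check that optimizer.step() is actually called, that the labels are not constant, and try overfitting a single small batch. One quick diagnostic is to print per-parameter gradient norms after backward(); all-None or all-zero norms point at a detached graph. A rough sketch (report_grad_norms is a hypothetical helper, and model stands in for whatever nn.Module you train):

import torch

# hypothetical helper: print each parameter's gradient norm after backward()
def report_grad_norms(model: torch.nn.Module) -> None:
    for name, param in model.named_parameters():
        norm = None if param.grad is None else param.grad.norm().item()
        print(f"{name}: grad norm = {norm}")

# usage inside the training loop:
#   loss.backward()
#   report_grad_norms(model)  # inspect before optimizer.step()
#   optimizer.step()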