I would like to minimize the following cost function

```
class SelfSupervLoss(nn.Module):
    def __init__(self):
        super(SelfSupervLoss, self).__init__()

    def forward(self, x, edge_index):
        src, dest = edge_index
        # average of element-wise products of source and destination features
        return (x[src] * x[dest]).mean()

criterion = SelfSupervLoss()
```
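For clarity, here is a minimal standalone reproduction of what the loss computes, on toy tensors (the shapes and values are made up for illustration, not my real data): the forward pass averages the element-wise products of source and destination node features over all edges.

```python
import torch
import torch.nn as nn

class SelfSupervLoss(nn.Module):
    def forward(self, x, edge_index):
        src, dest = edge_index
        return (x[src] * x[dest]).mean()

# toy node features: 3 nodes, 2 features each
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# two edges: 0 -> 1 and 1 -> 2
edge_index = torch.tensor([[0, 1], [1, 2]])

loss = SelfSupervLoss()(x, edge_index)
# mean of [1*3, 2*4, 3*5, 4*6] = (3 + 8 + 15 + 24) / 4 = 12.5
print(loss.item())
```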

and I want to use SGD for this purpose:

```
from torch.optim import SGD
optimizer = SGD(model.parameters(), lr=0.01)
```

The only parameters of my `model` are:

```
self.hv = torch.nn.Parameter(data.x[data.edge_index[0]]) # argument is a Tensor with the shape [472, 6]
self.hu = torch.nn.Parameter(data.x[data.edge_index[1]]) # argument is a Tensor with the shape [472, 6]
```

which I declare in the subclass of `torch.nn.Module`

(If needed I can share the whole model)

The training loop is as follows:

```
def train():
    optimizer.zero_grad()
    out = model(data.x, data.edge_index, deg)
    loss = criterion(out, data.edge_index)
    loss.backward(retain_graph=True)
    optimizer.step()
    return loss
```
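For reference, this is the kind of minimal check I would expect to pass, showing that SGD does update an `nn.Parameter` when the loss depends on it (the names and shapes here are hypothetical stand-ins, not my actual model):

```python
import torch
from torch.optim import SGD

# hypothetical stand-in for one of the model's parameters
p = torch.nn.Parameter(torch.ones(4, 2))
optimizer = SGD([p], lr=0.01)

before = p.detach().clone()

optimizer.zero_grad()
loss = (p * p).mean()  # any differentiable function of p
loss.backward()
optimizer.step()

# True if the SGD step changed the parameter values
changed = not torch.equal(before, p.detach())
print(changed)
```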

But the loss stays at roughly the same level for 200+ epochs:

```
Epoch: 0, Loss: 0.1043
Epoch: 1, Loss: 0.1087
Epoch: 2, Loss: 0.0914
Epoch: 3, Loss: 0.1007
Epoch: 4, Loss: 0.0994
...
```

I do not understand what I could be doing wrong. Maybe the declaration of the parameters? I am not sure… Does anybody know?

Thank you!