Understanding of merging two modules

Hello. I have a small problem, and because of my rather limited experience with Torch I can’t understand what I’m doing wrong.

I have a graph neural network implemented as a Module. To train the model, I need to give it examples and an adjacency matrix. So I trained the network and saved its internal parameters.

Next I create another nn.Module class that has the adjacency matrix as an nn.Parameter. Within the forward method of this class I do some work on the adjacency matrix parameter and pass it to the already-trained GNN together with one fixed example.
I only want to learn the adjacency matrix parameter in the second class, so I turn off requires_grad
for all parameters in the GNN, keeping only the adjacency one. Unfortunately, an error occurs when it tries to compute the loss function. It seems that the adjacency matrix parameter doesn’t participate in the computation.
How can I fix this?

Best regards.

Here is the code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class SecondClass(nn.Module):
    def __init__(self, adj):
        super(SecondClass, self).__init__()
        # trainable mask with the same shape as the adjacency matrix
        self.M = torch.Tensor(*adj.size()).uniform_(0, 0.01)
        self.M = self.M.cuda()
        self.M = nn.Parameter(self.M)
        
    def forward(self, model, features, adj):
        # modulate the fixed adjacency matrix with the learnable mask
        adj__ = adj.mul(torch.sigmoid(self.M))
        output = model(features, adj__)
        return output

def loss(outputs, labels):
    # The loss computes only for one fixed sample during all training process
    output = outputs[0,:]
    output = output.squeeze(0)

    label  = labels[0]
    output = output[label.data.item()]
    output = -output
    return output


# adj is a matrix of shape (2708, 2708), features has shape (2708, 1433)
adj, features  = load_data()
# restoring the gnn model
gnnmodel = GNNModel(init_params)
gnnmodel.load_state_dict(torch.load('{}.pkl'.format(571)))

sec_model = SecondClass(adj)
# freeze the GNN parameters so that only the SecondClass parameter is trained
for param in gnnmodel.parameters():
    param.requires_grad = False

for param in sec_model.parameters():
    param.requires_grad = True

optimizer = optim.Adam(sec_model.parameters(), 
                       lr=0.00001, 
                       weight_decay=5e-4)

sec_model.train()
optimizer.zero_grad()

for i in range(100):
    # give the SecondClass model the GNN model itself, the features, and the adjacency matrix from which the parameter is formed
    output = sec_model(gnnmodel, features, adj)
    # Here idx_train is an index array selecting the training nodes.
    loss_train = loss(output[idx_train], labels[idx_train])
    # The following error is raised here:
    # RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
    loss_train.backward() 
    optimizer.step()
    print("Loss:{}".format(loss_train.data.item()))

Hi,

Before answering your question: in PyTorch, the gradients are not automatically reset to 0, so you need to do it yourself. Given your code, it should be done just before you call loss_train.backward().
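
For example (a sketch reusing the variable names from your code), the loop would become:

for i in range(100):
    output = sec_model(gnnmodel, features, adj)
    loss_train = loss(output[idx_train], labels[idx_train])
    optimizer.zero_grad()   # clear the gradients accumulated in the previous iteration
    loss_train.backward()
    optimizer.step()
    print("Loss:{}".format(loss_train.data.item()))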

I’m confused how the code you have works.
adj is defined as an empty list.
It is passed to SecondClass, which tries to extract some size information from it. How does that not crash?

Oh, sorry. adj and features get their values from the function that loads the data; I have added it to the code. So adj is a square matrix that defines the graph structure, and features is a variable that contains information about the features of every node.

The idea here is that I need to replace adj with a trainable parameter, so I wrap the model in SecondClass, where I create such a parameter and pass it to the model. I only need to train the parameter in SecondClass while freezing the parameters of GNNModel.
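
For reference, here is how I check that only the mask parameter is trainable (a quick check with the same variables as in the code above):

trainable = [name for name, p in sec_model.named_parameters() if p.requires_grad]
print(trainable)                                             # prints ['M']
print(any(p.requires_grad for p in gnnmodel.parameters()))   # prints False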

Thank you for answering.

The code looks good.
Can you share the exact error message and stack trace?
Also make sure that the model uses adj__ in a differentiable manner.
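
For example, a quick way to see where the graph gets cut (a sketch reusing your variable names):

out = sec_model(gnnmodel, features, adj)
print(sec_model.M.requires_grad)   # should be True
print(out.requires_grad)           # False means the autograd graph was cut somewhere
print(out.grad_fn)                 # None here leads exactly to the RuntimeError on backward()

# Common culprits inside GNNModel.forward: .detach(), .data, .cpu().numpy(),
# or rebuilding the adjacency with torch.Tensor(...) / torch.tensor(...),
# all of which disconnect adj__ from the computation graph.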