Shape mismatch between logits and labels for computing loss on 3D data in GCN

RuntimeError: Expected target size [25, 10], got [25].

I have sensor data and want to perform node-level classification using a GCN. I have k nodes (sensors). Each node has m observations (node features), and each observation is an n-dimensional vector. The dataset has shape (k, m, n), so it is 3D. I used DGL to convert the data to graph form.

from itertools import combinations

import dgl
import numpy as np
import torch

k = 25
m = 2000
n = 6
num_classes = 10

node_features = torch.randn((k, m, n))  # Generate node features
node_labels = torch.arange(k) % num_classes  # One label per node, kept within [0, num_classes-1]

# Create a graph with one edge per unordered pair of nodes
nodes = list(range(k))
edges = [[node1, node2] for node1, node2 in combinations(nodes, 2)]
a = np.array(edges)
src = a[:, 0]
dst = a[:, 1]
g = dgl.graph((src, dst))  # Create graph

g.ndata['features'] = node_features  # Assign node features
g.ndata['labels'] = node_labels  # Assign labels

The code above creates a graph with the following structure:

Graph(num_nodes=25, num_edges=300,
ndata_schemes={'features': Scheme(shape=(2000, 6), dtype=torch.float32), 'labels': Scheme(shape=(), dtype=torch.int64)}

Next, I build the GCN model.

import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import GraphConv

class GCN(nn.Module):
    def __init__(self, in_feats, h_feats, num_classes):
        super(GCN, self).__init__()
        self.conv1 = GraphConv(in_feats, h_feats)
        self.conv2 = GraphConv(h_feats, num_classes)

    def forward(self, g, in_feat):
        h = self.conv1(g, in_feat)
        h = F.relu(h)
        h = self.conv2(g, h)
        return h

The training code is:

num_epochs = 50 
hidden_size = 64
features = g.ndata["features"]
labels = g.ndata["labels"]

model = GCN(n, hidden_size , num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

# Train model
for epoch in range(num_epochs):
    logits = model(g, features)
    loss = criterion(logits, labels)  # The RuntimeError is raised here
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch + 1}, loss: {loss.item()}")

The following error occurs on the line loss = criterion(logits, labels):

RuntimeError: Expected target size [25, 10], got [25].

How can I solve this problem? If I pass 2-dimensional data to this model, it works fine, but my data is 3-dimensional.

Note: the shape of logits is torch.Size([k, m, num_classes]), while the shape of labels is torch.Size([k]).

nn.CrossEntropyLoss expects a model output containing class logits in the shape [batch_size, nb_classes, *] and a target in the shape [batch_size, *] containing class indices in the range [0, nb_classes-1], where the * denotes additional dimensions.
Your current output in the shape [k, m, n] indicates you are working with a batch size of k, m classes, and a sequence length of n.
In this case, the target should contain a label for each time step of the sequence n and thus should have a shape of [k, n].
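
To make the shape rule concrete, here is a minimal sketch that reproduces the error with random tensors only (no DGL). It assumes the logits shape [k, m, num_classes] given in the question's note; the variable error_message is just an illustrative name. Because dimension 1 is read as the class dimension, the loss treats m = 2000 as the number of classes and expects a target of shape [k, num_classes] = [25, 10].

```python
import torch
import torch.nn as nn

k, m, num_classes = 25, 2000, 10

# Stand-in for the model output described in the question: [k, m, num_classes]
logits = torch.randn(k, m, num_classes)
# One label per node, shape [k]
labels = torch.arange(k) % num_classes

criterion = nn.CrossEntropyLoss()

error_message = None
try:
    # dim 1 (size m) is interpreted as the class dimension, so the target
    # is expected to have shape [k, num_classes], not [k]
    criterion(logits, labels)
except (RuntimeError, ValueError) as err:  # exact exception type may vary by version
    error_message = str(err)

print(error_message)
```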

Sorry, I made a mistake earlier and have updated my question. The shape of logits is actually [k, m, num_classes].

In that case you would have to permute the output so that its dimensions correspond to [batch_size, nb_classes, *], with the target in the shape [batch_size, *].

Can you please provide a few lines of code showing how to do that? I am new to GNNs, and this is the first time I am dealing with 3D data.

Sure, here is an example:

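import torch
import torch.nn as nn

batch_size = 2
nb_classes = 3
seq_len = 4
logits = torch.randn(batch_size, seq_len, nb_classes, requires_grad=True)
# torch.randint excludes the upper bound, so this samples class indices in [0, nb_classes-1]
targets = torch.randint(0, nb_classes, (batch_size, seq_len))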

criterion = nn.CrossEntropyLoss()
logits = logits.permute(0, 2, 1)
# torch.Size([2, 3, 4]) # corresponds to [batch_size, nb_classes, seq_len]

loss = criterion(logits, targets)

This works for randomly generated labels. If my label vector is [0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4] (size = k = 25), how do I convert that static vector to the required shape of (batch_size, seq_len)?

Before trying to convert the target to another format, you should think about what your output represents and what the corresponding targets look like.
As previously described, your model output indicates a temporal dimension (or in a more general sense: an additional dimension). Let’s call it temporal dimension for the sake of this example:
your model thus contains logits for each of the classes for each sample and each timestep.
The corresponding target should thus contain class indices for each sample in the current batch and for each timestep.
If your target only contains labels for each sample in the batch, your model output is most likely wrong.
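
If it did make sense for every observation to inherit its node's label, the static label vector could be repeated along the extra dimension. This is only a sketch under that assumption (it may well not be what the problem calls for); the random logits stand in for the model output, and targets is an illustrative name.

```python
import torch
import torch.nn as nn

k, m, num_classes = 25, 2000, 10

# Stand-in for the model output: [k, m, num_classes]
logits = torch.randn(k, m, num_classes, requires_grad=True)
# Static per-node labels: 0..9, 0..9, 0..4
node_labels = torch.arange(k) % num_classes

# Assumption: each of the m observations carries its node's label
targets = node_labels.unsqueeze(1).repeat(1, m)  # shape [k, m]

criterion = nn.CrossEntropyLoss()
# Move the class dimension to position 1: [k, num_classes, m]
loss = criterion(logits.permute(0, 2, 1), targets)
print(targets.shape, loss.item())
```

This trains the model to classify every observation separately, which is a different objective from predicting a single label per node.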

I want to predict a label for each node in the graph; it is a node-level classification problem. There are k = 25 nodes, so the size of labels is also 25. Each node contains m = 2000 samples, and each sample has n = 6 dimensions. It looks like the model produces a prediction for each sample instead of one for each node. Is it possible to modify the GCN model so that it returns one prediction per node? In that case there would be no need to change the shape of the labels.

Yes, that’s correct. The output tensor from the GCN model contains logits for each of the classes for each observation at each node in the graph, across a temporal dimension. In other words, it produces a sequence of logits over time for each node.

I need to modify the model so that it produces logits for each class at each node in the graph, rather than for each class at each observation (sample) at each node. But I have no idea how to modify the model in that way.

I’m not familiar with your model or with GCNs in general, so I don’t know which part of the model should be changed. However, a simple approach would be to reduce the temporal dimension, e.g. via torch.mean, or to use only the last “time step” of the sequence of predictions.
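
A minimal sketch of both reductions, using a random tensor as a stand-in for the model output of shape [k, m, num_classes] (the names per_observation_logits, node_logits_mean and node_logits_last are illustrative, not from the original code):

```python
import torch
import torch.nn as nn

k, m, num_classes = 25, 2000, 10

# Stand-in for model(g, features): one set of logits per observation per node
per_observation_logits = torch.randn(k, m, num_classes, requires_grad=True)
# One label per node, shape [k]
labels = torch.arange(k) % num_classes

# Option 1: average over the observation dimension -> [k, num_classes]
node_logits_mean = per_observation_logits.mean(dim=1)
# Option 2: keep only the last observation -> [k, num_classes]
node_logits_last = per_observation_logits[:, -1, :]

criterion = nn.CrossEntropyLoss()
loss = criterion(node_logits_mean, labels)  # shapes now match: [k, C] vs [k]
loss.backward()
print(node_logits_mean.shape, node_logits_last.shape, loss.item())
```

In the GCN from the question, the equivalent change would presumably be to return h.mean(dim=1) at the end of forward. Whether averaging (or taking the last observation) is a meaningful summary of a sensor's readings is a modeling decision.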