Hi,
Feature_matrix = torch.Size([44, 156])
Adjacency_matrix = torch.Size([44, 44])
Btw, I checked all the inputs and outputs: the adjacency matrix wasn't on the cuda:0 device, so I moved it there and that fixed the error. My current problem is that the GPU computation is slower than my CPU, even though I have tried various batch sizes.
Please see my training step:
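For reference, this is roughly how I time CPU vs GPU (the model/input names are just placeholders for my own setup). CUDA kernel launches are asynchronous, so without `torch.cuda.synchronize()` the GPU numbers would be meaningless:

```python
import time
import torch

def timed_forward(model, features, adj_norm, device, iters=100):
    """Average per-iteration forward time on the given device."""
    model = model.to(device)
    features = features.to(device)
    adj_norm = adj_norm.to(device)
    # Make sure pending GPU work is finished before starting the clock
    if device.type == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        model(features, adj_norm)
    # Wait for all launched kernels to complete before stopping the clock
    if device.type == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters
```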
```python
for epoch in range(num_epoch):
    model.train()
    for (g, features) in train_data_loader:
        adj = g.adjacency_matrix(transpose=False)
        adj = sp.coo_matrix(adj.to_dense())
        n_nodes, feat_dim = features.shape
        nodes = list(g.nodes())

        # Weights against class imbalance (far more non-edges than edges)
        adj_norm = preprocess_graph(adj).to(device)
        adj_label = adj + sp.eye(adj.shape[0])
        adj_label = torch.FloatTensor(adj_label.toarray()).to(device)
        pos_weight = float(adj.shape[0] * adj.shape[0] - adj.sum()) / adj.sum()
        pos_weight = torch.tensor(pos_weight).to(device)
        norm = adj.shape[0] * adj.shape[0] / float((adj.shape[0] * adj.shape[0] - adj.sum()) * 2)

        print(features.device, adj_norm.device, adj_label.device)  # sanity check
        recovered, mu, logvar = model(features, adj_norm)
        loss = loss_function(recovered, adj_label, mu, logvar, n_nodes, norm, pos_weight)

        optimizer.zero_grad()
        loss.backward()
        cur_loss = loss.item()
        optimizer.step()
```
Additionally, all inputs are on cuda:0, yet the CPU still does most of the computation. Am I missing anything?
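One thing I noticed while debugging: every iteration goes through scipy (`sp.coo_matrix`, `preprocess_graph`, `toarray()`), and all of that runs on the CPU regardless of where the tensors live. As a sketch (this is not my `preprocess_graph`, just a torch-only equivalent of the usual symmetric normalization D^{-1/2}(A+I)D^{-1/2}), the same step could stay entirely on-device:

```python
import torch

def normalize_adj(adj: torch.Tensor) -> torch.Tensor:
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, done in pure
    torch so it runs on whatever device `adj` is already on."""
    a = adj + torch.eye(adj.size(0), device=adj.device)
    deg = a.sum(dim=1)
    d_inv_sqrt = deg.pow(-0.5)
    # Defensive: guard against isolated nodes producing inf
    d_inv_sqrt[torch.isinf(d_inv_sqrt)] = 0.0
    return d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)
```

Also, with only 44 nodes the per-kernel launch overhead on the GPU can easily exceed the actual compute, so maybe CPU being faster is expected at this graph size?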
Thank you