Hi,
I have a large HeteroData object like this:
HeteroData(
  mol={ x=[876, 384] },
  gene={ x=[18211, 18211] },
  (mol, perts, gene)={
    edge_index=[2, 11181554],
    edge_label=[11181554],
  },
  (gene, rev_perts, mol)={
    edge_index=[2, 11181554],
    edge_label=[11181554],
  }
)
Can someone guide me on how to create mini-batches for training?
Here is what I tried:
a. Split the data into train, val, and test sets using RandomLinkSplit.
b. Created an HGTLoader from the training split like so:
from torch_geometric.loader import HGTLoader

train_loader = HGTLoader(
    train_data,
    # Sample 16 nodes per type and per iteration for 2 iterations
    num_samples={key: [16] * 2 for key in train_data.node_types},
    # Use a batch size of 16 for sampling training nodes of type mol
    batch_size=16,
    input_nodes=('mol', None),
)
c. Then, in the training loop, I iterate with "for t in train_loader:",
but the model throws an error:
Input In [12], in EdgeDecoder.forward(self, z_dict, edge_label_index)
     19 def forward(self, z_dict, edge_label_index):
     20     row, col = edge_label_index
---> 21     z = torch.cat([z_dict['mol'][row], z_dict['gene'][col]], dim=-1)
     23     z = self.lin1(z).relu()
     24     z = self.lin2(z)
IndexError: index 859 is out of bounds for dimension 0 with size 32
It looks like the edge_label_index I pass refers to node indices that are not in the sampled batch: index 859 is requested, but z_dict['mol'] only has 32 rows.
How can I make sure I pass the correct label indices for the current batch?
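To make the mismatch concrete, here is a minimal sketch in plain Python of what I think is happening. The node IDs and the helper to_local are made up for illustration; they are not part of PyG or my actual data. My assumption is that edge_label_index holds global node IDs, while z_dict['mol'] has one row per node sampled into the batch, so the IDs would need to be relabelled to batch-local positions before indexing:

```python
# Hypothetical global IDs of the 'mol' nodes sampled into one mini-batch.
sampled_mol_ids = [859, 12, 430, 7]

# Batch-local position of each sampled node (its row in z_dict['mol']).
global_to_local = {g: i for i, g in enumerate(sampled_mol_ids)}

# Hypothetical label edges, with the 'mol' endpoint given as a global ID.
global_rows = [859, 7, 300]


def to_local(rows, mapping):
    """Keep only rows whose node is in the batch and relabel them locally."""
    return [mapping[r] for r in rows if r in mapping]


local_rows = to_local(global_rows, global_to_local)
print(local_rows)  # [0, 3]

# Stand-in for z_dict['mol']: one embedding row per sampled node.
z_mol = [[0.0] * 4 for _ in sampled_mol_ids]

# Indexing with a global ID fails in the same way as my traceback.
try:
    z_mol[global_rows[0]]
except IndexError as err:
    print("IndexError:", err)
```

If this reading is right, the decoder needs label indices expressed in batch-local positions, and that is the part I don't know how to get from the loader.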
QUESTIONS:
a. Am I using HGTLoader correctly, and is HGTLoader the right loader to use here?
b. Am I iterating through the batches correctly?
@ptrblck et al