PyTorch Geometric -- how to make your own dataset of multiple graphs?

I have a list of Data objects that are all independent graphs. How do I make a dataset out of this, so that it behaves like the built-in datasets in the tutorial here? I have tried the tutorial on making your own dataset, but I have absolutely no idea how to make sense of it (note that I am experienced with PyTorch but not so much with custom datasets; usually they are not needed).

All I want to do is make this list of Data objects a dataset so that it has the same functionality as in the tutorial, but this seems impossible to do. If someone could help I would be extremely grateful, thanks.


I am also looking for this answer. Most online sources only cover the built-in datasets, but to apply a GNN to our own problems we need to build our own data. If you have found the answer, please share. Thanks.
I think the following method from the PyG docs will work, but it is not an efficient way to do it:

from torch_geometric.data import Data
from torch_geometric.loader import DataLoader

data_list = [Data(...), ..., Data(...)]
loader = DataLoader(data_list, batch_size=32)

Here, each Data object should be created in a loop.


Check out these two files:

I first did the preprocessing and stored the .pt files in the processed directory. Later, I load them via NetlistGraphDataset. This is flexible if one wants to create different train/test splits over different graph objects.
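As a rough sketch of why per-graph files make splitting easy (this is not the NetlistGraphDataset code itself, and the file names below are hypothetical), the split can be done on the list of file paths before any graph is loaded:

```python
import random


def split_paths(paths, test_fraction=0.2, seed=0):
    """Hypothetical helper: deterministic train/test split over a list of file paths."""
    paths = sorted(paths)        # fixed starting order, so the seed fully determines the split
    rng = random.Random(seed)    # local RNG, leaves the global random state untouched
    rng.shuffle(paths)
    n_test = int(round(len(paths) * test_fraction))
    return paths[n_test:], paths[:n_test]


# Assumed layout: one processed graph per file.
all_paths = [f"processed/graph_{i}.pt" for i in range(10)]
train_paths, test_paths = split_paths(all_paths, test_fraction=0.2, seed=42)

print(len(train_paths), len(test_paths))
```

Each list of paths can then be handed to its own dataset instance, and changing the seed or the fraction gives a different split without redoing the preprocessing.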