What is an efficient way to do a forward pass over a batch of data for a Graph Neural Network (GNN)?

I want to do a forward pass for a GNN (or tree NN). I know that in vision one can process the entire batch in one go by making the batch a dimension of the tensor. Is something like this possible for GNNs, where each example can have a different number of nodes and edges? I am worried that running a separate forward pass for each example will slow things down, but perhaps there is no other way.
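To make the question concrete, here is a minimal sketch of the one idea I have come across: treat the whole batch as a single large graph made of disconnected components, shift each graph's node indices by an offset, and keep a `batch` vector recording which graph every node came from. I am not sure this is the recommended way. The helper names (`collate_graphs`, `message_passing_step`, `sum_pool`) are hypothetical, and plain Python lists stand in for tensors so the snippet runs without dependencies; in PyTorch I assume the same index arithmetic would be done with `torch.cat` and `index_add_`.

```python
# Sketch only: hypothetical helper names, plain lists standing in for tensors.

def collate_graphs(graphs):
    """Merge several graphs into one big disconnected graph.

    Each graph is a dict with "x" (a list of per-node feature lists) and
    "edges" (a list of (src, dst) pairs using indices local to that graph).
    """
    x, edges, batch = [], [], []
    offset = 0
    for graph_id, g in enumerate(graphs):
        x.extend(g["x"])
        # Shift local node indices so they stay unique in the merged graph.
        edges.extend((s + offset, d + offset) for s, d in g["edges"])
        # Remember which graph each node belongs to.
        batch.extend([graph_id] * len(g["x"]))
        offset += len(g["x"])
    return x, edges, batch


def message_passing_step(x, edges):
    """One round of sum aggregation: each node adds its in-neighbours' features to its own."""
    out = [list(h) for h in x]
    for s, d in edges:
        for k, v in enumerate(x[s]):
            out[d][k] += v
    return out


def sum_pool(x, batch, num_graphs):
    """Sum node features per graph to get one vector per example."""
    dim = len(x[0])
    pooled = [[0.0] * dim for _ in range(num_graphs)]
    for h, g in zip(x, batch):
        for k, v in enumerate(h):
            pooled[g][k] += v
    return pooled


graphs = [
    {"x": [[1.0], [2.0], [3.0]], "edges": [(0, 1), (1, 2)]},  # path 0 -> 1 -> 2
    {"x": [[10.0], [20.0]], "edges": [(0, 1), (1, 0)]},       # 2-cycle
]
x, edges, batch = collate_graphs(graphs)
h = message_passing_step(x, edges)           # a single pass covers both graphs
readout = sum_pool(h, batch, num_graphs=len(graphs))
print(edges)    # [(0, 1), (1, 2), (3, 4), (4, 3)]
print(batch)    # [0, 0, 0, 1, 1]
print(readout)  # [[9.0], [60.0]]
```

Because no edge crosses between components, messages never leak from one example to another, so the single pass should give the same result as looping over the graphs one at a time.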

I assume multithreading might be possible, but I don’t know how that would work with multiple threads or processes sharing the same GPU, and debugging sounds like a nightmare. Since GNN research exists, I am curious what the standard practice is.

These might be worth looking into:

though I don’t know which one is the standard or recommended PyTorch way to process graph data (or to implement the forward pass). This feels like something I shouldn’t need to implement from scratch.

Related: What is the standard way to batch and do a forward pass through tree structured data (ASTs) in pytorch so to leverage the power of GPUs? (Stack Overflow)