I am trying to use TransformerConv to train a model that takes in graphs of different sizes, for a graph classification task. My code snippet is as follows:

```
self.conv1 = TransformerConv(in_channels=-1, out_channels=256)

def forward(self, x_onein, x_twoin, lambd):
    x_one, edge_index_one, batch_one = x_onein.x, x_onein.edge_index, x_onein.batch
    x_one = self.conv1(x_one, edge_index_one)
    return x_one
```

My input data come in batches of various sizes, and each graph has anywhere from tens of nodes to about 1,200 nodes. Because of this, the first input layer must be able to take in graphs of different sizes.

When I run the code, I get the error:

```
RuntimeError: Trying to create tensor with negative dimension -1: [256, -1]
```

This is strange since I followed the documentation:

**in_channels** (*int* *or* *tuple*) – Size of each input sample, or `-1` to derive the size from the first input(s) to the forward method. A tuple corresponds to the sizes of source and target dimensionalities.

Any advice on this matter would be greatly appreciated.