I am getting the following error; I found the same error reported on GitHub, but it doesn’t seem to have been solved. This is the traceback:
Traceback (most recent call last):
  File "/path/to/run.py", line 380, in <module>
    loss = train()
  File "/path/to/run.py", line 64, in train
    out = model(data).view(-1)
  File "/n/scratch3/users/v/vym1/nn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/path/to/WLGNN.py", line 114, in forward
    xs += [torch.tanh(conv(xs[-1], edge_index, edge_weight))]
  File "/n/scratch3/users/v/vym1/nn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/n/scratch3/users/v/vym1/nn/lib/python3.7/site-packages/torch_geometric/nn/conv/gcn_conv.py", line 169, in forward
    size=None)
  File "/n/scratch3/users/v/vym1/nn/lib/python3.7/site-packages/torch_geometric/nn/conv/message_passing.py", line 236, in propagate
    out = self.message(**msg_kwargs)
  File "/n/scratch3/users/v/vym1/nn/lib/python3.7/site-packages/torch_geometric/nn/conv/gcn_conv.py", line 177, in message
    return edge_weight.view(-1, 1) * x_j
RuntimeError: CUDA out of memory. Tried to allocate 3.37 GiB (GPU 0; 11.17 GiB total capacity; 5.35 GiB already allocated; 1.46 GiB free; 9.29 GiB reserved in total by PyTorch)
I have 100 GB of memory allocated to this job, and it isn’t clear to me why PyTorch can’t allocate 3.37 GiB when, according to the error message, it has only allocated a small fraction of the total.
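To compare the numbers in the error message against what the device actually reports, I have been using the standard `torch.cuda` memory queries (`memory_allocated`, `memory_reserved`, `get_device_properties`). This is just a small sanity-check sketch, not part of my training script; the `gib` helper is my own:

```python
import torch

def gib(n_bytes):
    """Convert a byte count to GiB, the unit used in the OOM message."""
    return n_bytes / 1024**3

if torch.cuda.is_available():
    # Total VRAM on GPU 0 ("total capacity" in the error message).
    total = torch.cuda.get_device_properties(0).total_memory
    # Memory currently occupied by live tensors ("already allocated").
    allocated = torch.cuda.memory_allocated(0)
    # Memory held by PyTorch's caching allocator ("reserved in total by PyTorch").
    reserved = torch.cuda.memory_reserved(0)
    print(f"total: {gib(total):.2f} GiB, "
          f"allocated: {gib(allocated):.2f} GiB, "
          f"reserved: {gib(reserved):.2f} GiB")
else:
    print("CUDA not available")
```

Note that this only reflects GPU memory; it says nothing about the 100 GB of host RAM allocated to the job.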