# Pre-calculated distance matrix in Dataset

I’m working on a graph neural network with 3D position vectors R as nodes and the scalar distances D between them as edges, and I want to take derivatives of the network output with respect to the initial positions R. Here is a minimal working example that does not involve any neural network.

```python
import torch

R = torch.rand(10, 5, 3, requires_grad=True)  # Position vectors, (n_data, n_point, 3)
D = torch.norm(R.unsqueeze(2) - R.unsqueeze(1), dim=-1)  # Distance matrix, (n_data, n_point, n_point)
# Reduce D to a scalar (here by summing) and differentiate with respect to R
gradient = torch.autograd.grad(D.sum(), R)[0]
print(gradient)  # Tensor with shape (10, 5, 3)
```
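For context, here is a rough sketch of the same gradient with a model sitting between D and the output. The `model` below is a made-up stand-in (a small MLP on the flattened distance matrix), not the actual graph network, and the layer sizes are arbitrary.

```python
import torch
from torch import nn

R = torch.rand(10, 5, 3, requires_grad=True)  # Position vectors, (n_data, n_point, 3)
D = torch.norm(R.unsqueeze(2) - R.unsqueeze(1), dim=-1)  # Distance matrix, (n_data, n_point, n_point)

# Hypothetical stand-in for the network: flatten each 5x5 distance matrix, then a small MLP
model = nn.Sequential(nn.Flatten(), nn.Linear(25, 16), nn.Tanh(), nn.Linear(16, 1))

out = model(D)  # One scalar output per sample, shape (10, 1)
# Gradient of the summed outputs with respect to the positions
grad_R = torch.autograd.grad(out.sum(), R)[0]
print(grad_R.shape)
```

This works because D is computed from R inside the same graph that produces `out`.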

Since the distance matrix D does not change during training, I want to pre-calculate it once when I create the PyTorch Dataset, rather than recalculating it every time I load a batch. However, autograd doesn’t let me do that, presumably because the batched R returned by the Dataset is a different tensor from the original R that the graph of D was built from.
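A small sketch of what I suspect is happening, without the Dataset. The slice size of 4 is arbitrary, and the names `batch_R`, `batch_D`, `grad_full` and `failed` are only for illustration.

```python
import torch

R = torch.rand(10, 5, 3, requires_grad=True)  # Position vectors, (n_data, n_point, 3)
D = torch.norm(R.unsqueeze(2) - R.unsqueeze(1), dim=-1)  # Pre-calculated from the full R

batch_R = R[:4]  # New tensor produced by an indexing op on R
batch_D = D[:4]  # Depends on R, but batch_R is not part of its graph

# Differentiating with respect to the original leaf R works
grad_full = torch.autograd.grad(batch_D.sum(), R, retain_graph=True)[0]

# Differentiating with respect to batch_R fails: it was never used to compute batch_D
try:
    torch.autograd.grad(batch_D.sum(), batch_R)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)
```

As far as I can tell, the Dataset below has the same structure: the returned slice of R is not the tensor that D was computed from.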

```python
import torch
from torch.utils.data import Dataset

class BatchDataset(Dataset):
    def __init__(self, R):
        # Copy R into a new leaf tensor that tracks gradients
        # (same result as torch.tensor(R, requires_grad=True), without the copy-construct warning)
        self.R = R.clone().detach().requires_grad_(True)  # Position vectors, (n_data, n_point, 3)
        self.D = torch.norm(self.R.unsqueeze(2) - self.R.unsqueeze(1), dim=-1)  # Distance matrix, (n_data, n_point, n_point)

    def __getitem__(self, index):
        return self.R[index], self.D[index]

    def __len__(self):
        return self.R.size(0)

R = torch.rand(10, 5, 3, requires_grad=True)  # Position vectors, (n_data, n_point, 3)
batch_R, batch_D = BatchDataset(R)[:]