Hi!
I am new to PyTorch and have one task: loading data that I collected myself into PyTorch. I am working with the PyTorch Geometric extension library.
I am having trouble understanding the following code:
import os.path as osp

import torch
from torch_geometric.data import Data, Dataset


class MyOwnDataset(Dataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super(MyOwnDataset, self).__init__(root, transform, pre_transform)

    @property
    def raw_file_names(self):  # Point 1
        return ['some_file_1', 'some_file_2', ...]

    @property
    def processed_file_names(self):  # Point 2
        return ['data_1.pt', 'data_2.pt', ...]

    def download(self):  # Point 3
        # Download to `self.raw_dir`.
        pass

    def process(self):  # Point 4
        i = 0
        for raw_path in self.raw_paths:
            # Read data from `raw_path`.
            data = Data(...)

            if self.pre_filter is not None and not self.pre_filter(data):
                continue

            if self.pre_transform is not None:
                data = self.pre_transform(data)

            torch.save(data, osp.join(self.processed_dir, 'data_{}.pt'.format(i)))
            i += 1

    def len(self):
        return len(self.processed_file_names)

    def get(self, idx):
        data = torch.load(osp.join(self.processed_dir, 'data_{}.pt'.format(idx)))
        return data
I have marked the points I would like to discuss. As I understand it, Point 1 lists the names of the raw files I want to load, and Point 2 lists the names of the processed files I get back.
But what exactly are we doing at Point 3, Point 4 and in the methods after them? I have also tried to understand it by looking at similar source code, but I still do not get it.
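To make the question concrete, here is a minimal sketch of how I currently picture the flow. It is only my reading of the base class, so please correct me if it is wrong: download() seems to run only when one of the raw_file_names is missing from the raw folder, and process() only when one of the processed_file_names is missing from the processed folder. ToyDataset, its .txt files and the `calls` list are hypothetical stand-ins I made up for illustration; the sketch uses only the standard library and does not touch torch_geometric.

```python
import os
import tempfile


class ToyDataset:
    """Hypothetical stand-in that mimics my understanding of Dataset."""

    raw_file_names = ['some_file_1', 'some_file_2']
    processed_file_names = ['data_0.txt', 'data_1.txt']

    def __init__(self, root):
        self.raw_dir = os.path.join(root, 'raw')
        self.processed_dir = os.path.join(root, 'processed')
        self.calls = []  # records which hooks actually ran

        # Point 3 runs only if some raw file is missing.
        if not self._all_exist(self.raw_dir, self.raw_file_names):
            os.makedirs(self.raw_dir, exist_ok=True)
            self.download()

        # Point 4 runs only if some processed file is missing.
        if not self._all_exist(self.processed_dir, self.processed_file_names):
            os.makedirs(self.processed_dir, exist_ok=True)
            self.process()

    @staticmethod
    def _all_exist(folder, names):
        return all(os.path.exists(os.path.join(folder, n)) for n in names)

    def download(self):
        # Pretend to fetch the raw files; here we just create them locally.
        self.calls.append('download')
        for name in self.raw_file_names:
            with open(os.path.join(self.raw_dir, name), 'w') as f:
                f.write('raw contents of ' + name)

    def process(self):
        # Convert each raw file into one processed file on disk.
        self.calls.append('process')
        for i, name in enumerate(self.raw_file_names):
            with open(os.path.join(self.raw_dir, name)) as src:
                text = src.read()
            out_path = os.path.join(self.processed_dir, 'data_{}.txt'.format(i))
            with open(out_path, 'w') as dst:
                dst.write(text.upper())

    def len(self):
        return len(self.processed_file_names)

    def get(self, idx):
        # Load one processed example back from disk.
        path = os.path.join(self.processed_dir, 'data_{}.txt'.format(idx))
        with open(path) as f:
            return f.read()


root = tempfile.mkdtemp()
first = ToyDataset(root)   # empty folder: both hooks should run
second = ToyDataset(root)  # files already exist: neither hook should run
print(first.calls)
print(second.calls)
print(second.get(0))
```

If this picture is right, then creating the dataset a second time on the same root should skip both steps and only read the saved files through get().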
Could somebody explain it to me in plain words? Some examples would be perfect.
Thank you in advance!
Regards