About the ways to get the length of a dataset

Hi all,

I am not sure about the best way of getting the length of a dataset.

Should I use dataset.size().value() or sizeof(dataset)?

Thanks.

If the dataset is a NumPy array or a tensor, you can simply use dataset.shape.
It returns a tuple (a torch.Size for tensors) with the size of the dataset along each axis/dimension. Assuming the samples are stacked along the first axis, the first value is the length of the dataset.

To get only the length of the dataset, you can use dataset.shape[0].
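
A minimal sketch of the above, assuming made-up data with 100 samples of 3 features each (the names array and tensor and the sizes are only for illustration):

```python
import numpy as np
import torch

# Hypothetical data: 100 samples, 3 features each.
array = np.zeros((100, 3))
tensor = torch.zeros(100, 3)

# .shape gives the size along every axis.
array_shape = array.shape      # (100, 3)
tensor_shape = tensor.shape    # torch.Size([100, 3])

# The first entry is the number of samples.
array_length = array.shape[0]
tensor_length = tensor.shape[0]

# len() returns the size of the first axis as well.
print(array_length, tensor_length, len(array), len(tensor))
```

Both shape[0] and len() give 100 here. len() raises a TypeError on a 0-dimensional array or tensor, so shape[0] and len() only make sense when there is at least one axis.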

Assuming you would like to get the length of a torch.utils.data.Dataset, print(len(dataset)) should work:

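import torch
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self):
        # 100 random samples with one feature each
        self.data = torch.randn(100, 1)

    def __getitem__(self, index):
        x = self.data[index]
        return x

    def __len__(self):
        # len(dataset) calls this method
        return len(self.data)

dataset = MyDataset()
print(len(dataset))
> 100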
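
One thing that may be worth keeping in mind: len() on a dataset counts samples, while len() on a DataLoader counts batches. A small sketch, assuming the built-in TensorDataset in place of the custom class above and a batch size of 32 picked arbitrarily for illustration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Same made-up data as in the custom dataset: 100 samples, 1 feature.
data = torch.randn(100, 1)
dataset = TensorDataset(data)

# batch_size=32 is an arbitrary choice for this sketch.
loader = DataLoader(dataset, batch_size=32)

n_samples = len(dataset)  # number of samples
n_batches = len(loader)   # number of batches, last one is partial

print(n_samples, n_batches)
```

With the default drop_last=False, the last partial batch is counted, so 100 samples at batch size 32 give 4 batches.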

Thanks a lot. Much appreciated.