Custom dataset __getitem__: return the label as an integer or a tensor, and return a single item or a range of items?

My custom dataset returns an image as a tensor together with its label. My questions are:

  1. What data format should the class label have? If I return the label as a tensor, which of these is correct?
class_id = torch.tensor(class_id) ---> the dataloader returns labels of size [batch]
class_id = torch.tensor([class_id]) ---> the dataloader returns labels of size [batch, 1], where 1 is the label dimension
  2. Can the __getitem__ method return a range of data points? I know dataset[0] returns the first element, but is dataset[2:10] feasible with a custom dataset and dataloader? If so, how?

Can anyone help me if possible please? Thanks a lot! I look forward to any reply!!

  1. I’m not sure it is expected that the label (if it is a class id) would be transformed to a tensor; see the canonical ImageFolder implementation here:
    torchvision.datasets.folder — Torchvision main documentation

If I remember correctly, [batch] should be fine and is what standard loss functions such as cross entropy loss expect: CrossEntropyLoss — PyTorch 1.10 documentation (the docs say class-index labels should have shape (N)).
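To make the shape question concrete, here is a minimal sketch of what the default DataLoader collation does with each label format. ToyDataset, its label_mode argument, and label_batch_shape are hypothetical names made up for this illustration, and the images are just random tensors.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset: random 3x8x8 'images' with integer class ids."""

    def __init__(self, label_mode="int", size=8, num_classes=3):
        self.label_mode = label_mode
        self.images = torch.randn(size, 3, 8, 8)
        self.class_ids = [i % num_classes for i in range(size)]

    def __len__(self):
        return len(self.class_ids)

    def __getitem__(self, idx):
        class_id = self.class_ids[idx]
        if self.label_mode == "scalar_tensor":
            label = torch.tensor(class_id)    # 0-d tensor
        elif self.label_mode == "vector_tensor":
            label = torch.tensor([class_id])  # 1-d tensor of shape [1]
        else:
            label = class_id                  # plain Python int
        return self.images[idx], label


def label_batch_shape(label_mode, batch_size=4):
    """Return the shape and dtype of the first collated label batch."""
    loader = DataLoader(ToyDataset(label_mode), batch_size=batch_size)
    _, labels = next(iter(loader))
    return tuple(labels.shape), labels.dtype


shapes = {
    mode: label_batch_shape(mode)
    for mode in ("int", "scalar_tensor", "vector_tensor")
}
for mode, (shape, dtype) in shapes.items():
    print(mode, shape, dtype)

# Class-index targets of shape (N) match logits of shape (N, C).
logits = torch.randn(4, 3)
targets = torch.tensor([0, 1, 2, 0])
loss = nn.CrossEntropyLoss()(logits, targets)
print(loss.item())
```

With the default collate function, the plain int and the 0-d tensor both end up as a label batch of shape [batch], while torch.tensor([class_id]) gives [batch, 1]. If you already have [batch, 1] labels, labels.squeeze(1) should bring them back to the shape the loss expects.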

  2. Yes, potentially, if __getitem__ implements slicing, e.g. python - Implementing slicing in __getitem__ - Stack Overflow. But for common use cases slicing is not needed or desirable, since the typical approach is to rely on a DataLoader (dataloaders — PyTorch 1.10 documentation) to automatically batch (and shuffle) your data.
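As a rough sketch of the slicing idea (not the only way to do it), __getitem__ can check whether it was given a slice and, if so, return a list of samples. SliceableDataset and its toy data are hypothetical; the same dataset still works with a DataLoader, which asks for one integer index at a time.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SliceableDataset(Dataset):
    """Hypothetical dataset whose __getitem__ accepts ints and slices."""

    def __init__(self, size=12):
        # Toy "images": row i is filled with the value i.
        self.images = torch.arange(size, dtype=torch.float32).view(size, 1).repeat(1, 4)
        self.class_ids = [i % 3 for i in range(size)]

    def __len__(self):
        return len(self.class_ids)

    def __getitem__(self, idx):
        if isinstance(idx, slice):
            # Resolve start/stop/step against len(self), then fetch each item.
            return [self[i] for i in range(*idx.indices(len(self)))]
        return self.images[idx], self.class_ids[idx]


dataset = SliceableDataset()

# Slicing returns a list of (image, label) tuples.
subset = dataset[2:10]
print(len(subset))

# The DataLoader only uses integer indices and does the batching itself.
loader = DataLoader(dataset, batch_size=4, shuffle=True)
images, labels = next(iter(loader))
print(images.shape, labels.shape)
```

Note that the slice here returns a plain list of samples rather than a stacked batch; for batched tensors the DataLoader is still the more convenient route.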