Is there an elegant way to get response variables separately from predictor variables out of a dataset object?

Looking at the documentation and source code, I’m not seeing a natural way to extract response variables (e.g. labels / targets, like per-example class labels) other than via __getitem__(), which also extracts the predictor variables. I think this is a case of the interface failing to separate concerns, but perhaps there’s a solution.
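To make the problem concrete, here is a minimal sketch using a toy stand-in class. ToyImageDataset, its fields, and the load counter are all hypothetical and only mimic the map-style __getitem__() / __len__() protocol; this is not a real torchvision class.

```python
class ToyImageDataset:
    """Hypothetical stand-in for a map-style dataset (not a torchvision class)."""

    def __init__(self, paths, labels):
        self.paths = list(paths)
        self._labels = list(labels)
        self.loads = 0  # counts simulated image reads

    def __len__(self):
        return len(self.paths)

    def _load_image(self, path):
        # Stand-in for an expensive read from a networked file system.
        self.loads += 1
        return f"<pixels of {path}>"

    def __getitem__(self, index):
        # Predictor and response come back together, so asking for the
        # label alone still triggers the image load.
        return self._load_image(self.paths[index]), self._labels[index]


dataset = ToyImageDataset(["a.jpg", "b.jpg", "c.jpg"], [0, 1, 0])
labels = [dataset[i][1] for i in range(len(dataset))]
print(labels)         # [0, 1, 0]
print(dataset.loads)  # 3
```

Every label costs one image load, even though the images are thrown away immediately.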

I’m researching open set recognition, so I often need to partition datasets “by class”, which means determining which class each data point belongs to. I’ve recently been working with a subset of ImageNet 1K, often on a server with a networked file system, where loading images from disk is extremely expensive. That cost is hard to justify when I don’t actually need the images, just the class labels.
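For reference, the partitioning step itself only needs the labels. A minimal sketch, assuming the labels are already available as a plain list (partition_by_class and the known/unknown split are illustrative names, not part of any library):

```python
from collections import defaultdict


def partition_by_class(labels):
    """Map each class label to the list of example indices carrying it."""
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)
    return dict(indices_by_class)


labels = [2, 0, 2, 1, 0]
known_classes = {0, 1}  # classes treated as "known" in this toy split

parts = partition_by_class(labels)
known_indices = sorted(
    i for c in known_classes for i in parts.get(c, [])
)
unknown_indices = sorted(
    i for c, idxs in parts.items() if c not in known_classes for i in idxs
)
print(known_indices)    # [1, 3, 4]
print(unknown_indices)  # [0, 2]
```

Nothing here touches the images, which is why paying for image loads just to obtain the labels list feels wasteful.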

My current solution has been to build individual wrapper classes for each torchvision.datasets.XXX dataset class that I’ve been using to expose a function for extracting just the labels, often having to refer to the corresponding dataset’s source code to see how it stores and loads labels. But this a) depends on implementation details and thus can break from any torchvision update, and b) requires a lot of extra work.
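A rough sketch of the kind of helper I mean is below. It is duck-typed and rests on assumptions about implementation details: that some dataset classes keep their labels in a targets attribute (as the DatasetFolder-based torchvision datasets appear to in the versions I’ve looked at), and that subset-style wrappers expose dataset and indices attributes. It also assumes labels are integer class indices. extract_labels and the Fake* classes are hypothetical names used only for illustration.

```python
def extract_labels(dataset):
    """Best-effort label extraction without loading predictors.

    Relies on attribute names that are assumptions, not a guaranteed API.
    """
    # Assumption: the dataset stores its labels in a `targets` attribute.
    targets = getattr(dataset, "targets", None)
    if targets is not None:
        return [int(t) for t in targets]

    # Assumption: subset-style wrappers expose `dataset` and `indices`.
    inner = getattr(dataset, "dataset", None)
    indices = getattr(dataset, "indices", None)
    if inner is not None and indices is not None:
        inner_labels = extract_labels(inner)
        return [inner_labels[i] for i in indices]

    # Slow fallback: goes through __getitem__ and loads every predictor.
    return [int(dataset[i][1]) for i in range(len(dataset))]


class FakeFolderDataset:
    """Toy dataset that raises if anything tries to load an image."""

    def __init__(self, targets):
        self.targets = targets

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, index):
        raise RuntimeError("image load attempted")


class FakeSubset:
    """Toy wrapper mimicking a subset view over another dataset."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = indices

    def __len__(self):
        return len(self.indices)


full = FakeFolderDataset([3, 1, 3, 7])
subset = FakeSubset(full, [0, 3])
print(extract_labels(full))    # [3, 1, 3, 7]
print(extract_labels(subset))  # [3, 7]
```

This avoids the image loads when the assumed attributes exist, but it has the same weakness as my wrappers: it leans on attribute names that could change between releases.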

Is there an easier, more universal way to extract just response variables, annotations, etc. from a dataset without also extracting predictor variables? Or is this impossible given the current nature of the interface?