Working with a pickle file and custom data

I create a custom dataset with torchvision datasets and I have a pickle file that I want to load.
The size of the pickle file is 79600x1x30 (batch, class, length of the vector).
I work with the __getitem__ function; I tried to divide this pickle file, but it didn't work.
Is there a function that slices the pickle file or takes a batch from it?

I assume you are loading a tensor or e.g. a numpy array from the pickle file?
If so, what kind of error are you seeing in __getitem__ when indexing this object?

The size of my pickle file is 19700x4.
I'm trying to work with a batch size of 32, but it does not divide the number of rows in the pickle file, so each batch takes all the rows, e.g. (32, 19700, 4), when it needs to be (32, 4).
I don't know where I'm wrong; I think I need to define a sampler.

import os
import pickle

from torch.utils.data import Dataset


class ROboDataset(Dataset):
    def __init__(self, root, path, train=True, transform=None):
        self.root = root
        self.path = path
        self.train = train
        self.transform = transform

    def __getitem__(self, index):
        root = self.root
        path = self.path
        data = pickle.load(file=open(os.path.join(root, path), "rb"))

        features = data[0]
        target = data[1]

        if self.transform is not None:
            features = self.transform(features)

        return features, target

    def __len__(self):
        data = pickle.load(file=open(os.path.join(self.root, self.path), "rb"))
        return len(data[0])

In your code snippet you are not using the index; you load the complete pickle file and split it into the features and target tensors, so every call to __getitem__ returns the whole dataset. I assume you could load the file once in __init__ and index it in __getitem__ using the passed index value.
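A minimal sketch of that approach, assuming the pickle file stores a (features, targets) pair of array-like objects (that layout is an assumption based on the `data[0]` / `data[1]` indexing in your snippet):

```python
import os
import pickle

from torch.utils.data import Dataset


class ROboDataset(Dataset):
    def __init__(self, root, path, train=True, transform=None):
        self.root = root
        self.path = path
        self.train = train
        self.transform = transform
        # Load the pickle file once; assumed to contain a
        # (features, targets) pair of array-like objects.
        with open(os.path.join(root, path), "rb") as f:
            data = pickle.load(f)
        self.features = data[0]
        self.targets = data[1]

    def __getitem__(self, index):
        # Return a single sample instead of the whole file.
        features = self.features[index]
        target = self.targets[index]
        if self.transform is not None:
            features = self.transform(features)
        return features, target

    def __len__(self):
        return len(self.features)
```

This also avoids re-reading the pickle file on every __getitem__ and __len__ call, which would otherwise be a significant I/O overhead during training.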

Yes, I tried not to load it in __init__, but I see there is no other option.
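For reference, once each __getitem__ call returns a single (features, target) sample, the standard DataLoader handles the batching, so no custom sampler is needed; when the batch size does not divide the dataset length, the last batch is simply smaller (or can be dropped with drop_last=True). A sketch with random stand-in data of the shapes from this thread:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the real data: 19700 samples with 4 features each.
features = torch.randn(19700, 4)
targets = torch.randint(0, 2, (19700,))
dataset = TensorDataset(features, targets)

loader = DataLoader(dataset, batch_size=32, shuffle=True)
x, y = next(iter(loader))
print(x.shape)  # torch.Size([32, 4])
```

Since 19700 = 615 * 32 + 20, this loader yields 615 full batches of shape (32, 4) and one final batch of shape (20, 4).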