cvclpl
(cc)
May 3, 2022, 10:21am
1
Hello,
I created a dataset object so that I could use map on it. However, when I call map, I get an error saying the object has no attribute map.
Any ideas how to fix this?
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, text, tags):
        self.sentence = text
        self.tags_per_token = tags

    def __getitem__(self, idx):
        sentence = self.sentence[idx]
        tags_per_token = self.tags_per_token[idx]
        # Return one example as a dict (the original returned the undefined name `one_text`)
        return {"text_words": sentence, "tags": tags_per_token}

    def __len__(self):
        return len(self.tags_per_token)

dataset_train = MyDataset(train_data, train_tags)
dataset_train.map(
    my_function,
    batched=True)
AttributeError: 'MyDataset' object has no attribute 'map'
This seems to be the same issue as described here (unfortunately without a follow up).
Could you describe where you’ve seen the .map method applied on torch.utils.data.Dataset, as it’s not a built-in method?
cvclpl
(cc)
May 3, 2022, 1:24pm
3
Hi @ptrblck
Thanks for your reply - I was following the huggingface dataset preparation in
Perhaps a Hugging Face dataset is different from a PyTorch dataset.
Yes, this seems to be the case so you might want to ask in the HF discussion board.
cvclpl
(cc)
May 3, 2022, 9:40pm
5
Thanks @ptrblck! That cleared up the confusion.
specteross
(Prasoon Varshney)
October 24, 2022, 2:38am
6
@cvclpl @ptrblck Seeing the exact same issue. How did you solve it?
nivek
(Kevin T)
October 24, 2022, 5:38pm
7
PyTorch’s built-in Dataset doesn’t support .map() as an operation. If you would like that feature, please use DataPipe from TorchData.
Please note that the built-in PyTorch Dataset is not the same as the one provided by Hugging Face.
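If all you need is a per-example transform, one option is a thin wrapper that applies a function lazily in `__getitem__`. This is only a minimal sketch, not the Hugging Face or TorchData API: `MappedDataset`, `ToyDataset`, and `lowercase_words` are hypothetical names made up for illustration, and the wrapper works on single examples only (it does not emulate `batched=True`). It is written in plain Python so it runs without torch installed; it only assumes the wrapped object implements `__getitem__` and `__len__`, which a map-style `torch.utils.data.Dataset` such as `MyDataset` above does.

```python
class MappedDataset:
    """Wrap a map-style dataset and apply `fn` to each item on access."""

    def __init__(self, base, fn):
        self.base = base  # any object with __getitem__ and __len__
        self.fn = fn      # function applied to one example at a time

    def __getitem__(self, idx):
        # The transform runs lazily, every time an item is fetched.
        return self.fn(self.base[idx])

    def __len__(self):
        return len(self.base)


class ToyDataset:
    """Stand-in for MyDataset so the example runs without torch."""

    def __init__(self, text, tags):
        self.text = text
        self.tags = tags

    def __getitem__(self, idx):
        return {"text_words": self.text[idx], "tags": self.tags[idx]}

    def __len__(self):
        return len(self.tags)


def lowercase_words(example):
    # Hypothetical per-example transform: lowercase every token.
    words = [w.lower() for w in example["text_words"]]
    return {**example, "text_words": words}


toy = ToyDataset([["Hello", "World"], ["Foo"]], [[0, 1], [0]])
mapped = MappedDataset(toy, lowercase_words)
print(len(mapped), mapped[0])
```

Because the function is applied on every access, nothing is cached. If the transform is expensive (e.g. tokenization), precomputing the results once, or using a Hugging Face dataset and its own .map(), is probably the better fit.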