Issue trying to index pandas dataframe via tensor scalar

I apologize if this doesn't belong here. I am facing a weird issue when trying to index a row of my pandas dataframe in the __getitem__ method of the Dataset class. In the code shown below, the line self.df.iloc[index] raises the error, also shown below. I printed the values of index, index.numpy() and len(self.df) to verify that I am not exceeding the bounds; they are tensor(640), 640 and 649 respectively. Indexing the dataframe with a hard-coded value works, so I am not sure what the error is. Maybe I am not properly converting from a tensor to whatever pandas accepts? I also posted this on Stack Overflow since I am not sure where the error lies…

  • I am not exceeding the bounds
  • I can index using a hard-coded value (e.g. 640)
  • I still cannot index it if I convert from tensor to numpy

Error message

Traceback (most recent call last):
  File "train.py", line 532, in <module>
    main(args)
  File "train.py", line 203, in main
    for batch in train_loader:
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 336, in __next__
    return self._process_next_batch(batch)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/HaziqBinRazali03/Trajectory Prediction/sgan-original/sgan/data/trajectories.py", line 204, in __getitem__
    df = self.df.iloc[b]
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 1478, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 2091, in _getitem_axis
    return self._get_list_axis(key, axis=axis)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 2070, in _get_list_axis
    return self.obj._take(key, axis=axis)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 2789, in _take
    verify=True)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 4537, in take
    new_labels = self.axes[axis].take(indexer)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexes/base.py", line 2195, in take
    return self._shallow_copy(taken)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexes/range.py", line 267, in _shallow_copy
    return self._int64index._shallow_copy(values, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexes/numeric.py", line 68, in _shallow_copy
    return self._shallow_copy_with_infer(values=values, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexes/base.py", line 538, in _shallow_copy_with_infer
    if not len(values) and 'dtype' not in kwargs:
TypeError: object of type 'numpy.int64' has no len()

The code snippet

def __getitem__(self, index):

    print(index, index.numpy(), len(self.df))

    # get the row at index
    df = self.df.iloc[index] # doesn't work
    df = self.df.iloc[index.numpy()] # doesn't work
    df = self.df.iloc[640] # works

    # load the images and the label
    pedestrian_images = []
    pedestrian_label  = df["label"]
    for full_filename in df["full_filename"]:
        pedestrian_images.append(self.transform(Image.open(full_filename)))
    pedestrian_images = torch.stack(pedestrian_images, 0)

    return [pedestrian_images, pedestrian_label]

Part of the pandas dataframe

     unique_id  label                                      full_filename
0          112      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
1          606      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
2          327      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
3          385      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
4          736      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
5          634      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
6          534      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
7           61      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
8           40      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
9          124      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
10         165      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
11          97      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
12         559      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
13          98      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
14         190      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
15         360      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
16         478      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
17         737      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
18         633      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
19         362      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
20          95      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
21         375      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
22         301      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
23         729      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
24         240      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
25         312      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
26         537      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
27         786      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
28         585      0  [/home/HaziqBinRazali03/Trajectory Prediction/...
29         443      0  [/home/HaziqBinRazali03/Trajectory Prediction/...

Which PyTorch version are you using?
This dummy code seems to work:

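import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):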
    def __init__(self, df):
        self.df = df
        
    def __getitem__(self, index):
        x = self.df.iloc[index].values
        x = torch.from_numpy(x).float()
        return x
    
    def __len__(self):
        return len(self.df)

df = pd.DataFrame(np.random.randn(100, 2))
dataset = MyDataset(df)
loader = DataLoader(
    dataset,
    batch_size=2)

next(iter(loader))

That code works for me as well. I believe the problem is related to the WeightedRandomSampler from my previous post, "DataLoader - using SubsetRandomSampler and WeightedRandomSampler at the same time". When I use the WeightedRandomSampler, the index argument of the __getitem__ method is a tensor. Thank you.
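
If the sampler is what hands tensors to __getitem__, one possible workaround is to convert the indices on the sampler side. The sketch below is only an illustration under assumptions: IntIndexSampler is a hypothetical wrapper (not part of PyTorch) that iterates over any sampler and yields built-in ints, and FakeTensorIndex merely stands in for a 0-dim tensor so the snippet runs without torch.

```python
class IntIndexSampler:
    """Hypothetical wrapper: yields built-in ints from any index sampler."""

    def __init__(self, sampler):
        self.sampler = sampler

    def __iter__(self):
        for i in self.sampler:
            # int() works on plain ints and on objects defining __int__,
            # which 0-dim integer tensors do.
            yield int(i)

    def __len__(self):
        return len(self.sampler)


class FakeTensorIndex:
    """Stand-in for a 0-dim tensor index, used only for this demo."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value


indices = list(IntIndexSampler([FakeTensorIndex(640), 3]))
print(indices)  # [640, 3]
```

With PyTorch, the wrapped object would presumably be passed as the sampler argument of the DataLoader in place of the WeightedRandomSampler itself; whether a given PyTorch version accepts a plain iterable there is worth checking first.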

When I use WeightedRandomSampler, I run into the same problem. Can you tell me how to solve it?

You can use tensor.data, e.g.

def __getitem__(self, index):

    index.data # convert to scalar


Here is my __getitem__ function. Do you mean that the solution is to change “index” to “index.data” in the function?

Yes. I believe index is a tensor if you are using WeightedRandomSampler. You can always print out index to be sure. It should therefore be X = self.read_images(index.data)
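
Another option, in case a built-in int is wanted (pandas .iloc accepted the hard-coded 640 above), is to unwrap the index explicitly before using it. The following is a minimal sketch, not the fix confirmed in this thread: as_python_int is a hypothetical helper name, and it assumes the index is either a plain int or a scalar-like object exposing .item(), as 0-dim torch tensors and NumPy scalars do. FakeScalar is a stand-in so the snippet runs without torch.

```python
import operator


def as_python_int(index):
    """Return `index` as a built-in int (hypothetical helper).

    Accepts a plain int, or any scalar-like object exposing .item().
    """
    if hasattr(index, "item"):
        index = index.item()  # unwrap the scalar container to a Python number
    # operator.index raises TypeError for non-integers such as floats
    return operator.index(index)


class FakeScalar:
    """Stand-in for a 0-dim tensor so the sketch runs without torch."""

    def __init__(self, value):
        self.value = value

    def item(self):
        return self.value


plain = as_python_int(640)
unwrapped = as_python_int(FakeScalar(640))
print(plain, unwrapped)  # 640 640
```

Inside __getitem__ this would be used as df = self.df.iloc[as_python_int(index)], so the same code path works whether the sampler yields ints or tensors.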

Thanks for your help. I have solved the problem.