Can anyone tell me what mistake I’m making here? My error message is:
     22         # Load data and get label
     23         X = torch.sparse.FloatTensor(
---> 24             torch.LongTensor([self.i[ID]]),
     25             torch.FloatTensor(self.v[ID]),
     26             torch.Size([self.vecSize])

RuntimeError: sizes must be non-negative
I did of course check that none of the values in self.i[ID] are negative…
The error occurs while iterating over my data generator at Epoch [1/5], Step [33/2814]:
for local_batch, local_labels in training_generator:
    # a batch training step
Below is the definition of my dataset, the main challenge of which is converting from PySpark sparse vectors to tensors.
import torch
from torch.utils import data

class Dataset(data.Dataset):
    def __init__(self, list_IDs, labels, indices, values, vecSize):
        # all inputs except list_IDs are dictionaries keyed to ID
        'Initialization'
        self.labels = labels
        self.list_IDs = list_IDs
        self.i = indices
        self.v = values
        self.vecSize = vecSize

    def __len__(self):
        return len(self.list_IDs)

    def __getitem__(self, index):
        # Select sample
        ID = self.list_IDs[index]
        # Load data and get label
        X = torch.sparse.FloatTensor(
            torch.LongTensor([self.i[ID]]),
            torch.FloatTensor(self.v[ID]),
            torch.Size([self.vecSize])
        ).to_dense()
        i = self.labels[ID]
        return X, torch.LongTensor([i])

training_set = Dataset(partition['train'], labels_train, i_train, v_train, inputSize)
training_generator = data.DataLoader(training_set, **params)
Thanks rasbt. Your explanation makes sense. In the line it’s complaining about, I’m not setting any sizes except for the size of the sparse vector, which is always ~8000. It’s set once and none of the other data loaders are upset. So how do I interpret this in the context of my code?
X = torch.sparse.FloatTensor(
    torch.LongTensor([self.i[ID]]),
    torch.FloatTensor(self.v[ID]),
    torch.Size([self.vecSize])
).to_dense()
i = self.labels[ID]
Oh, that could happen in my data. It's somewhat unlikely (it would mean that none of the 100 or more words in a Reddit post were in the vocabulary), but definitely possible. I'll look for that.
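For anyone landing here with the same error: a quick way to look for such a datum before training is to scan the dictionaries directly, without involving the DataLoader. The following is only a sketch. `find_bad_ids` and the toy dictionaries are hypothetical names made up for illustration; it assumes `indices` and `values` are dictionaries keyed by ID, as in the Dataset above.

```python
def find_bad_ids(indices, values, vec_size):
    """Return {ID: reason} for entries that cannot form a valid sparse vector."""
    bad = {}
    for ID, idx in indices.items():
        vals = values.get(ID)
        if vals is None:
            bad[ID] = "no values entry"
        elif len(idx) == 0:
            bad[ID] = "empty index list"
        elif len(idx) != len(vals):
            bad[ID] = f"{len(idx)} indices vs {len(vals)} values"
        elif min(idx) < 0 or max(idx) >= vec_size:
            bad[ID] = "index out of range"
    return bad


# Hypothetical toy data standing in for i_train / v_train.
i_demo = {"a": [0, 3], "b": [], "c": [1, 9]}
v_demo = {"a": [1.0, 2.0], "b": [], "c": [0.5, 0.5]}

# "b" has no indices at all; "c" has an index >= vec_size.
print(find_bad_ids(i_demo, v_demo, vec_size=8))
```

With the names from the question, the call would presumably be `find_bad_ids(i_train, v_train, inputSize)`.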
Is there a good way to find out which datum is causing the issue (other than already knowing the problem and searching the input)? I haven't found many examples using data loaders… I think I've answered my own question and realized that I really should be working in a better debugging environment, but if you have other suggestions I'd love to hear them.
While I know I should use debuggers more often, I would simply try to print the last datum since it should still be in memory if you are in an interactive environment.
Alternatively, you could wrap the failing code in a try/except block and print the relevant info in the except branch when the error occurs.
E.g., something like:
try:
    ...
except RuntimeError:
    print(...)  # print the necessary info
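One way to apply the try/except idea without editing the Dataset class itself is a thin wrapper around it. This is a sketch under assumptions: `DebugDataset` and `FlakyDataset` are hypothetical names, not part of PyTorch, and the wrapper assumes a map-style dataset that exposes `__getitem__` and `__len__`.

```python
class DebugDataset:
    """Wrap a map-style dataset and report which index raised."""

    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, index):
        try:
            return self.base[index]
        except RuntimeError as err:
            # Print enough context to locate the offending datum, then re-raise
            # so the failure is not silently swallowed.
            print(f"__getitem__ failed at index {index}: {err}")
            raise


class FlakyDataset:
    """Hypothetical stand-in that fails on one index, for demonstration only."""

    def __len__(self):
        return 3

    def __getitem__(self, index):
        if index == 2:
            raise RuntimeError("sizes must be non-negative")
        return index * 10


wrapped = DebugDataset(FlakyDataset())
print(wrapped[1])  # the healthy item passes through unchanged
```

In the setup from the question this would presumably be used as `data.DataLoader(DebugDataset(training_set), **params)`. Re-raising keeps the original traceback; the print only adds the index.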
@rasbt Huh… yeah I got a try/except into the data loader and found an empty index list. That was it. Thank you so much for the unsticking. Grad school and finals are hard and help is very much appreciated.
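For completeness, a minimal sketch of guarding against the empty index list found above, so that a post with no in-vocabulary words becomes an all-zero vector instead of an error. Assumptions: `sparse_row_to_dense` is a hypothetical helper name, and the sketch swaps in `torch.sparse_coo_tensor` for the `torch.sparse.FloatTensor` constructor used in the question. Whether the unguarded empty case raises may depend on the PyTorch version.

```python
import torch


def sparse_row_to_dense(idx, vals, vec_size):
    """Build a dense 1-D float tensor from parallel index/value sequences."""
    if len(idx) != len(vals):
        raise ValueError(f"{len(idx)} indices but {len(vals)} values")
    if len(idx) == 0:
        # No non-zero entries: return zeros rather than building a sparse
        # tensor from an empty index list.
        return torch.zeros(vec_size, dtype=torch.float32)
    sparse = torch.sparse_coo_tensor(
        torch.tensor([list(idx)], dtype=torch.long),     # shape (1, nnz)
        torch.tensor(list(vals), dtype=torch.float32),   # shape (nnz,)
        (vec_size,),
    )
    return sparse.to_dense()


print(sparse_row_to_dense([1, 3], [2.0, 5.0], 6))
print(sparse_row_to_dense([], [], 4))
```

Inside `__getitem__` this would presumably replace the constructor call with `X = sparse_row_to_dense(self.i[ID], self.v[ID], self.vecSize)`. Whether an all-zero input is meaningful for the model, or such samples should be dropped from `list_IDs` instead, is a separate modelling decision.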