Torchtext error: ValueError: invalid literal for int() with base 10: 'Sentiment'


(Jinu Daniel) #1

I am trying to build a classifier on the Amazon Fine Food Reviews dataset from Kaggle, but training the LSTM model fails with the error below:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-40-f8a71793cefa> in <module>
      1 training(model=model, epoch=2, eval_every=1000,
      2          loss_func=loss_function, optimizer=optimizer, train_iter=train_iter,
----> 3         val_iter=val_iter)

<ipython-input-39-fe1648a5a8c7> in training(epoch, model, eval_every, loss_func, optimizer, train_iter, val_iter, early_stop, warmup_epoch)
     11     for e in range(epoch):
     12         train_iter.init_epoch()
---> 13         for train_batch in iter(train_iter):
     14             step += 1
     15             model.train()

/opt/anaconda3/lib/python3.7/site-packages/torchtext/data/iterator.py in __iter__(self)
    155                     else:
    156                         minibatch.sort(key=self.sort_key, reverse=True)
--> 157                 yield Batch(minibatch, self.dataset, self.device)
    158             if not self.repeat:
    159                 return

/opt/anaconda3/lib/python3.7/site-packages/torchtext/data/batch.py in __init__(self, data, dataset, device)
     32                 if field is not None:
     33                     batch = [getattr(x, name) for x in data]
---> 34                     setattr(self, name, field.process(batch, device=device))
     35 
     36     @classmethod

/opt/anaconda3/lib/python3.7/site-packages/torchtext/data/field.py in process(self, batch, device)
    199         """
    200         padded = self.pad(batch)
--> 201         tensor = self.numericalize(padded, device=device)
    202         return tensor
    203 

/opt/anaconda3/lib/python3.7/site-packages/torchtext/data/field.py in numericalize(self, arr, device)
    317             if not self.sequential:
    318                 arr = [numericalization_func(x) if isinstance(x, six.string_types)
--> 319                        else x for x in arr]
    320             if self.postprocessing is not None:
    321                 arr = self.postprocessing(arr, None)

/opt/anaconda3/lib/python3.7/site-packages/torchtext/data/field.py in <listcomp>(.0)
    317             if not self.sequential:
    318                 arr = [numericalization_func(x) if isinstance(x, six.string_types)
--> 319                        else x for x in arr]
    320             if self.postprocessing is not None:
    321                 arr = self.postprocessing(arr, None)

ValueError: invalid literal for int() with base 10: 'Sentiment'
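
The last frame suggests that the string 'Sentiment' itself reached int(). One possible explanation, and only a guess without seeing the data-loading code, is that the CSV's header row is being read as a regular example, so the column name ends up in the label field. The sketch below reproduces that situation in plain Python without torchtext. The CSV contents, the `read_labels` and `numericalize` helpers, and the column layout are all hypothetical stand-ins, not taken from the actual notebook.

```python
import csv
import io

# Hypothetical two-column CSV standing in for the reviews file.
# The real column names and rows are assumptions.
RAW = "Text,Sentiment\nGreat coffee,1\nStale chips,0\n"


def read_labels(raw, skip_header):
    """Return the second column of every row, optionally dropping the header."""
    rows = list(csv.reader(io.StringIO(raw)))
    if skip_header:
        rows = rows[1:]
    return [row[1] for row in rows]


def numericalize(labels):
    """Convert each label string with int(), roughly what the failing
    list comprehension in the traceback does for a non-sequential field."""
    return [int(x) for x in labels]


# With the header kept, the column name 'Sentiment' is passed to int().
try:
    numericalize(read_labels(RAW, skip_header=False))
except ValueError as err:
    error_message = str(err)

# With the header skipped, only the numeric labels remain.
labels = numericalize(read_labels(RAW, skip_header=True))
```

If the data is loaded with torchtext.data.TabularDataset, its skip_header argument (False by default) may be worth checking, assuming the file really does start with a header row.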

The code can be found on GitHub.


Any help would be appreciated. Thanks.