wgpubs
(WG)
1
I can’t figure out how to properly set up a field object for multi-label classification with torchtext. Here is what I have in my dataset class:
… where lbl is an OHE numpy array (e.g., [0, 1, 0, 0, 1, 1, 0])
My torchtext field object is defined like this:
tt_LABEL = data.Field(sequential=False, use_vocab=False)
But when I try to package everything up into a BucketIterator and get a mini-batch, I get the following exception:
only length-1 arrays can be converted to Python scalars
The error is on line 294 of field.py:
294 arr = [numericalization_func(x) for x in arr]
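For anyone hitting the same message, here is a minimal sketch of what seems to be going on. It assumes that, with use_vocab=False, numericalization_func is a plain scalar conversion such as int(), which cannot take a whole array. The label_names list and the per_class dict are hypothetical names used only for illustration. Splitting the vector into one scalar per class is one possible workaround, not necessarily the approach used later in this thread.

```python
import numpy as np

# Assumption: stand-in for the scalar conversion torchtext applies to each
# label when use_vocab=False (taken here to be a plain int() call).
numericalization_func = int

# Multi-hot label vector for a single example, as in the post above.
lbl = np.array([0, 1, 0, 0, 1, 1, 0])

# Applying a scalar conversion to an array with more than one element fails.
try:
    numericalization_func(lbl)
    raised = False
except TypeError as err:
    raised = True
    print(type(err).__name__, err)

# Possible workaround: store one scalar per class, so that each class can be
# handled by its own non-sequential field. The names are hypothetical.
label_names = [f"label_{i}" for i in range(len(lbl))]
per_class = {name: int(value) for name, value in zip(label_names, lbl)}
print(per_class)
```

Each entry of per_class is a plain Python int, so a scalar conversion applied per field no longer sees an array.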
hiromi
(Hiromi)
2
@wgpubs, did you ever find a workaround?
wgpubs
(WG)
3
Hey @hiromi! I remember ya from fastai.
Here is the code I’m using for the toxic comp. I’d appreciate any feedback on the good and the ugly of it, and on what can be improved. Hope this helps:
hiromi
(Hiromi)
4
Awesome! Thanks for the example!! You’re way ahead of me.
hiromi
(Hiromi)
5
I’m currently trying to see if I can get data.TabularDataset to work kind of like this one:
My brain is too tired to keep going tonight, but I will get back to it tomorrow.
hiromi
(Hiromi)
6
@wgpubs, I’ve tried many things, and your implementation is the best and cleanest!!!