I am new to PyTorch and NLP. I want to encode a list of sentences for an NLP task and I am following this demo: Demo link.
I successfully load the sentences and build the vocabulary for the 100,000 most frequent words, but when I try to encode the sentences using
embeddings = model.encode(sentences, bsize=128, tokenize=False, verbose=True)
I get the error:
ValueError: some of the strides of a given numpy array are negative. This is currently not supported, but will be added in future releases.
I tried changing the sentences, but that didn't help.
What could be causing this error?
You should use
torch.from_numpy() to convert numpy arrays to Tensors before giving them to PyTorch functions, to improve performance.
The error you see most likely comes from the fact that not all numpy arrays can be converted to a Tensor (in particular, arrays that were flipped and so have negative strides). You can use
np.ascontiguousarray() before passing your array to PyTorch to make sure it works.
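A minimal sketch of the failure and the fix (the flipped array here is just an illustrative example, not InferSent's internal data): a flipped numpy view has negative strides, and np.ascontiguousarray() makes a contiguous copy that torch.from_numpy() accepts.

```python
import numpy as np
import torch

# A flipped view of an array has a negative stride,
# which torch.from_numpy() cannot handle directly.
arr = np.arange(6, dtype=np.float32).reshape(2, 3)
flipped = arr[::-1]            # view with a negative first stride
print(flipped.strides)         # first stride is negative

# np.ascontiguousarray makes a C-contiguous copy with positive strides,
# so the conversion to a Tensor now succeeds.
fixed = np.ascontiguousarray(flipped)
t = torch.from_numpy(fixed)
print(t.shape)                 # torch.Size([2, 3])
```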
Thanks for the reply @albanD. I tried your suggestions, but I am unable to convert a list of strings to a tensor.
So I converted the list to a one-hot encoded list and then converted it to a contiguous array. However, the encode function defined in InferSent only takes a list of strings. Now I get this error:
TypeError: split() missing 1 required positional argument: 'split_size'
Tensors cannot contain strings. You would usually use an Embedding layer to convert a string (via its token indices) to some learnable features that represent that string, and then use these features in your model.
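To sketch that idea: strings are first mapped to integer indices using a vocabulary, and an nn.Embedding layer turns those indices into learnable vectors. The vocabulary, sentence, and embedding dimension below are made-up examples, not part of InferSent.

```python
import torch
import torch.nn as nn

# Hypothetical word-to-index vocabulary (in practice, built from your corpus)
vocab = {"<pad>": 0, "hello": 1, "world": 2}

# Embedding layer: each index maps to a learnable 5-dimensional vector
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=5)

# Strings are converted to an index tensor first, never passed directly
sentence = ["hello", "world"]
indices = torch.tensor([vocab[w] for w in sentence])  # tensor([1, 2])
features = embedding(indices)                         # shape (2, 5)
print(features.shape)
```

The resulting features tensor can then be fed into the rest of the model, and the embedding weights are updated during training like any other parameters.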
Okay, let me try that and get back.