Convolutional NN for text input

I am trying to implement a text classification model using a CNN. As far as I know, text data calls for 1D convolution. I saw a PyTorch example that uses Conv2d, but I want to know how to apply Conv1d to text. Or is that actually not possible?

Here is my model scenario:

Number of in-channels: 1, Number of out-channels: 128
Kernel size : 3 (only want to consider trigrams)
Batch size : 16

So, I will provide tensors of shape <16, 1, 28, 300>, where 28 is the length of a sentence. I want to use Conv1d, which should give me 128 feature maps of length 26 (as I am considering trigrams).

I am not sure how to define nn.Conv1d() for this setting. I can use Conv2d, but I want to know whether it is possible to achieve the same result using Conv1d.

You can always view(1, -1) the tensor though!

I didn’t understand your answer. Please explain briefly.

Have you looked at the Conv1d example in the documentation? It will help you figure it out.
Conv1d takes 3D inputs (mini-batch x channels x width).
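
To make the 3D input layout concrete, here is a minimal sketch using the shapes from the question. It assumes the last axis (300) is the embedding dimension and treats it as the Conv1d input channels; the variable names are illustrative only.

```python
import torch
import torch.nn as nn

batch_size, seq_len, emb_dim = 16, 28, 300

# Embedded sentences in the layout from the question: (batch, 1, length, embedding)
x = torch.randn(batch_size, 1, seq_len, emb_dim)

# Drop the singleton channel and move the embedding axis into the channel slot:
# (16, 1, 28, 300) -> (16, 28, 300) -> (16, 300, 28)
x_1d = x.squeeze(1).transpose(1, 2)

# kernel_size=3 slides over the sequence axis, i.e. over trigrams
conv = nn.Conv1d(in_channels=emb_dim, out_channels=128, kernel_size=3)
feature_maps = conv(x_1d)
print(feature_maps.shape)  # torch.Size([16, 128, 26])
```

With no padding and stride 1, the output length is 28 - 3 + 1 = 26, which matches the 128 feature maps of length 26 asked for above.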

Here’s an example of Conv1d and Pool1d layers into an RNN: https://gist.github.com/spro/c87cc706625b8a54e604fb1024106556

Thanks, the following worked for me. Your example helped me. I didn’t know that the embedding dimension can be used as the number of in-channels.

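import torch
import torch.nn as nn
from torch.autograd import Variable

m = nn.Conv1d(200, 10, 2)  # in_channels=200 (embedding dim), out_channels=10, kernel_size=2
input = Variable(torch.randn(10, 200, 5))  # batch x embedding dim x sentence length
feature_maps1 = m(input)
print(feature_maps1.size())  # torch.Size([10, 10, 4])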

I have one question: why can the kernel size be a tuple in Conv1d? Stride, padding, and dilation can all be tuples as well. How can they be tuples for Conv1d? Can you provide an example?
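
As a sketch of what the tuple form looks like in practice: for Conv1d each of these arguments accepts either an int or a one-element tuple, and the two spellings build the same layer configuration. My understanding is that the tuple form mirrors the Conv2d/Conv3d signatures, where the tuple has one entry per spatial dimension; treat that rationale as an assumption rather than a statement from the PyTorch authors.

```python
import torch
import torch.nn as nn

# Same layer configuration written with ints and with one-element tuples
conv_int = nn.Conv1d(200, 10, kernel_size=2)
conv_tuple = nn.Conv1d(200, 10, kernel_size=(2,), stride=(1,),
                       padding=(0,), dilation=(1,))

# Both store the kernel size as a one-element tuple internally
print(conv_int.kernel_size, conv_tuple.kernel_size)

x = torch.randn(10, 200, 5)  # batch x channels x width
out_int = conv_int(x)
out_tuple = conv_tuple(x)
print(out_int.shape, out_tuple.shape)
```

The two layers have independently initialized weights, so their output values differ, but the output shapes are identical.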

@smth, yes, I saw that example previously. At that time, I was assuming that for text data the number of channels should be 1. I never thought of using Conv1d the way @spro did in his example!

Hi, did you find any good resource for understanding 1D convolution for text? I am not able to understand why using the embedding dimension as the input channel makes sense. Also, in your example, the 5-word sequence (each word 200-dimensional), i.e. a 1000-value vector for each sentence, is converted into 40 values after the process, right?
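
One way to see why the embedding dimension can act as the input channel is to compare it with the Conv2d setup from the original question. The sketch below assumes a Conv2d kernel that spans the full embedding width, (3, 300), and copies its weights into a Conv1d with 300 input channels; both layers then compute the same weighted sum over each trigram. The shapes are taken from the question and the variable names are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(16, 1, 28, 300)  # batch x 1 x sentence length x embedding dim

# Conv2d with a kernel covering 3 words and the full embedding width
conv2d = nn.Conv2d(1, 128, kernel_size=(3, 300))
# Conv1d with the embedding dimension as input channels
conv1d = nn.Conv1d(300, 128, kernel_size=3)

with torch.no_grad():
    # Conv2d weight (128, 1, 3, 300) -> Conv1d weight (128, 300, 3)
    conv1d.weight.copy_(conv2d.weight.squeeze(1).transpose(1, 2))
    conv1d.bias.copy_(conv2d.bias)

    out2d = conv2d(x).squeeze(3)                  # (16, 128, 26, 1) -> (16, 128, 26)
    out1d = conv1d(x.squeeze(1).transpose(1, 2))  # (16, 128, 26)

print(out1d.shape)
print(torch.allclose(out1d, out2d, atol=1e-4))
```

On the size question: in the earlier example in this thread the output has shape (10, 10, 4), so each sentence goes from 200 x 5 = 1000 input values to 10 output channels x 4 positions = 40 values.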
