Hey guys,
I’m trying to window word embeddings as 3-grams, i.e. each output position should mix a word with its left and right neighbors. I know this can be done with a convolutional layer, so I’m trying to implement it.
import torch
import torch.nn as nn

# (batch_size, sentence_size, embedding_size)
x = torch.rand(10, 5, 64)
# Conv1d expects (batch, channels, length), so move the embedding dim into channels
conv = nn.Conv1d(64, 192, 3, padding=1)
y_hat = conv(x.permute(0, 2, 1)).permute(0, 2, 1)  # back to (10, 5, 192)
Is this even remotely correct with respect to what I want to do?
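For reference, here is a self-contained sketch of what I believe the shapes should be, under my assumption that `kernel_size=3` with `padding=1` keeps the sentence length unchanged (the size names are just placeholders I picked):

```python
import torch
import torch.nn as nn

# assumed sizes matching the snippet above
batch_size, sentence_size, embedding_size, out_channels = 10, 5, 64, 192

x = torch.rand(batch_size, sentence_size, embedding_size)
# kernel_size=3 spans a word plus its two neighbors; padding=1 preserves length
conv = nn.Conv1d(embedding_size, out_channels, kernel_size=3, padding=1)

# Conv1d wants (batch, channels, length), so swap the last two dims and back
y_hat = conv(x.permute(0, 2, 1)).permute(0, 2, 1)
print(y_hat.shape)  # torch.Size([10, 5, 192])
```

So each of the 5 positions ends up with a 192-dim feature computed from a 3-word window, which is what I’m after.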