JoinTable-like operation when combining CNNs

Before I get to my issue, I just wanted to say thank you for PyTorch. I think the library is clean and the code is easy to follow.

I’m trying to replicate part of this model. I think I’m doing the concat part wrong, where the outputs from three different CNNs are combined (the Torch version uses JoinTable). Although I’m not getting any errors, loss.backward() takes forever to run. I’m really new to both Torch and PyTorch, so any help would be much appreciated. Is there something wrong with the way I’ve defined my model?

import torch
import torch.nn as nn
import torch.nn.functional as F

# vocab, labels and maxlen are defined elsewhere in the script.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.embed = nn.Embedding(vocab.size(), 300)
        #self.embed.weight = Parameter( torch.from_numpy(vocab.get_weights().astype(np.float32)))
        self.conv_3 = nn.Conv2d(1, 50, kernel_size=(3, 300), stride=(1, 1))
        self.conv_4 = nn.Conv2d(1, 50, kernel_size=(4, 300), stride=(1, 1))
        self.conv_5 = nn.Conv2d(1, 50, kernel_size=(5, 300), stride=(1, 1))
        self.decoder = nn.Linear(50 * 3, len(labels))

    def forward(self, x):
        e1 = self.embed(x)
        # Apply dropout only in training mode, and actually use its result.
        x = F.dropout(e1, p=0.2, training=self.training)
        # Add a channel dimension: (batch, 1, 50, 300). Assumes maxlen == 50.
        x = x.view(x.size(0), 1, 50, 300)
        cnn_3 = F.relu(F.max_pool2d(self.conv_3(x), (maxlen - 3 + 1, 1)))
        cnn_4 = F.relu(F.max_pool2d(self.conv_4(x), (maxlen - 4 + 1, 1)))
        cnn_5 = F.relu(F.max_pool2d(self.conv_5(x), (maxlen - 5 + 1, 1)))
        # Join the three branches along the channel dimension: (batch, 150, 1, 1).
        x = torch.cat([cnn_3, cnn_4, cnn_5], 1)
        x = x.view(-1, 50 * 3)
        x = F.dropout(x, p=0.2, training=self.training)
        return F.log_softmax(self.decoder(x), dim=1)

I’d guess that it’s because you’re using 5x300 convolutional kernels, which are incredibly expensive to compute (even 7x7 kernels are considered large and expensive).
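To put rough numbers on the kernel sizes, here is a back-of-the-envelope count of weights and multiply-accumulates per sample for the three conv layers. It assumes inputs of 50 x 300 (taken from the view() call in the question), stride 1 and no padding; conv_cost is just a hypothetical helper for this sketch, not a PyTorch function. Since the kernel is as wide as the embedding, each filter only slides along the sequence axis.

```python
def conv_cost(kernel_h, kernel_w, in_ch, out_ch, in_h, in_w):
    """Weights and multiply-accumulates for one sample (stride 1, no padding, bias ignored)."""
    out_h = in_h - kernel_h + 1
    out_w = in_w - kernel_w + 1
    weights = kernel_h * kernel_w * in_ch * out_ch
    macs = weights * out_h * out_w
    return weights, macs


# Assumed input: 1 channel, 50 tokens, 300-dimensional embeddings, 50 filters per layer.
for k in (3, 4, 5):
    weights, macs = conv_cost(k, 300, 1, 50, 50, 300)
    print(f"kernel {k}x300: {weights} weights, {macs} MACs per sample")
```

Multiplying the per-sample figure by the batch size gives a rough forward cost; the backward pass costs more on top of that.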
Apart from this I can’t see any mistakes at a glance. There might be some discrepancies with the original code, but I don’t know it so it’s hard to tell.
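On the JoinTable part of the question: Torch's nn.JoinTable concatenates a table of tensors along one dimension, and the PyTorch counterpart is torch.cat. Below is a minimal sketch with random stand-in tensors shaped like the pooled branch outputs, (batch, 50, 1, 1). Joining along dim 1 is my assumption about what the original model does, since I haven't seen its code. The second half shows why joining along a new leading dimension and then reshaping is risky: the shape comes out right, but rows no longer correspond to samples.

```python
import torch

batch = 4
# Stand-ins for the three pooled branch outputs, each (batch, 50, 1, 1).
cnn_3 = torch.randn(batch, 50, 1, 1)
cnn_4 = torch.randn(batch, 50, 1, 1)
cnn_5 = torch.randn(batch, 50, 1, 1)

# Join along the channel dimension, leaving the batch dimension alone.
joined = torch.cat([cnn_3, cnn_4, cnn_5], dim=1)  # (batch, 150, 1, 1)
features = joined.view(batch, -1)                 # (batch, 150)

# Row 0 starts with sample 0's features from the first branch.
assert torch.equal(features[0, :50], cnn_3[0].view(-1))

# Contrast: stacking on a new leading dimension, then reshaping.
stacked = torch.cat([t.unsqueeze(0) for t in (cnn_3, cnn_4, cnn_5)])  # (3, batch, 50, 1, 1)
mixed = stacked.view(-1, 150)                                         # also (batch, 150)

# Row 0 now holds first-branch features of samples 0, 1 and 2.
assert torch.equal(mixed[0, 50:100], cnn_3[1].view(-1))
print(features.shape, mixed.shape)
```

Both results have the same shape, so no error is raised in either case; only the first keeps one sample per row.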