With the help of a custom collate_fn, I've created batches with variable input lengths (the same sequence length within a batch, but different lengths across batches).
I want to build a sentiment classifier with a custom embedding layer.
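For reference, a collate_fn of the kind described above can be sketched as follows. The `(token_ids, label)` item format is an assumption about how the dataset returns samples; `padding_value=0` assumes index 0 is reserved for padding:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def collate_fn(batch):
    # batch: list of (token_ids_tensor, label) pairs -- assumed item format
    seqs, labels = zip(*batch)
    # pad every sequence to the longest sequence *in this batch*,
    # so each batch has its own fixed length
    padded = pad_sequence(seqs, batch_first=True, padding_value=0)
    return padded, torch.tensor(labels, dtype=torch.float32)

# example: two sequences of different lengths in one batch
x, y = collate_fn([(torch.tensor([1, 2, 3]), 1), (torch.tensor([4, 5]), 0)])
# x has shape (2, 3); the shorter sequence is zero-padded
```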
import torch
import torch.nn as nn

class WordEmbedding(nn.Module):
    def __init__(self, vocab_size, embed_size):
        super(WordEmbedding, self).__init__()
        self.vocab_size = vocab_size
        self.embed_size = embed_size
        self.embedding = nn.Embedding(self.vocab_size, self.embed_size, sparse=True)
        # fc1 assumes a fixed flattened input of embed_size * 4767
        self.fc1 = nn.Linear(self.embed_size * 4767, 256)
        self.dropout = nn.Dropout(p=0.3)
        self.fc2 = nn.Linear(256, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # flatten (batch, seq_len, embed_size) into (batch, seq_len * embed_size)
        x = self.embedding(x).view((x.size(0), -1))
        out = self.fc1(x)
        out = self.fc2(self.dropout(out))
        out = self.sigmoid(out)
        return out
Since each batch has a different sequence length, the flattened embedding size doesn't match the fixed input size of fc1 (embed_size * 4767), so I get a matrix shape mismatch error.
Is there any way I can train the model on batches of different input lengths while keeping roughly this architecture?
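One common way to make the architecture length-independent is to pool the embeddings over the sequence dimension (e.g. a mean) before the linear layers, so fc1 depends only on embed_size rather than on sequence length. A minimal sketch, assuming the vocab_size/embed_size values and random batches below are placeholders (sparse=True is dropped here for simplicity):

```python
import torch
import torch.nn as nn

class PooledWordEmbedding(nn.Module):
    def __init__(self, vocab_size, embed_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # fc1 now depends only on embed_size, not on sequence length
        self.fc1 = nn.Linear(embed_size, 256)
        self.dropout = nn.Dropout(p=0.3)
        self.fc2 = nn.Linear(256, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        emb = self.embedding(x)       # (batch, seq_len, embed_size)
        pooled = emb.mean(dim=1)      # (batch, embed_size) for any seq_len
        out = self.fc1(pooled)
        out = self.fc2(self.dropout(out))
        return self.sigmoid(out)

model = PooledWordEmbedding(vocab_size=5000, embed_size=100)
batch_a = torch.randint(0, 5000, (8, 37))   # one batch with seq_len 37
batch_b = torch.randint(0, 5000, (8, 120))  # another with seq_len 120
out_a, out_b = model(batch_a), model(batch_b)
# both outputs have shape (8, 1) despite different sequence lengths
```

nn.EmbeddingBag with `mode="mean"` does the embedding-plus-pooling in one step and also supports sparse gradients, if that matters for your training setup. Note that if batches contain padding, a plain mean will average the pad embeddings in as well; masking them out is a further refinement.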