I’ve been trying to make an RNN that predicts the next character in a string. (I’m trying to do it without the built-in models because I thought I would learn more that way.)
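To make the task concrete, here is a toy illustration of what I mean by next-character prediction (this is just the idea, not my real data pipeline): for each character in a training string, the target is simply the character that follows it.

```python
# Toy illustration of the next-character task: each input character's
# training target is the character that follows it in the string.
text = "ab1cd"
pairs = [(text[i], text[i + 1]) for i in range(len(text) - 1)]
print(pairs)  # [('a', 'b'), ('b', '1'), ('1', 'c'), ('c', 'd')]
```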
Unfortunately, the model only seems to learn the rough distribution of letters; it hasn’t even realised that you shouldn’t put numbers right next to letters. It is as if it can’t see the input or the hidden layers. Given that it has ~400k trainable parameters, I think it should be doing better.
Things I’ve checked:
- The input is getting through (verified by uncommenting the print statement in the code below)
- The hidden states are being passed on between steps
There may also be a problem with my understanding of Python, as I am not very experienced with this programming language.
Here’s the model definition; the other files are in this zip because I didn’t want to make this post overly long. I’ve replaced the loading of training data with a couple of hard-coded strings, because the dataset is quite large.
```python
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.in_to_h = nn.Linear(input_size + hidden_size, hidden_size).to(device)
        self.h1_to_h2 = nn.Linear(hidden_size + hidden_size, hidden_size).to(device)
        self.h2_to_h3 = nn.Linear(hidden_size + hidden_size, hidden_size).to(device)
        self.h_to_out = nn.Linear(hidden_size, output_size).to(device)
        self.softmax = nn.LogSoftmax(dim=1).to(device)
        self.act = nn.Tanh()

    def forward(self, input, hidden):
        # Print input
        #print(all_letters[torch.argmax(input, 1).cpu().numpy()])
        hidden = self.act(self.in_to_h(torch.cat((input, hidden), 1)))
        hidden = self.act(self.h1_to_h2(torch.cat((hidden, hidden), 1)))
        hidden = self.act(self.h2_to_h3(torch.cat((hidden, hidden), 1)))
        output = self.h_to_out(hidden)
        output = self.softmax(output)
        return output, hidden

    def initHidden(self):
        hidden = []
        for i in range(3):
            hidden.append(Variable(torch.zeros(1, self.hidden_size, device=device)))
        return hidden
```
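For context, the inputs to `forward` are one-hot character vectors of shape `(1, n_letters)`. `all_letters` is built in one of the files in the zip; the sketch below is an approximation of my encoding, not the exact vocabulary.

```python
import string

import torch

# Approximation of my encoding setup; the real all_letters is built in
# another file, so treat this vocabulary as a stand-in.
all_letters = string.ascii_letters + string.digits + " .,;'"
n_letters = len(all_letters)

def char_to_tensor(ch):
    # One-hot row vector of shape (1, n_letters), matching the
    # (batch=1, input_size) layout that forward() expects.
    t = torch.zeros(1, n_letters)
    t[0, all_letters.index(ch)] = 1.0
    return t

x = char_to_tensor("a")
print(x.shape)  # torch.Size([1, n_letters])
```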