I’ve created the RNN class as follows:
import torch
import torch.nn as nn
from torch.autograd import Variable


class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size, hidden_size)   # input -> hidden
        self.h2o = nn.Linear(hidden_size, output_size)  # hidden -> output
        self.h2h = nn.Linear(hidden_size, hidden_size)  # hidden -> hidden
        self.Relu = nn.ReLU()
        self.softmax = nn.Tanh()  # named softmax, but this is a Tanh layer

    def forward(self, input, hidden):
        # New hidden state from the previous hidden state and the current input
        h = self.Relu(self.h2h(hidden) + self.i2h(input))
        o = self.softmax(self.h2o(h))
        return o, h

    def init_hidden(self):
        # Initial hidden state: zeros of shape (1, hidden_size)
        return Variable(torch.zeros(1, self.hidden_size), requires_grad=True)
Then I create the network as follows:
rnn = RNN(n_chars, 90, n_chars)
criterion = nn.MSELoss()
learning_rate = 0.05
optimizer = torch.optim.Adam(rnn.parameters(), lr = learning_rate)
hidden = rnn.init_hidden()
epochs = 5
At this point, the value of rnn.i2h.weight.grad is None.
But when I train the network, after 2-3 iterations all the values of rnn.i2h.weight.grad
become NaN, which makes training the network impossible.
Why is this happening?
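To narrow this down, a check along the following lines could report which parameter's gradient first becomes non-finite. This is only a sketch: the helper name first_nonfinite_grad and the small nn.Linear stand-in are hypothetical and are not part of the code above; the actual training loop is not shown in this post, so it is not reproduced here.

```python
import torch
import torch.nn as nn


def first_nonfinite_grad(model):
    """Return the name of the first parameter whose gradient holds NaN or inf, else None."""
    for name, param in model.named_parameters():
        # .grad stays None until backward() has been called at least once
        if param.grad is not None and not torch.isfinite(param.grad).all():
            return name
    return None


# Hypothetical stand-in model, used only to exercise the helper.
toy = nn.Linear(4, 3)

# Before any backward pass every .grad is None, so nothing is reported.
before_backward = first_nonfinite_grad(toy)

# A finite loss on finite inputs gives finite gradients.
toy(torch.ones(1, 4)).sum().backward()
after_finite_loss = first_nonfinite_grad(toy)

# A NaN in the input propagates into the weight gradient.
toy.zero_grad()
bad_input = torch.full((1, 4), float("nan"))
toy(bad_input).sum().backward()
after_nan_input = first_nonfinite_grad(toy)
```

In a training loop, the helper would be called on rnn right after loss.backward() and before optimizer.step(), so that the first bad iteration can be inspected. If the gradients turn out to grow very large before becoming NaN, torch.nn.utils.clip_grad_norm_ is one commonly used mitigation, though whether it applies here depends on the part of the loop not shown.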