## Problem

I am trying to reproduce the results shown in this tutorial. However, when I use `torch.nn.RNN` (the tutorial's author used an RNN he wrote himself), **the score and loss are both constantly 0**.

## Goal

Classify names by their origin (language) using an RNN.

## Data Preprocessing

I have a set of text files, each containing many names that belong to the same language (label). I one-hot encoded all the names to form a dataset in which each entry looks like `((L, D), 1)`, where `L` is the number of characters in the name, `D` is the dimension of the one-hot representation, and `1` stands for the class label.

In my case, `D` is 57 and there are 18 classes. So for a name like “Mona”, the corresponding entry has shape `((4, 57), 1)`.
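A minimal sketch of the encoding described above. The alphabet is an assumption (ASCII letters plus five punctuation characters, chosen because it gives `D = 57`), and `name_to_one_hot` is a hypothetical helper name, not something from my actual code:

```python
import string

import torch
import torch.nn.functional as F

# Assumed alphabet: 52 ASCII letters + 5 punctuation characters -> D = 57.
ALL_LETTERS = string.ascii_letters + " .,;'"
D = len(ALL_LETTERS)


def name_to_one_hot(name):
    """Return a float tensor of shape (L, D) with exactly one 1.0 per row."""
    indices = torch.tensor([ALL_LETTERS.index(ch) for ch in name])
    return F.one_hot(indices, num_classes=D).float()


x = name_to_one_hot("Mona")
entry = (x, 1)  # (one-hot matrix, class label); the label 1 is a placeholder
print(x.shape)
```

Each row of `x` corresponds to one character, so a four-letter name yields four rows of width `D`.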

## Model and Training Loop

```
import torch
import torch.nn as nn
import torch.nn.functional as F


class RNNNameClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, **kwargs):
        super(RNNNameClassifier, self).__init__()
        self.hidden_size = hidden_size
        self.hidden = self.init_hidden()
        self.rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size, **kwargs)
        self.classifier = nn.Linear(hidden_size, output_size)

    def init_hidden(self):
        # Initial hidden state: (num_layers=1, batch=1, hidden_size)
        return torch.zeros(1, 1, self.hidden_size)

    def forward(self, embedding):
        # embedding: (L, 1, input_size); RNN output: (L, 1, hidden_size)
        output, self.hidden = self.rnn(embedding, self.hidden)
        output = self.classifier(output)  # (L, 1, output_size)
        output = F.log_softmax(output, dim=1)
        return output
```

```
import torch.optim as optim

model = RNNNameClassifier(input_size=EMBEDDING_DIM, hidden_size=HIDDEN_DIM, output_size=OUTPUT_DIM)
optimizer = optim.Adam(model.parameters(), lr=LR)
criterion = nn.NLLLoss()
```

```
loss_list = list()
for epoch in range(MAX_EPOCH):
    print("epoch: %d / %d" % (epoch + 1, MAX_EPOCH))
    for i, (X_train, y_train) in enumerate(train_dataset):
        X_train = torch.tensor(X_train, dtype=torch.float32)
        y_train = torch.tensor(y_train, dtype=torch.int64)
        optimizer.zero_grad()
        model.hidden = model.init_hidden()
        score = model(X_train.view(X_train.size(0), 1, -1))
        loss = criterion(input=score[-1], target=torch.tensor([y_train]))
        loss.backward()
        optimizer.step()
        loss_list.append(loss.detach().cpu().numpy())
        if (i + 1) % PRINT_FREQ == 0:
            print("\tloss: %.5f" % loss_list[-1])
```
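A quick shape check that may help narrow this down. It is only a sketch: the hidden size `H = 16` and the random one-hot input are illustrative assumptions, not values from my run. It compares `log_softmax` taken over `dim=1` (which, for a tensor of shape `(L, 1, output_size)`, is the batch axis of size 1) with `log_softmax` taken over the last axis (the class scores):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

L, D, H, C = 4, 57, 16, 18  # H is an arbitrary illustrative hidden size

rnn = nn.RNN(input_size=D, hidden_size=H)
classifier = nn.Linear(H, C)

# Random one-hot stand-in for a 4-character name: (L, batch=1, D)
x = F.one_hot(torch.randint(0, D, (L,)), num_classes=D).float().unsqueeze(1)

output, hidden = rnn(x, torch.zeros(1, 1, H))
logits = classifier(output)  # (L, 1, C)

# Normalising over an axis of size 1 leaves a single probability of 1,
# so every log-probability along that axis is 0.
over_batch = F.log_softmax(logits, dim=1)

# Normalising over the class axis gives a distribution over the C classes.
over_classes = F.log_softmax(logits, dim=-1)

print(over_batch.abs().max())         # largest magnitude in the dim=1 result
print(over_classes[-1].exp().sum())   # probabilities at the last time step
```

If the `dim=1` result is all zeros, `nn.NLLLoss` on it would also be zero regardless of the target, which would be consistent with the symptom described above.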

## Question

I am not sure whether the issue arises from an error in my implementation or from something else. Specifically, one potential problem might be that I **should not** use one-hot encoding; maybe some fine-tuning of word2vec is required instead.

Do I understand it correctly? Thank you in advance!