Char-level RNN sampling

neuralpat · January 16, 2021, 9:05am

Hi,

I built a character-level RNN that generates headlines. It has clearly learned something, since sampling from it gives me mostly correct words and headlines.
My problem, though, is that it mostly reproduces headlines from the training data.
I’m using np.random.choice() to sample the next character from the probability distribution the network outputs.

Is this kind of network ever going to output truly novel headlines?
What could I do differently?

Here’s the network:

import torch.nn as nn


class HeadlineModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        # batch_first=True: inputs are shaped (batch, seq_len, input_size)
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        # Maps each LSTM output step to one score per character
        self.fc1 = nn.Linear(hidden_size, output_size)

    def forward(self, data, hidden, cell):
        x, (h, c) = self.lstm(data, (hidden, cell))
        x = self.fc1(x)
        # Raw logits are returned; the softmax is left commented out
        y_hat = x  # torch.softmax(x, dim=2)

        return y_hat, h, c
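
A minimal sketch of the sampling step described above, written in plain NumPy. It assumes the model's last-step output has been converted to a 1-D array of raw logits (since the softmax in forward is commented out). The names softmax and sample_next_char, and the temperature parameter, are hypothetical and not part of the original code. With temperature=1.0 this is ordinary sampling from the network's distribution. Values above 1.0 flatten the distribution, which may be one knob to try if the samples stay too close to the training data.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities (temperature is an assumed extra knob)."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()


def sample_next_char(logits, temperature=1.0):
    """Sample the index of the next character from the output distribution."""
    probs = softmax(logits, temperature)
    # Same call as in the post: draw one index according to probs
    return int(np.random.choice(len(probs), p=probs))
```

In a generation loop, the returned index would be mapped back to a character and fed in as the next input, together with the updated hidden and cell states.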