I customized the PyTorch word language model example (my code: https://github.com/ttpro1995/custom_word_language_model).
I added a convolution layer before the LSTM (see the class ModelWrapper in https://github.com/ttpro1995/custom_word_language_model/blob/master/model/model_wrapper.py).
In main.py, I freeze the encoder (embedding) and train only the convolution, LSTM, and decoder:
```python
for p in model.conv_module.parameters():
    p.data.add_(-lr, p.grad.data)
for p in model.rnn.parameters():
    p.data.add_(-lr, p.grad.data)
for p in model.decoder.parameters():
    p.data.add_(-lr, p.grad.data)
```
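For context, a minimal self-contained sketch of this freeze-and-update scheme is below. `ToyModel` and its layer sizes are placeholders, not the actual ModelWrapper; note also that the two-argument `p.data.add_(-lr, p.grad.data)` form is deprecated in newer PyTorch in favor of `add_(grad, alpha=-lr)`:

```python
import torch
import torch.nn as nn

# Placeholder model with the same submodule names as the post
# (encoder / conv_module / rnn / decoder); sizes are arbitrary.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Embedding(10, 4)
        self.conv_module = nn.Conv1d(4, 4, 3, padding=1)
        self.rnn = nn.LSTM(4, 4)
        self.decoder = nn.Linear(4, 10)

model = ToyModel()
lr = 20.0

# Freeze the encoder so autograd never computes gradients for it.
for p in model.encoder.parameters():
    p.requires_grad = False

# Manual SGD step over only the trainable submodules, guarding
# against parameters whose .grad is still None (no backward yet).
for module in (model.conv_module, model.rnn, model.decoder):
    for p in module.parameters():
        if p.grad is not None:
            p.data.add_(p.grad.data, alpha=-lr)
```

An alternative with the same effect is to pass only the trainable parameters to `torch.optim.SGD` and call `optimizer.step()`, which avoids the manual loop entirely.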
I got a test perplexity of 3.70, which seems far too small.
Run command:

```shell
python main.py --cuda --emsize 300 --nhid 168 --dropout 0.5 --epochs 40 --noglove
```
Log file: https://gist.github.com/e7644ad05836b6a147cb243e3764ff1f
Please tell me if anything is wrong.