[YouTube tutorial] Seq-to-seq in PyTorch, char-level, trains in ~60 seconds

(Hugh Perkins) #1

An overview of the seq-to-seq concept, assuming you already know RNNs. I run the char-level training on a few English-French sentences and show that it trains ok-ish, then give a brief overview of the corresponding code. I also go over the challenges I encountered while trying to make a simple RNN model that trains in 30-60 seconds.
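
For readers who want a feel for the idea before opening the linked source, here is a minimal sketch of a char-level encoder-decoder without attention. It is not the code from the video: the sentence pairs, the `Seq2Seq` class, the `SOS`/`EOS` indices, the hidden size, and the use of Adam with teacher forcing are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

# Toy parallel data (hypothetical pairs, not the sentences used in the video).
pairs = [("hello", "bonjour"), ("thanks", "merci")]

# One character vocabulary shared by both languages.
# Index 0 starts decoding (SOS), index 1 marks end of sequence (EOS).
SOS, EOS = 0, 1
chars = sorted({c for src, tgt in pairs for c in src + tgt})
char_to_idx = {c: i + 2 for i, c in enumerate(chars)}
vocab_size = len(char_to_idx) + 2


def encode(text):
    """Turn a string into a 1-D LongTensor of character indices ending in EOS."""
    return torch.tensor([char_to_idx[c] for c in text] + [EOS])


class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, hidden_size=32):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.encoder = nn.RNN(hidden_size, hidden_size)
        self.decoder = nn.RNN(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, src, tgt):
        # Encoder: only its final hidden state is passed on (no attention).
        _, state = self.encoder(self.embedding(src).unsqueeze(1))
        # Teacher forcing: the decoder sees the target shifted right by one.
        dec_in = torch.cat([torch.tensor([SOS]), tgt[:-1]])
        dec_out, _ = self.decoder(self.embedding(dec_in).unsqueeze(1), state)
        return self.out(dec_out.squeeze(1))  # (target_len, vocab_size) logits


torch.manual_seed(0)
model = Seq2Seq(vocab_size)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

losses = []  # summed loss per epoch, to check that training makes progress
for epoch in range(50):
    total = 0.0
    for src, tgt in pairs:
        s, t = encode(src), encode(tgt)
        optimizer.zero_grad()
        loss = loss_fn(model(s, t), t)
        loss.backward()
        optimizer.step()
        total += loss.item()
    losses.append(total)
```

Each example is processed one at a time here (batch size 1), which keeps the shapes easy to follow but is slow on anything larger than a toy dataset.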

Slides: https://docs.google.com/presentation/d/1z9INuS1VX2UigL3WqB60oJCMi_CJuaVRi-Qaun6fPl4/edit?usp=sharing
Source-code: https://github.com/hughperkins/pub-prototyping/blob/df9cf0c9fa473517956c55c33435924a289ddd36/papers/attention/seq2seq_noattention_trainbyparts.py
Experiment log: https://github.com/hughperkins/pub-prototyping/blob/df9cf0c9fa473517956c55c33435924a289ddd36/papers/attention/mt_by_align_translate.log
"Sequence to Sequence Learning with Neural Networks", by Sutskever, Vinyals, and Le, 2014 https://arxiv.org/abs/1409.3215

(Hugh Perkins) #2

A few videos that upgrade this code bit by bit:

_1. Upgrade to use idiomatic PyTorch Modules, rather than inline functions:

_2. Upgrade to use timestep batching in the encoder:

_3. Upgrade to use minibatches in both the encoder and the decoder:
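
As a rough illustration of items 2 and 3, the sketch below encodes the same sequences three ways: one RNN call per timestep, all timesteps of one example in a single call, and several examples in a single call. It assumes that "timestep batching" means handing the whole input sequence to the recurrent module at once, and it uses `pad_sequence` and `pack_padded_sequence` from `torch.nn.utils.rnn` for the minibatch case; the videos may do this differently. The vocabulary size, index values, and function names are made up for the example.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

torch.manual_seed(0)
hidden_size = 16
embedding = nn.Embedding(10, hidden_size)  # hypothetical 10-symbol vocabulary
rnn = nn.RNN(hidden_size, hidden_size)

# Two index sequences of different lengths (made-up character indices).
seqs = [torch.tensor([4, 5, 6, 1]), torch.tensor([7, 8, 1])]


def encode_stepwise(seq):
    """One RNN call per timestep, one example at a time."""
    state = torch.zeros(1, 1, hidden_size)
    for idx in seq:
        step = embedding(idx.view(1, 1))  # (1 timestep, batch 1, hidden)
        _, state = rnn(step, state)
    return state


def encode_all_timesteps(seq):
    """All timesteps of one example in a single RNN call."""
    _, state = rnn(embedding(seq).unsqueeze(1))  # (timesteps, batch 1, hidden)
    return state


def encode_minibatch(seqs):
    """All timesteps of several examples in a single RNN call."""
    lengths = [len(s) for s in seqs]  # must be sorted longest-first here
    padded = pad_sequence(seqs)       # (max_len, batch), padded with zeros
    # Packing makes the RNN stop at each sequence's true length, so the
    # returned state is not affected by the padding positions.
    packed = pack_padded_sequence(embedding(padded), lengths)
    _, state = rnn(packed)            # state: (1, batch, hidden)
    return state


stepwise = torch.cat([encode_stepwise(s) for s in seqs], dim=1)
timestep_batched = torch.cat([encode_all_timesteps(s) for s in seqs], dim=1)
minibatched = encode_minibatch(seqs)
```

All three give the same final encoder state up to floating-point error; the batched versions simply make fewer, larger calls. The decoder can be batched over timesteps in the same way during training if it is fed the shifted target sequence (teacher forcing), since its inputs are then known in advance.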

Links to the relevant source code are included in the YouTube descriptions, but they’re basically different commits of: https://github.com/hughperkins/pub-prototyping/blob/master/papers/attention/seq2seq_noattention_trainbyparts.py