I need some help catching up on these… I don’t recall seeing these params in the original paper, but apparently they’re in the PyTorch code. Could someone enlighten me on where they originated and/or why we need them? Thank you.