[solved] Why do we need to detach a Variable which contains the hidden representation?

I would say it depends on your use case and maybe your workflow.
E.g. how would you like to expose the functionality of:

  • backpropagating through all seen data (i.e. in PyTorch just don’t detach the hidden state)
  • backpropagating through only the last input batch (i.e. detach the hidden state before feeding the next batch)?
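To make the two cases concrete, here is a minimal sketch, assuming a toy `nn.RNN` and random data. The helper name `grad_reaches_first_batch` and the tensor shapes are made up for illustration and not from any particular codebase. It just checks whether a loss computed on the second batch still sends gradients back into the first batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy recurrent layer; the sizes are arbitrary and chosen only for illustration.
rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)


def grad_reaches_first_batch(detach_hidden):
    """Return True if a loss on the second batch backpropagates into the first."""
    x1 = torch.randn(2, 5, 4, requires_grad=True)  # first batch (batch, seq, feature)
    x2 = torch.randn(2, 5, 4)                      # second batch

    _, hidden = rnn(x1)  # hidden state carried over to the next batch
    if detach_hidden:
        # Cut the graph here: the hidden values are kept, the history is dropped.
        hidden = hidden.detach()

    out, hidden = rnn(x2, hidden)
    out.sum().backward()  # stand-in for a real loss on the second batch

    # x1.grad is only populated if the graph still reaches back to the first batch.
    return x1.grad is not None


full_history = grad_reaches_first_batch(detach_hidden=False)
last_batch_only = grad_reaches_first_batch(detach_hidden=True)
print(full_history, last_batch_only)
```

Without the detach, the graph keeps growing with every batch, so memory use and backward time grow too. With the detach, the hidden values are still carried over, but gradients stop at the batch boundary.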

New proposals for these use cases (and UX) are always welcome. :)