Hey, thank you for updating!
I see what you mean by my example not making sense. I just wanted to showcase the toggling of the requires_grad flag.
I am wondering why .detach()
is supported, but not requires_grad
. To me both fullfill orthogonal ways to influence gradient backpropagation. I guess it is just a practical limitation about how tracing works?
Why do you regard jit.trace
as being dangerous/limiting? I am learning a lot here. Btw if you have some ressource where I can read up on the practical use of JIT I would appreciate you pointing me there!