for example:
This behavior would yield to wrong gradients and will thus raise an error.
What’s your use case that you want to recalculate the gradients using stale activation values?
for example:
This behavior would yield to wrong gradients and will thus raise an error.
What’s your use case that you want to recalculate the gradients using stale activation values?