Caching parameters, and randomly using one of them for computing gradients

Here’s a hacky class definition I put together to solve a similar issue: https://gist.github.com/SsnL/f2f56534aefb22d8612dbd7a5da28ed8
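For readers who just want the general idea without opening the link, here is a rough, framework-agnostic sketch of caching parameter snapshots and picking one at random. It is not the code from the gist; `ParameterCache`, `capacity`, `push`, and `sample` are hypothetical names made up for this illustration, and the snapshots are assumed to be plain dict-like objects that `copy.deepcopy` can handle.

```python
import copy
import random


class ParameterCache:
    """Hypothetical helper: keep up to `capacity` snapshots, sample one at random."""

    def __init__(self, capacity, seed=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._snapshots = []
        self._rng = random.Random(seed)  # private RNG so sampling is reproducible

    def push(self, params):
        # Deep-copy so later in-place updates to the live parameters
        # do not change the cached snapshot.
        self._snapshots.append(copy.deepcopy(params))
        if len(self._snapshots) > self.capacity:
            self._snapshots.pop(0)  # evict the oldest snapshot

    def sample(self):
        if not self._snapshots:
            raise RuntimeError("cache is empty")
        # Return a copy so the caller cannot mutate the cached snapshot.
        return copy.deepcopy(self._rng.choice(self._snapshots))

    def __len__(self):
        return len(self._snapshots)
```

In PyTorch this could presumably be wired up as `cache.push(model.state_dict())` after an update and `model.load_state_dict(cache.sample())` before the forward/backward pass, but I haven't checked how that compares to what the gist does.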
