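Hey! Following this post, you need to detach your tensor once it is on the GPU so that it becomes a leaf node in the compute graph. By default, gradients are only stored (in the .grad attribute) for leaf nodes, so a tensor produced by an earlier operation will not have its gradient populated after backward().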
Here is the gist of the modified code. Hope it helps!
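For reference, a minimal sketch of the idea, assuming this is PyTorch. The tensor names (x, y, z) and the toy loss are made up for illustration and are not from your code, and it falls back to the CPU if no GPU is available:

```python
import torch

# Assumption: use the GPU if there is one, otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# x is a leaf: it was created directly by the user.
x = torch.ones(3, requires_grad=True)

# y is the result of operations on x, so it is NOT a leaf.
y = (x * 2).to(device)

# Detaching y cuts it out of the old graph; turning requires_grad back on
# makes the detached tensor a new leaf that lives on `device`.
z = y.detach().requires_grad_(True)

# Toy loss, purely illustrative.
loss = (z ** 2).sum()
loss.backward()

print(y.is_leaf, z.is_leaf)  # non-leaf vs. leaf
print(z.grad)                # gradient is stored on the new leaf
print(x.grad)                # None: detach() stopped gradients flowing back to x
```

One thing to keep in mind: detach() cuts the graph, so nothing upstream of the detached tensor (x here) receives a gradient. If you need the gradient of an intermediate tensor without cutting the graph, calling retain_grad() on it before backward() is the other option.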