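Calling .cuda() on a tensor that requires gradients returns a new, non-leaf tensor, so it won't be optimized: gradients accumulate on the original CPU leaf, not on the GPU copy. @albanD explained it very well in this post.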
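Here is a minimal sketch of the difference, assuming a recent PyTorch version. The names w, w_gpu and w_fixed are made up for illustration, and the GPU part only runs if CUDA is available:

```python
import torch

# A tensor you create yourself with requires_grad=True is a leaf.
w = torch.randn(3, requires_grad=True)
print(w.is_leaf)  # True

if torch.cuda.is_available():
    # .cuda() is recorded by autograd as a copy operation, so the result
    # has a grad_fn and is no longer a leaf.
    w_gpu = w.cuda()
    print(w_gpu.is_leaf)  # False

# One way around it: create the tensor on the target device directly,
# so the tensor you hand to the optimizer is itself the leaf.
device = "cuda" if torch.cuda.is_available() else "cpu"
w_fixed = torch.randn(3, device=device, requires_grad=True)
print(w_fixed.is_leaf)  # True

optimizer = torch.optim.SGD([w_fixed], lr=0.1)
```

For an nn.Module, the usual pattern is to call model.cuda() (or model.to(device)) first and only then build the optimizer from model.parameters().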