I get "Scatter is not differentiable twice" when trying to backward the gradient penalty suggested in the improved WGAN (WGAN-GP) paper. I’m running the latest version of PyTorch and the models are wrapped in DataParallel!
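Here is a minimal sketch of the failing setup (the tiny conv net, the batch shapes, and the penalty weight 10 are placeholders, not my real model):

```python
import torch
import torch.nn as nn
from torch import autograd
from torch.autograd import Variable

# Placeholder discriminator wrapped in DataParallel, plus dummy batches.
netD = nn.DataParallel(nn.Sequential(nn.Conv2d(3, 1, 3, padding=1))).cuda()
real = torch.randn(8, 3, 32, 32).cuda()
fake = torch.randn(8, 3, 32, 32).cuda()

alpha = torch.rand(8, 1, 1, 1).cuda()
interpolates = Variable(alpha * real + (1 - alpha) * fake, requires_grad=True)
d_out = netD(interpolates).view(8, -1).mean(1)  # one score per sample
grads = autograd.grad(outputs=d_out, inputs=interpolates,
                      grad_outputs=torch.ones(d_out.size()).cuda(),
                      create_graph=True, retain_graph=True, only_inputs=True)[0]
gradient_penalty = ((grads.view(8, -1).norm(2, dim=1) - 1) ** 2).mean() * 10
gradient_penalty.backward()  # second differentiation: this is where the Scatter error surfaces for me
```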
Certainly! The discriminator takes two inputs and concatenates them along the channel dimension. It is a sequence of convolutions with a linear layer at the end.
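Roughly, it looks like this (a sketch; the channel counts, kernel sizes, and the 64x64 input resolution are illustrative):

```python
import torch
import torch.nn as nn

class D(nn.Module):
    def __init__(self):
        super(D, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 64, 4, 2, 1),    # 3 + 3 channels after the concat
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1),
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.classifier = nn.Linear(128 * 16 * 16, 1)

    def forward(self, a, b):
        x = torch.cat([a, b], 1)          # concatenate along the channel dimension
        x = self.features(x)
        return self.classifier(x.view(x.size(0), -1))
```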
There has been a similar question here:
I’m also running my models in DataParallel and can post the code if need be.
master is the branch under active development, so it is newer than 0.2.0. The patch landed after the 0.2.0 release, so right now it is only in master. Sorry that you have to build from source to get the fix.
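If you are unsure which build you are running, you can check like this:

```python
import torch
print(torch.__version__)  # a source build of master reports a different string than the 0.2.0 release
```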
I have a very similar problem. In my case I use an RNN, and I get:
"CudnnRNNLegacyBackward is not differentiable twice"
I would be willing to modify things myself to make it differentiable the way I want (if needed). How can I overcome this?
Any help would be awesome. I’m on GPU with CUDA 8 and did not compile from source. (I do not use nn.parallel.)
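Here is roughly what triggers it for me (a minimal sketch; the sizes and the LSTM choice are stand-ins for my actual model):

```python
import torch
import torch.nn as nn
from torch import autograd
from torch.autograd import Variable

rnn = nn.LSTM(input_size=10, hidden_size=20).cuda()        # cuDNN path on GPU
x = Variable(torch.randn(5, 3, 10).cuda(), requires_grad=True)
out, _ = rnn(x)
grads = autograd.grad(out.sum(), x, create_graph=True)[0]  # first derivative, graph kept
grads.sum().backward()  # second differentiation: raises the "is not differentiable twice" error for me
```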
I appreciate any help!
Never mind; on second thought, I realized the non-legacy RNN modules don’t have double backward either. By the way, I wasn’t talking about compiling from source: the non-legacy RNN modules should already be in your release, assuming it is not too old.
After reading a little and trying to run it without CUDA, it does seem to backprop twice on the CPU. Since the behavior differs between CPU and GPU, I suspect there is a bug, so I reported it on the PyTorch GitHub issues. I hope they will be able to address it.
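For reference, this is the check I did (same sketch as above, just kept on the CPU):

```python
import torch
import torch.nn as nn
from torch import autograd
from torch.autograd import Variable

rnn = nn.LSTM(input_size=10, hidden_size=20)               # CPU, no cuDNN
x = Variable(torch.randn(5, 3, 10), requires_grad=True)
out, _ = rnn(x)
grads = autograd.grad(out.sum(), x, create_graph=True)[0]
grads.sum().backward()  # succeeds on CPU, which is why the GPU failure looks like a bug to me
```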