Gradients through sampling operation?

(Dutta Abhigyan) #1

I would like to know whether gradients are propagated through a sampling layer. For example, say I have defined my own custom Dropout layer in which I can set the dropout probability for each node in a layer. The sampling operation is done by an external function. Will PyTorch raise an error, or is everything OK?

NOTE: I have defined the dropout probabilities as a function of the weights from the previous layer.

(Arul) #2

As long as you pass the gradients back correctly to the appropriate connections in the custom layer (in your case, the new dropout layer), PyTorch will have no issues, AFAIK.
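To make that concrete, here is a minimal sketch, assuming the per-node probabilities are fixed (not trainable) and the mask is sampled externally with NumPy. The class name `PerNodeDropout`, the `keep_prob` argument, and the inverted-dropout scaling are my own assumptions, not anything from the original post. The mask enters the graph as a constant, so autograd needs no special handling to send gradients back to the input.

```python
import numpy as np
import torch


class PerNodeDropout(torch.nn.Module):
    """Hypothetical dropout layer with one keep probability per node."""

    def __init__(self, keep_prob):
        super().__init__()
        # Fixed, non-trainable keep probabilities, shape (num_nodes,).
        # Values are assumed to lie in (0, 1] so the division below is safe.
        self.register_buffer(
            "keep_prob", torch.as_tensor(keep_prob, dtype=torch.float32)
        )

    def forward(self, x):
        if not self.training:
            return x
        # External sampling step: done in NumPy, invisible to autograd.
        mask_np = np.random.binomial(
            1, self.keep_prob.cpu().numpy(), size=tuple(x.shape)
        )
        mask = torch.from_numpy(mask_np).to(dtype=x.dtype, device=x.device)
        # The mask is a constant; inverted-dropout scaling is an assumption.
        return x * mask / self.keep_prob


x = torch.randn(4, 5, requires_grad=True)
drop = PerNodeDropout([0.9, 0.8, 0.7, 0.6, 0.5])
out = drop(x)
out.sum().backward()
print(x.grad)  # mask / keep_prob: zero where a node was dropped
```

Note that nothing here sends a gradient to `keep_prob` itself. The sketch only shows that an externally sampled mask does not break backpropagation to the layer input.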

(Dutta Abhigyan) #3

Hello! Is there any way to check whether gradients are being passed or not? Also, what I meant by gradients passing through a sampling layer is that my dropout layer has a different probability for each node, and the probabilities are trainable parameters. See “Adaptive Dropout” by Ba et al. for details, but in summary, we use special trainable weights and the inputs from the previous layer to create the dropout probabilities for the next layer. Of course, this involves a sampling operation.

Now, I want to know whether the gradients flow through this dropout mask and train the weights with which we create the dropout probabilities.
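One way to check this is to run a single backward pass and inspect `.grad` on each parameter. The sketch below is only an illustration: `gate` is a hypothetical stand-in for the special weights that produce the dropout probabilities, and `has_gradient` is a helper I made up. It assumes the mask is drawn with `torch.bernoulli`, which as far as I know delivers no useful gradient to its input probabilities.

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 3)
layer = torch.nn.Linear(3, 4)  # ordinary layer whose outputs get masked
gate = torch.nn.Linear(3, 4)   # hypothetical weights producing the probabilities

probs = torch.sigmoid(gate(x))
mask = torch.bernoulli(probs)  # hard sampling step
out = layer(x) * mask
out.sum().backward()


def has_gradient(param):
    """True if backward() delivered a non-zero gradient to param."""
    return param.grad is not None and bool(param.grad.abs().sum() > 0)


print("layer.weight receives gradient:", has_gradient(layer.weight))
print("gate.weight receives gradient:", has_gradient(gate.weight))
```

If the second check reports no gradient, the weights behind the probabilities are not being trained by plain backpropagation through the sample, and some estimator has to be supplied by hand.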

NOTE: I do not really know whether what I am describing is mathematically possible, but what I want is this: when a 1 is sampled, I would like to backpropagate through the special weights of this dropout layer. Something similar to ReLU: if positive, backpropagate; otherwise the gradient is 0.
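The ReLU-like rule described above can be sketched with a custom `torch.autograd.Function`. This is a heuristic of the straight-through kind, so the gradient it produces is biased, and I am not claiming it is what the Adaptive Dropout paper does. The names `SampleMaskST` and `gate` are hypothetical.

```python
import torch


class SampleMaskST(torch.autograd.Function):
    """Sample a 0/1 mask; pass gradients back only where a 1 was sampled."""

    @staticmethod
    def forward(ctx, probs):
        mask = torch.bernoulli(probs)
        ctx.save_for_backward(mask)
        return mask

    @staticmethod
    def backward(ctx, grad_output):
        (mask,) = ctx.saved_tensors
        # Treat the sample as identity where mask == 1, block it elsewhere.
        return grad_output * mask


torch.manual_seed(0)
x = torch.randn(4, 3)
layer = torch.nn.Linear(3, 5)
gate = torch.nn.Linear(3, 5)  # hypothetical weights producing the probabilities

probs = torch.sigmoid(gate(x))
probs.retain_grad()           # keep the gradient on this non-leaf tensor
mask = SampleMaskST.apply(probs)
out = layer(x) * mask
out.sum().backward()

print(probs.grad)             # zero wherever mask == 0
print(gate.weight.grad)
```

With this in place the `gate` weights receive a gradient, but only from the units that were kept. Whether that signal actually trains useful probabilities is a separate question.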