Two bias vectors in recurrent units?

Hi,

I just noticed that the recurrent layers, e.g. the GRU (http://pytorch.org/docs/master/nn.html#gru), have two bias vectors. Is there a particular reason for this?
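
To make the question concrete, here is a minimal sketch that lists the parameters of a single-layer GRU. The sizes 4 and 3 are arbitrary values picked only for illustration:

```python
import torch.nn as nn

# Arbitrary sizes, chosen only for illustration.
gru = nn.GRU(input_size=4, hidden_size=3, num_layers=1)

# Collect the name and shape of every learnable parameter.
param_shapes = {name: tuple(p.shape) for name, p in gru.named_parameters()}
for name, shape in param_shapes.items():
    print(name, shape)
```

Besides the two weight matrices, this lists both bias_ih_l0 and bias_hh_l0, each of length 3 * hidden_size (one slice per gate).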

Hello,

I think this is to match cuDNN:

Best regards

Thomas

I see. That’s interesting. I don’t understand why cuDNN designed it like this, though.
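
One observation, offered as a rough sketch and not as an explanation of cuDNN's reasoning: going by the gate equations in the PyTorch GRU documentation, the two biases are simply added together in the reset and update gates, but in the candidate gate the hidden-side bias sits inside the product with the reset gate. The scalar cell below is a hypothetical toy written for this post (gru_step, the weights, and the bias values are all made up), not PyTorch's implementation:

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h, w, b_ih, b_hh):
    """One scalar GRU step following the gate equations in the PyTorch docs.

    w[g] is a pair (input weight, hidden weight) for gate g in 'r', 'z', 'n'.
    """
    r = sigmoid(w["r"][0] * x + b_ih["r"] + w["r"][1] * h + b_hh["r"])
    z = sigmoid(w["z"][0] * x + b_ih["z"] + w["z"][1] * h + b_hh["z"])
    # The hidden-side bias b_hh["n"] is scaled by the reset gate r.
    n = math.tanh(w["n"][0] * x + b_ih["n"] + r * (w["n"][1] * h + b_hh["n"]))
    return (1.0 - z) * n + z * h


# Made-up values, for illustration only.
w = {"r": (0.5, -0.3), "z": (0.2, 0.4), "n": (-0.6, 0.7)}
b_ih = {"r": 0.1, "z": -0.2, "n": 0.3}
b_hh = {"r": 0.4, "z": 0.05, "n": -0.1}
x, h = 1.0, 0.5

base = gru_step(x, h, w, b_ih, b_hh)

# Move 0.25 from the hidden-side bias to the input-side bias in r and z:
# the per-gate sums are unchanged, so the output should be too.
moved_rz = gru_step(
    x, h, w,
    {"r": 0.35, "z": 0.05, "n": 0.3},
    {"r": 0.15, "z": -0.2, "n": -0.1},
)

# Fold the hidden-side bias of the candidate gate into the input side:
# the sum is unchanged, but the bias is no longer scaled by r.
moved_n = gru_step(
    x, h, w,
    {"r": 0.1, "z": -0.2, "n": 0.2},
    {"r": 0.4, "z": 0.05, "n": 0.0},
)

print("r/z biases merged, same output:", math.isclose(base, moved_rz, abs_tol=1e-12))
print("n biases merged, same output:  ", math.isclose(base, moved_n, abs_tol=1e-12))
```

So for the reset and update gates only the sum of the two biases matters, while for the candidate gate merging them changes the result unless the reset gate is exactly 1.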