I have been trying to implement batch renormalization. Implementing it in Python as a custom Module results in much higher memory usage and running time, so I created a BatchReNorm function in CUDA, following the same pattern as BatchNorm in THCUNN.
I am able to build it successfully and get the updates and the clipping working, but I am not sure how to handle the stop_gradient mentioned in the batch renorm paper.
I create the parameters r and d in the BatchReNormalizationUpdateOutput_kernel as below:
```cuda
Acctype r = 0; // multiplicative correction: r = sigma_B / sigma
Acctype d = 0; // additive correction:       d = (mu_B - mu) / sigma
```
I then clamp the values to the ranges from the paper, r to [1/r_max, r_max] and d to [-d_max, d_max].
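Roughly, the computation looks like the following sketch (illustrative only, not my actual kernel code; the names batchMean, batchStd, movingMean, movingStd, rMax, and dMax are placeholders, not the real THCUNN buffers):

```cuda
// Simple clamp helper for the accumulator type.
template <typename T>
__device__ T clampVal(T v, T lo, T hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

// Compute the batch renorm corrections r and d from the batch and
// moving statistics, then clip them as described in the paper.
template <typename Acctype>
__device__ void computeRAndD(Acctype batchMean, Acctype batchStd,
                             Acctype movingMean, Acctype movingStd,
                             Acctype rMax, Acctype dMax,
                             Acctype& r, Acctype& d) {
  // r = sigma_B / sigma, clipped to [1/r_max, r_max]
  r = clampVal(batchStd / movingStd, Acctype(1) / rMax, rMax);
  // d = (mu_B - mu) / sigma, clipped to [-d_max, d_max]
  d = clampVal((batchMean - movingMean) / movingStd, -dMax, dMax);
}
```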
I do not pass r and d to the backward kernel.
Does that ensure that the gradients don’t flow through them?
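My reading of the paper is that stop_gradient simply means r and d are treated as constants during backpropagation, so with $\hat{x}_i = \frac{x_i - \mu_B}{\sigma_B}\, r + d$ and minibatch size $m$, the backward pass in the paper is:

$$
\frac{\partial L}{\partial \hat{x}_i} = \frac{\partial L}{\partial y_i}\cdot\gamma,
\qquad
\frac{\partial L}{\partial \sigma_B} = \sum_{i=1}^{m}\frac{\partial L}{\partial \hat{x}_i}\cdot(x_i-\mu_B)\cdot\frac{-r}{\sigma_B^{2}},
\qquad
\frac{\partial L}{\partial \mu_B} = \sum_{i=1}^{m}\frac{\partial L}{\partial \hat{x}_i}\cdot\frac{-r}{\sigma_B}
$$

$$
\frac{\partial L}{\partial x_i} = \frac{\partial L}{\partial \hat{x}_i}\cdot\frac{r}{\sigma_B}
+ \frac{\partial L}{\partial \sigma_B}\cdot\frac{x_i-\mu_B}{m\,\sigma_B}
+ \frac{\partial L}{\partial \mu_B}\cdot\frac{1}{m}
$$

Since r and d appear only as constants in these formulas, there are no gradient terms for them, so my assumption is that never differentiating through them in the backward kernel is equivalent to stop_gradient, but I would like to confirm that.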
Any insight would be really helpful.