Fastfood random projection

I’m trying to implement a random projection using the Fastfood algorithm (Le, Quoc, Tamás Sarlós, and Alex Smola. “Fastfood-approximating kernel expansions in loglinear time.”).

Essentially, what I am trying to do is implicitly multiply a vector v by a random square Gaussian matrix M, whose side is equal to a power of two. The matrix is factorized into multiple matrices: M = HGΠHB, where B is a random diagonal matrix with entries ±1 with equal probability, H is a Hadamard matrix, Π is a random permutation matrix, and G is a random diagonal matrix with independent standard normal entries. Multiplication by a Hadamard matrix can be done via the Fast Walsh-Hadamard Transform O(d log d) time and takes no additional space. The other matrices have linear time and space complexities.

Crucial to my code is that the gradients propagate through all this sequence of operations. Does somebody have an idea on how to implement this? I tried to look around, but could not find any Pytorch implementation of Fastfood neither of Fast Walsh-Hadamard Transform.