Network custom connections (paired connections) - performance issue

Nope. I tried both approaches: the zeroing-out mask and a list/loop of small fully connected layers. The former did not fit on my GPU (I could not make it work with sparse matrices because there was not enough documentation on them), while the latter was terribly slow.
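For anyone landing here, this is a minimal sketch of the two approaches, plus a batched variant that might avoid both the Python loop and the dense masked matrix. It assumes the custom connectivity is block-diagonal (disjoint groups of inputs, each feeding its own small layer), which may not match every paired-connection layout. The sizes and the names `loop_forward`, `batched_forward` and `masked_dense_forward` are made up for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes: 4 groups, each mapping 3 inputs to 2 outputs.
groups, d_in, d_out, batch = 4, 3, 2, 5

# One small weight matrix and one bias vector per group.
weight = torch.randn(groups, d_in, d_out)
bias = torch.randn(groups, d_out)
x = torch.randn(batch, groups * d_in)


def loop_forward(x):
    # List/loop of small fully connected layers: one matmul per group.
    chunks = x.split(d_in, dim=1)
    outs = [chunks[g] @ weight[g] + bias[g] for g in range(groups)]
    return torch.cat(outs, dim=1)


def batched_forward(x):
    # Same computation as a single batched contraction over all groups.
    xg = x.view(-1, groups, d_in)                    # (batch, groups, d_in)
    out = torch.einsum("bgi,gio->bgo", xg, weight)   # (batch, groups, d_out)
    return (out + bias).reshape(-1, groups * d_out)


def masked_dense_forward(x):
    # Zeroing-out mask: a dense weight that is zero outside the diagonal blocks.
    dense = torch.block_diag(*weight)                # (groups*d_in, groups*d_out)
    return x @ dense + bias.reshape(-1)


# All three variants should agree up to floating-point error.
y_loop = loop_forward(x)
y_batched = batched_forward(x)
y_masked = masked_dense_forward(x)
```

The batched variant stores only `groups * d_in * d_out` weights, whereas the masked dense matrix stores `groups**2 * d_in * d_out`, which is probably why the mask version runs out of GPU memory first.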
I created another topic asking for help with the implementation, "Gradient masking in register_backward_hook for custom connectivity - efficient implementation", but I did not get any help.
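For reference, here is a rough sketch of what gradient masking can look like. Note that it uses a tensor-level hook (`Tensor.register_hook` on the weight) rather than the module-level `register_backward_hook` named in the topic title, and the 4x4 `mask` is a hypothetical connectivity pattern, not my actual one.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer = nn.Linear(4, 4, bias=False)

# Hypothetical connectivity, shape (out_features, in_features):
# 1 where a connection is allowed, 0 where it is cut.
mask = torch.tensor([[1., 1., 0., 0.],
                     [1., 1., 0., 0.],
                     [0., 0., 1., 1.],
                     [0., 0., 1., 1.]])

# Zero the forbidden weights once, before training.
with torch.no_grad():
    layer.weight.mul_(mask)

# Mask the weight gradient on every backward pass so that
# the forbidden weights receive no update.
handle = layer.weight.register_hook(lambda grad: grad * mask)

# One plain SGD step (no momentum, no weight decay).
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = layer(torch.randn(8, 4)).pow(2).sum()
loss.backward()
opt.step()
```

This keeps the cut connections at zero, but it still stores and multiplies the full dense weight, so it does not help with the memory problem described above.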