I understand the logic, but I wonder whether this is the best option. My goal is to debug and visualize the computations (e.g. attention units).
My model contains a large, variable number of modules, so the suggested solution won't be very elegant for my needs. Maybe a different solution exists, such as a wrapper module that saves the computations, or some kind of hack?
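To make the idea concrete, here is a rough sketch of the kind of thing I have in mind. It assumes the model is a PyTorch `nn.Module` and uses forward hooks instead of a wrapper. The helper `record_outputs` and the toy model are made up for illustration, so treat this as one possible approach, not a recommended one.

```python
import torch
import torch.nn as nn


def record_outputs(model: nn.Module):
    """Attach a forward hook to every submodule; return (store, handles).

    Hypothetical helper for illustration, not part of PyTorch.
    """
    store = {}    # submodule name -> output of its most recent forward pass
    handles = []  # kept so the hooks can be removed later

    for name, module in model.named_modules():
        if name == "":  # skip the root module itself
            continue

        # name=name binds the current name; otherwise every hook sees the last one.
        def hook(mod, inputs, output, name=name):
            # Detach tensors so the stored copy does not keep the graph alive.
            store[name] = output.detach() if torch.is_tensor(output) else output

        handles.append(module.register_forward_hook(hook))
    return store, handles


# Toy model standing in for the real one (assumption, illustration only).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
store, handles = record_outputs(model)
model(torch.randn(3, 4))

for name, out in store.items():
    print(name, tuple(out.shape))

for h in handles:  # remove the hooks once debugging is done
    h.remove()
```

Since the hooks are attached by looping over `named_modules()`, the number of modules does not need to be known in advance, and the model code itself stays unchanged.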