Thanks for the clarification. I missed this output.
The layer itself is never called. Instead, its parameters are passed directly to F.multi_head_attention_forward in these lines of code, which is why the hook isn’t called.
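As a minimal sketch of the same pattern (the `Toy` module and its `proj` child are hypothetical names for illustration, not the actual nn.MultiheadAttention code): the child layer is registered, but forward only reads its weight and bias through the functional API, so a forward hook on the child never fires.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Toy(nn.Module):
    """Hypothetical module that owns a Linear child but never calls it."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)

    def forward(self, x):
        # Uses the child's parameters directly; self.proj(x) is never invoked.
        return F.linear(x, self.proj.weight, self.proj.bias)


hook_calls = []
toy = Toy()
# Forward hooks only run when the module itself is called.
handle = toy.proj.register_forward_hook(
    lambda module, args, output: hook_calls.append(output.shape)
)
toy_out = toy(torch.randn(2, 4))
handle.remove()

print(len(hook_calls))  # the hook list stays empty
```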
Oh, I see it in the implementation now, thanks. I wonder why it is done that way in this particular case, but I guess it was a design choice. Is there a way to systematically detect this case (a child module exists but is never called in forward)? Or does it have to be handled case by case?
I don’t know why this approach was chosen, and I don’t know of another way to check whether the forward method was called besides what you’ve already done: using hooks and checking their output.
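If it helps, the hook-based check could be wrapped into a small helper that runs over all submodules at once. This is only a rough sketch: `find_uncalled_modules` is a made-up name, not a PyTorch API, and it only reports modules that were skipped for the specific input you pass in, so a module used in a different code path (e.g. only in eval mode) would still be listed.

```python
import torch
import torch.nn as nn


def find_uncalled_modules(model, *inputs, **kwargs):
    """Run one forward pass and return names of submodules whose hook never fired."""
    called = set()
    handles = []
    for name, module in model.named_modules():
        # Bind `name` as a default argument so each hook records its own module.
        def hook(mod, args, output, name=name):
            called.add(name)

        handles.append(module.register_forward_hook(hook))
    try:
        with torch.no_grad():
            model(*inputs, **kwargs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return [name for name, _ in model.named_modules() if name not in called]


mha = nn.MultiheadAttention(embed_dim=8, num_heads=2)
x = torch.randn(5, 1, 8)  # (seq_len, batch, embed_dim)
uncalled = find_uncalled_modules(mha, x, x, x)
print(uncalled)  # should list the out_proj child discussed above
```

The root module appears under the empty name `""` in `named_modules()`, and it is called directly, so it should not show up in the result.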