Hi,
I am working on visualizing the attention layers of the `deit_tiny_patch16_224` model. To do this, I registered forward and backward hooks on the `attn_drop` layer using `register_forward_hook` and `register_full_backward_hook`. However, when I run the model, the hooks on the `attn_drop` layer are never triggered.
Interestingly, if I register the same hooks on a different layer, such as `k_norm`, they work as expected.
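
In case it helps, here is a minimal sketch of roughly what I am doing (assuming the module paths `model.blocks[0].attn.attn_drop` and `model.blocks[0].attn.k_norm` that I see when printing the model; the hook bodies are simplified to prints):

```python
import torch
import timm

model = timm.create_model("deit_tiny_patch16_224", pretrained=True)
model.eval()

def forward_hook(module, inputs, output):
    # Expected to fire on every forward pass so I can grab the attention map.
    print(f"forward hook fired on {module}")

def backward_hook(module, grad_input, grad_output):
    # Expected to fire during backward so I can grab the attention gradients.
    print(f"backward hook fired on {module}")

# Hooks on attn_drop: these never fire for me.
attn_drop = model.blocks[0].attn.attn_drop
attn_drop.register_forward_hook(forward_hook)
attn_drop.register_full_backward_hook(backward_hook)

# The same hooks on k_norm work as expected.
k_norm = model.blocks[0].attn.k_norm
k_norm.register_forward_hook(forward_hook)
k_norm.register_full_backward_hook(backward_hook)

x = torch.randn(1, 3, 224, 224)
out = model(x)                       # only the k_norm forward hook prints
out[0, out[0].argmax()].backward()   # only the k_norm backward hook prints
```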
I suspect this issue might be related to the dropout probability being set to 0.0 in the `attn_drop` layer. From my understanding, the dropout probability shouldn't affect whether hooks are executed, but I might be overlooking something. Could the disabled dropout (p=0.0) be causing the hooks to behave unexpectedly, or is there another reason the hooks aren't firing for this layer?
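
For completeness, this is how I confirmed the p=0.0 I mentioned (same module path assumption as in the sketch above):

```python
# Should print something like: Dropout(p=0.0, inplace=False)
print(model.blocks[0].attn.attn_drop)
```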