How to create reusable framework to visualize internal model calculations

We want to visualize internal model calculations, e.g. attention maps, in a research project with multiple different models. The goal is to create a common framework for these visualizations with the following constraints:

  • they should only be calculated and stored during validation to save computation time during training
  • the interface should be common for all models (i.e. PyTorch modules with a forward function)
  • it should be possible to automatically visualize the generated images in TensorBoard
  • it should work (e.g. correct naming of generated images) with modules calling other submodules

I am wondering what the best way to implement this is (i.e. the most usable and generic way). Basically, my idea was to pass a dictionary to the module, into which the module can write depending on its current mode.
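To make the idea concrete, here is a minimal sketch of the dictionary-passing approach. All names (`viz`, `prefix`, `InnerBlock`) are illustrative, not an established API: during validation the caller passes a dict that modules fill with named tensors, submodules extend a name prefix so keys stay unique, and the training loop can later log each entry to TensorBoard.

```python
import torch
import torch.nn as nn

class InnerBlock(nn.Module):
    def forward(self, x, viz=None, prefix=""):
        attn = torch.softmax(x, dim=-1)        # stand-in for an attention map
        if viz is not None:                    # only computed/stored on request
            viz[f"{prefix}attention"] = attn.detach()
        return attn @ x

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.block = InnerBlock()

    def forward(self, x, viz=None, prefix=""):
        # submodules receive an extended prefix, so nested names stay unique
        return self.block(x, viz=viz, prefix=f"{prefix}block/")

model = Model()
x = torch.rand(2, 4, 4)

# training step: no dict is passed, so nothing extra is stored
_ = model(x)

# validation step: collect the visualizations
viz = {}
_ = model(x, viz=viz, prefix="model/")
# viz now maps "model/block/attention" to a tensor; the validation loop could
# convert each entry to an image and call writer.add_image(name, img, step)
```

The `viz is None` check keeps the training path free of extra computation, and the prefix convention handles the "modules calling other submodules" constraint without any global state.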

Have you encountered such a problem before? If so, how did you implement it?


I don’t know if this exactly fits your use case, but have a look at Captum, which provides a variety of methods for model interpretability.

CC @Narine_Kokhlikyan

Thank you for the reply!

I was asking more for general advice on how to design such a framework. I would be very happy if you could share some ideas on how you would design it.

Thanks, Jan