Which attributes to consider as important when using model interpretability methods

Hello all,
I am using model interpretability methods such as IntegratedGradients, GradSHAP, DeepLift, etc. After executing the attributions method, I get a matrix of attributions corresponding to every feature in the dataset. My question is:
Which attributes should I consider as important attributes? The ones with high positive correlation, or the ones with negative correlation, or both? Maybe I want to ask how to assign the attribution values to important or not-important set?

Thank you.

Both. Check the plots in the Captum tutorials to get an intuition. I'm not sure there is a universal recipe for importance thresholds…
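To make "both" concrete: a common heuristic is to aggregate the signed, per-sample attributions into a single per-feature score using the mean absolute value, so large negative and large positive attributions both count toward importance. Below is a minimal numpy sketch; the matrix shape, the random example data, and the above-the-mean cutoff are illustrative assumptions, not a universal recipe.

```python
import numpy as np

# Stand-in attribution matrix: rows = samples, columns = features.
# In practice this would come from e.g. IntegratedGradients.attribute(inputs).
rng = np.random.default_rng(0)
attributions = rng.normal(size=(100, 5))

# Aggregate over samples with the mean *absolute* attribution, so strongly
# negative and strongly positive attributions are both treated as important.
importance = np.abs(attributions).mean(axis=0)

# Rank features from most to least important.
ranking = np.argsort(importance)[::-1]
print("importance per feature:", importance)
print("features ranked by importance:", ranking)

# One simple (illustrative, not universal) threshold: keep features whose
# importance is above the mean importance across all features.
important = np.where(importance > importance.mean())[0]
print("features above mean importance:", important)
```

If you care about the direction of influence (pushing the prediction up vs. down), keep the signed attributions for interpretation and use the absolute aggregation only for ranking.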

Leap Labs is releasing new model-agnostic interpretability methods, both local and global, in a few weeks (https://www.leap-labs.com/); there is a waiting list on their site. Their global methods extract what a model has learned directly from the model itself, independent of the training data, and their local explanations use hierarchical perturbation.