Your idea sounds valid. If you suspect the model is focusing on the “wrong” feature, you could add negative samples, e.g. pictures of snow without any bears.
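As a rough sketch of what adding negative samples could look like (the random tensors, the image size, and the `BEAR`/`NO_BEAR` label indices are just placeholders I made up, not your actual data):

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Placeholder data: random tensors standing in for real, preprocessed images.
bear_images = torch.rand(8, 3, 64, 64)       # bears (possibly on snow)
snow_only_images = torch.rand(4, 3, 64, 64)  # snow without any bears

# Hypothetical class indices for this example.
BEAR, NO_BEAR = 1, 0

bear_ds = TensorDataset(bear_images, torch.full((8,), BEAR, dtype=torch.long))
snow_ds = TensorDataset(snow_only_images, torch.full((4,), NO_BEAR, dtype=torch.long))

# Mix the negative samples into the training set so that snow alone
# is no longer enough to predict the bear class.
train_ds = ConcatDataset([bear_ds, snow_ds])
loader = DataLoader(train_ds, batch_size=4, shuffle=True)
```

In a real setup you would replace the `TensorDataset`s with your image datasets; the ratio of negatives to positives is something you would probably have to tune.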
Also, you could use e.g. Captum to apply some visualization techniques, which might help you figure out which input features the model relies on most.