How to train an attention network?

mmdbrdrn · June 23, 2022, 4:39pm

Hello,
I am trying to add a spatial attention mechanism to my semantic segmentation network (e.g., U-Net), and I have two questions:

1. I want to add attention to multiple layers. Should I design one attention network and just resample its output to match the spatial dimensions of each specific layer, or should I design a separate attention network for each layer?
2. Should I train the attention network(s) at the same time as the main network, or should they be trained sequentially?
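
To make the two options concrete, here is a rough sketch of what I have in mind. It assumes PyTorch; the names `SpatialAttention` and `apply_shared_mask` are hypothetical, and the random tensors are only stand-ins for feature maps from two U-Net layers. It is an illustration of the question, not a recommended design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialAttention(nn.Module):
    """Hypothetical per-layer gate: 1x1 conv -> sigmoid -> one-channel mask."""

    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        mask = torch.sigmoid(self.score(feat))  # shape (N, 1, H, W)
        return feat * mask                      # broadcast over channels


def apply_shared_mask(feat: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Option 1: resample one shared mask to this layer's spatial size."""
    resized = F.interpolate(
        mask, size=feat.shape[-2:], mode="bilinear", align_corners=False
    )
    return feat * resized


# Stand-ins for feature maps from two layers (assumed shapes).
feats = [torch.randn(2, 16, 64, 64), torch.randn(2, 32, 32, 32)]

# Option 1: a single mask, resampled for each layer.
shared_mask = torch.sigmoid(torch.randn(2, 1, 64, 64))
shared = [apply_shared_mask(f, shared_mask) for f in feats]

# Option 2: a separate attention module per layer.
gates = nn.ModuleList([SpatialAttention(16), SpatialAttention(32)])
gated = [g(f) for g, f in zip(gates, feats)]

# "Training at the same time" would mean one loss and one optimizer.
# In a real model the optimizer would get model.parameters(), which
# covers both the main network and the attention modules.
optimizer = torch.optim.SGD(gates.parameters(), lr=0.1)
loss = sum(g.pow(2).mean() for g in gated)  # dummy loss for illustration
loss.backward()
optimizer.step()
```

In this sketch the attention weights receive gradients from the same loss as everything else, which is what I mean by training them together; the sequential alternative would freeze one part while fitting the other.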

Thank you so much