Does anyone know of any useful tutorials on Transformers in vision?
Thank you. I mean applying the “Attention Is All You Need” paper to vision tasks, such as sequential images or image captioning.