Hello, everyone. I am new to image captioning, but I find it very interesting. One thing puzzles me, though: is there a relationship between image captioning and object detection? In other words, can detected objects be used for image captioning?
@Michael_Hsu: Hello, I am not sure this is the right forum for this question, since it is mainly for PyTorch-related questions. But I would be glad to suggest some references on captioning and detection that you may find interesting:
- https://cs.stanford.edu/people/karpathy/sfmltalk.pdf
- https://www.youtube.com/watch?v=yk6XDFm3J2c
- https://medium.com/mlreview/multi-modal-methods-image-captioning-from-translation-to-attention-895b6444256e
- https://www.coursera.org/lecture/convolutional-neural-networks/object-detection-VgyWR
I hope this helps!
This helps me a lot! Thanks for your generous contribution!