Hi!
I am new to PyTorch. In my research I have a multimodal task where I have to load face images and their respective landmarks as input to the model. Are there any resources I can refer to for achieving this with PyTorch's Dataset class?
Hi @ptrblck,
Thank you. My task is multimodal, so the inputs will be the image and landmarks (X in ML terms) and the output will be the yaw, pitch, and roll (the y vector). I did find a way to make it work by returning dictionaries from the `__getitem__` method. Is this the standard approach for multimodal datasets?
Yes, `dict`s are often used for multiple outputs as also seen in the linked tutorial.
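
For anyone landing here later, here is a minimal sketch of the dict-based approach discussed above. The class name `FaceLandmarkPoseDataset`, the dict keys, and the tensor shapes (68 landmarks, 64x64 images) are assumptions made up for illustration, and random tensors stand in for real images and landmark files, which you would normally load from disk inside `__getitem__`.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class FaceLandmarkPoseDataset(Dataset):
    """Hypothetical dataset: each sample is an image, its landmarks, and a pose target."""

    def __init__(self, images, landmarks, poses):
        # Assumed shapes: images (N, 3, H, W), landmarks (N, K, 2),
        # poses (N, 3) holding yaw, pitch, roll.
        assert len(images) == len(landmarks) == len(poses)
        self.images = images
        self.landmarks = landmarks
        self.poses = poses

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Return both input modalities and the target in a single dict.
        return {
            "image": self.images[idx],
            "landmarks": self.landmarks[idx],
            "pose": self.poses[idx],
        }


# Random stand-in data, only to make the sketch runnable.
dataset = FaceLandmarkPoseDataset(
    images=torch.rand(8, 3, 64, 64),
    landmarks=torch.rand(8, 68, 2),
    poses=torch.rand(8, 3),
)

# The default collate function batches each dict entry separately.
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch = next(iter(loader))
print({key: tuple(value.shape) for key, value in batch.items()})
```

The `DataLoader`'s default collate function keeps the dict structure and stacks the values per key, so in the training loop you can pass `batch["image"]` and `batch["landmarks"]` to the model and compare its output against `batch["pose"]`.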