How to use the Dataset class to load a multimodal dataset?


I am new to PyTorch. In my research I have a multimodal task where I have to load face images and their respective landmarks as inputs to the model. Are there any resources I can refer to in order to achieve this using the Dataset class of PyTorch?

The Data loading tutorial shows how to load face images with their corresponding landmarks.

Hi @ptrblck ,
Thank you! My task is multimodal, so the inputs will be the image and landmarks (the X in ML terms) and the output will be the yaw, pitch, and roll (the y vector). I did find a way to make it work by using dictionaries in the `__getitem__` method. Is this the standard approach for multimodal datasets?

Yes, dicts are often used for multiple inputs and outputs, as also seen in the linked tutorial.
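For other readers: a minimal sketch of this dict-based approach, assuming in-memory tensors for the images, landmarks, and pose targets (the class name, tensor shapes, and dummy data are illustrative, not from the original post). The default `collate_fn` of `DataLoader` batches each dict key separately, so no custom collation is needed:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class FacePoseDataset(Dataset):
    """Hypothetical multimodal dataset: each sample pairs a face image
    and its landmarks (inputs) with a yaw/pitch/roll vector (target)."""

    def __init__(self, images, landmarks, poses):
        # images: [N, 3, H, W], landmarks: [N, num_points, 2], poses: [N, 3]
        assert len(images) == len(landmarks) == len(poses)
        self.images = images
        self.landmarks = landmarks
        self.poses = poses

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Returning a dict keeps each modality clearly named;
        # the default collate_fn stacks every key into a batch.
        return {
            "image": self.images[idx],
            "landmarks": self.landmarks[idx],
            "pose": self.poses[idx],  # yaw, pitch, roll
        }

# Dummy data to illustrate batching (68 landmarks is a common convention).
ds = FacePoseDataset(
    images=torch.randn(8, 3, 64, 64),
    landmarks=torch.randn(8, 68, 2),
    poses=torch.randn(8, 3),
)
loader = DataLoader(ds, batch_size=4)
batch = next(iter(loader))
print(batch["image"].shape)      # torch.Size([4, 3, 64, 64])
print(batch["landmarks"].shape)  # torch.Size([4, 68, 2])
print(batch["pose"].shape)       # torch.Size([4, 3])
```

In the training loop you would then unpack the batch as `batch["image"]`, `batch["landmarks"]`, and `batch["pose"]`; for on-disk data, `__init__` would store file paths instead and `__getitem__` would load and transform each sample lazily.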

Thank you @ptrblck, I appreciate your help!