How to use the Dataset class to load a multimodal dataset?


I am new to PyTorch. In my research I have a multimodal task where I have to load face images and their respective landmarks as inputs to the model. Are there any resources I can refer to in order to achieve this using the Dataset class of PyTorch?

The Data loading tutorial shows how to load face images with their corresponding landmarks.

Hi @ptrblck ,
Thank you! My task is multimodal, so the inputs will be the image and landmarks (the X in ML terms) and the output will be the yaw, pitch, and roll (the y vector). I did find a way to make it work by using dictionaries in the `__getitem__` method. Is this the standard approach for multimodal datasets?

Yes, dicts are often used for multiple inputs and outputs, as also seen in the linked tutorial.
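For other readers: a minimal sketch of this dict-based approach, assuming in-memory tensors for the images, landmarks, and pose targets (the class name, tensor shapes, and dummy data are illustrative, not from the original post). The default `collate_fn` of `DataLoader` batches each dict key separately, so no custom collation is needed:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class FacePoseDataset(Dataset):
    """Hypothetical multimodal dataset: each sample pairs a face image
    and its landmarks (inputs) with a yaw/pitch/roll vector (target)."""

    def __init__(self, images, landmarks, poses):
        # images: [N, 3, H, W], landmarks: [N, num_points, 2], poses: [N, 3]
        assert len(images) == len(landmarks) == len(poses)
        self.images = images
        self.landmarks = landmarks
        self.poses = poses

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Returning a dict keeps each modality clearly named;
        # the default collate_fn stacks every key into a batch.
        return {
            "image": self.images[idx],
            "landmarks": self.landmarks[idx],
            "pose": self.poses[idx],  # yaw, pitch, roll
        }

# Dummy data to illustrate batching (68 landmarks is a common convention).
ds = FacePoseDataset(
    images=torch.randn(8, 3, 64, 64),
    landmarks=torch.randn(8, 68, 2),
    poses=torch.randn(8, 3),
)
loader = DataLoader(ds, batch_size=4)
batch = next(iter(loader))
print(batch["image"].shape)      # torch.Size([4, 3, 64, 64])
print(batch["landmarks"].shape)  # torch.Size([4, 68, 2])
print(batch["pose"].shape)       # torch.Size([4, 3])
```

In the training loop you would then unpack the batch as `batch["image"]`, `batch["landmarks"]`, and `batch["pose"]`; for on-disk data, `__init__` would store file paths instead and `__getitem__` would load and transform each sample lazily.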

Thank you @ptrblck, I appreciate your help!