How to build a multimodal model in PyTorch?

Hello everyone, can anyone please tell me how to build a multimodal model in PyTorch from scratch?

Hi Teddy

You may want to check out TorchMultimodal.

If you want to combine an image and text, for example, you can process each one separately with its own encoder and then join the two resulting feature tensors (image and text) using something like torch.cat(). We can't help much further without more information.
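To make the torch.cat() idea concrete, here is a minimal sketch of that kind of "late fusion". Everything in it is an assumption for illustration only: the class name LateFusionClassifier, the layer sizes, the vocabulary size, and the number of classes are all made up, and the encoders are deliberately tiny. Real encoders would depend on your data.

```python
import torch
import torch.nn as nn


class LateFusionClassifier(nn.Module):
    """Hypothetical two-branch model: encode each modality, then concatenate."""

    def __init__(self, vocab_size=1000, embed_dim=32, num_classes=3):
        super().__init__()
        # Image branch: a tiny CNN that maps (B, 3, H, W) to a (B, 16) feature vector.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (B, 8, 1, 1), independent of H and W
            nn.Flatten(),             # (B, 8)
            nn.Linear(8, 16),         # (B, 16)
        )
        # Text branch: mean of token embeddings, (B, T) token ids -> (B, embed_dim).
        self.text_encoder = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        # Classifier head over the concatenated features.
        self.head = nn.Linear(16 + embed_dim, num_classes)

    def forward(self, image, token_ids):
        img_feat = self.image_encoder(image)     # (B, 16)
        txt_feat = self.text_encoder(token_ids)  # (B, embed_dim)
        # Join the two modalities along the feature dimension.
        fused = torch.cat([img_feat, txt_feat], dim=1)  # (B, 16 + embed_dim)
        return self.head(fused)


model = LateFusionClassifier()
images = torch.randn(4, 3, 32, 32)            # dummy batch of 4 RGB images
token_ids = torch.randint(0, 1000, (4, 12))   # dummy batch of 4 sequences, 12 tokens each
logits = model(images, token_ids)
print(logits.shape)  # torch.Size([4, 3])
```

The only requirement for torch.cat() here is that both feature tensors share the batch dimension; the feature sizes can differ. From there you can swap either branch for a stronger encoder without touching the fusion step.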


Okay, thank you. I will get back to you later.