Hello everyone, can anyone please tell me how to build a multimodal model in PyTorch from scratch?
You may want to check out TorchMultimodal.
If you want to combine an image and text, for example, you can process each input separately and then join the two resulting tensors (image and text) with something like torch.cat(). We can't help much further without more information.
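To make that concrete, here is a rough sketch of the "encode separately, then concatenate" idea. It is only an illustration under assumptions, not a recommended architecture: the class name LateFusionClassifier, the layer sizes, the vocabulary size, and the random inputs are all made up for the example, and a real model would use proper image and text encoders.

```python
import torch
import torch.nn as nn


class LateFusionClassifier(nn.Module):
    """Hypothetical toy model: encode image and text separately, then concatenate."""

    def __init__(self, vocab_size=1000, embed_dim=32, feat_dim=64, num_classes=3):
        super().__init__()
        # Image branch: tiny CNN mapping (B, 3, H, W) -> (B, feat_dim)
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (B, 16, 1, 1)
            nn.Flatten(),             # (B, 16)
            nn.Linear(16, feat_dim),
        )
        # Text branch: mean of token embeddings, (B, T) -> (B, embed_dim)
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.text_proj = nn.Linear(embed_dim, feat_dim)
        # Classifier head over the concatenated features
        self.head = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, images, token_ids):
        img_feat = self.image_encoder(images)                 # (B, feat_dim)
        txt_feat = self.text_proj(self.embedding(token_ids))  # (B, feat_dim)
        # Join the two modalities along the feature dimension
        fused = torch.cat([img_feat, txt_feat], dim=1)        # (B, 2 * feat_dim)
        return self.head(fused)


# Random stand-in data (assumed shapes): 4 RGB images and 4 sequences of 12 token ids
model = LateFusionClassifier()
images = torch.randn(4, 3, 32, 32)
token_ids = torch.randint(0, 1000, (4, 12))
logits = model(images, token_ids)
print(logits.shape)
```

The key point is that both branches end in a tensor of shape (batch, features), so torch.cat along dim=1 gives a single feature vector per example that the final layer can consume. The two feature sizes do not have to match; only the batch dimension does.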
Okay, thank you. I will let you know later.