I am designing a network that takes both a vector and an image as input.
I wonder how I can design such a network and its loss.
Approach 1 (not preferred): Flatten the image, concatenate it with the vector, and send the result to a linear layer. But this won't capture the spatial features of the input image.
Approach 2: Can I design some combination of convolution and linear layers at the input level itself? If so, how can I do that?
Approach 3: Have two different networks, a convolutional one for the image and another for the latent vector, and combine their outputs at a later stage. But I fear this might lead to the loss not decreasing.
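To make Approaches 2 and 3 concrete, here is a rough sketch of what I have in mind. It assumes PyTorch, and everything in it is a placeholder I made up for illustration: the class names `TwoBranchNet` and `ChannelConcatNet`, the layer sizes, the 3x32x32 image shape, the 16-dimensional vector, and the MSE loss (which assumes a regression target). Tiling the vector across the image and appending it as extra channels is only one possible reading of Approach 2.

```python
import torch
import torch.nn as nn


class TwoBranchNet(nn.Module):
    """Approach 3: separate image and vector branches, fused by concatenation."""

    def __init__(self, in_channels=3, vec_dim=16, out_dim=1):
        super().__init__()
        # Convolutional branch: extracts spatial features from the image.
        self.image_branch = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (N, 32, 1, 1) regardless of image size
            nn.Flatten(),             # (N, 32)
        )
        # Fully connected branch: embeds the vector.
        self.vector_branch = nn.Sequential(nn.Linear(vec_dim, 32), nn.ReLU())
        # Head: operates on the concatenated features (32 + 32 = 64).
        self.head = nn.Sequential(
            nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, out_dim)
        )

    def forward(self, image, vector):
        fused = torch.cat(
            [self.image_branch(image), self.vector_branch(vector)], dim=1
        )
        return self.head(fused)


class ChannelConcatNet(nn.Module):
    """One reading of Approach 2: tile the vector over H x W as extra channels."""

    def __init__(self, in_channels=3, vec_dim=16, out_dim=1):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels + vec_dim, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(32, out_dim)

    def forward(self, image, vector):
        n, _, h, w = image.shape
        # (N, vec_dim) -> (N, vec_dim, H, W) by broadcasting over space.
        tiled = vector[:, :, None, None].expand(n, vector.shape[1], h, w)
        return self.head(self.conv(torch.cat([image, tiled], dim=1)))


# Dummy batch, just to check shapes and gradient flow.
image = torch.randn(4, 3, 32, 32)
vector = torch.randn(4, 16)
target = torch.randn(4, 1)

model = TwoBranchNet()
loss = nn.functional.mse_loss(model(image, vector), target)
loss.backward()  # one loss on the fused output sends gradients to both branches
```

In this sketch both branches sit inside one module and are trained end to end with a single loss, so after `loss.backward()` the parameters of the image branch and the vector branch both receive gradients. Whether the loss actually decreases on real data is a separate question that this toy check does not answer.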
Any help, opinion, or suggestion would be great.
I am referring to the post below while designing it.