I currently have a vector store that holds a large number of image embeddings, each with 512 dimensions. However, I would like to switch to a different featurizer that outputs 768-dimensional embeddings. I want to use these new embeddings for search without having to migrate or delete my existing vector store.
As someone who is new to this field, I have been considering a few options. One option I have thought about is taking the first 512 values of the 768-dimensional embeddings and discarding the rest. However, I am concerned that this approach may result in a loss of quality. Another option I have considered is using a projection method, as shown below:
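import torch
from torch import nn

dim_in = 1024
dim_out = 768
# Bias-free linear projection; its weights are randomly initialized
projector = nn.Linear(dim_in, dim_out, bias=False)
inputs = torch.rand(dim_in)
inputs.shape  # torch.Size([1024])
outputs = projector(inputs)  # pass the input tensor, not dim_out
outputs.shape  # torch.Size([768])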
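However, I have noticed that the projector's weights are randomly initialized every time it is created, so the same embedding is projected to a different vector whenever the projector is re-created. This means the results are not consistent between runs.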
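To illustrate the inconsistency I mean, here is a minimal sketch. The helper name make_projector and the seed value are my own placeholders, not part of any library. It assumes that fixing the global seed right before creating the layer is enough to get repeatable weights on the same machine and PyTorch version:

```python
import torch
from torch import nn


def make_projector(dim_in: int, dim_out: int, seed: int) -> nn.Linear:
    """Hypothetical helper: build a bias-free projector whose random
    weights depend only on `seed`."""
    # Note: this resets the global RNG, which affects later random calls too.
    torch.manual_seed(seed)
    return nn.Linear(dim_in, dim_out, bias=False)


x = torch.rand(768)  # stand-in for one 768-dimensional embedding

with torch.no_grad():
    # Two freshly created projectors get different random weights.
    unseeded_a = nn.Linear(768, 512, bias=False)(x)
    unseeded_b = nn.Linear(768, 512, bias=False)(x)
    # Two projectors created with the same seed get identical weights.
    seeded_a = make_projector(768, 512, seed=0)(x)
    seeded_b = make_projector(768, 512, seed=0)(x)

same_unseeded = torch.allclose(unseeded_a, unseeded_b)
same_seeded = torch.equal(seeded_a, seeded_b)
print(same_unseeded, same_seeded)
```

Saving the weights once with torch.save(projector.state_dict(), some_path) and reloading them would presumably give the same repeatability. Either way the projection is still untrained, so I do not know whether it preserves the similarity structure of the original embeddings.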
Is there any other approach I could take? What is the correct way to convert embeddings with a 768 vector length into a 512 vector length, or in general, to convert larger embeddings into smaller embeddings and vice versa, without significant loss of quality?