Hello,
I have 5 images, and I'm trying to combine them into one single representation that best captures all 5. The representation needs to have shape B * C * H * W, because I'm using it as a hidden state for a diffusion model. Any suggestions?
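Try averaging:
import torch
# each image is assumed to already be a tensor of shape B * C * H * W
images = [image1, image2, image3, image4, image5]
# stacking gives 5 * B * C * H * W; the mean over dim 0 collapses the 5
averaged_image = torch.mean(torch.stack(images), dim=0)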
This should give you a B * C * H * W tensor, as long as each of the 5 images already has that shape.
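If it helps to check the shapes, here is a minimal sketch of the averaging idea. It assumes each image is already a float tensor of shape B * C * H * W; the sizes and the constant-valued stand-in tensors are made up purely for illustration and are not from the original post.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
B, C, H, W = 2, 3, 8, 8

# Five stand-in "images", each already batched as (B, C, H, W).
# Image i is filled with the constant value i so the mean is easy to verify.
images = [torch.full((B, C, H, W), float(i)) for i in range(1, 6)]

stacked = torch.stack(images)         # shape: (5, B, C, H, W)
averaged_image = stacked.mean(dim=0)  # shape: (B, C, H, W)

print(stacked.shape)
print(averaged_image.shape)
```

Note that if the images are single unbatched tensors of shape C * H * W, the same code returns C * H * W, and you would need something like averaged_image.unsqueeze(0) to add the batch dimension.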