I want to use my image loss for my video. What I'm doing is looping over the frames, calling the loss function on each one, and summing the results.
Is there another way to do it?
This is what I'm doing:
def dist_per_frame(self, frames_real, frames_fake):
    rec_loss = 0
    for i in range(len(frames_real)):
        rec_loss += self.dist_frame(frames_fake[i], frames_real[i])
    return rec_loss
**self.dist_frame is my loss function
Could you give some more context? Is this a PyTorch-related problem? Do you have a model, or is it a pure math question?
Yes, I have a model and I'm using a perceptual loss (VGG16). The VGG16 net works well with images (for reconstruction), but when I use it with video I want to take all the frames into account.
So I wonder if what I wrote above is the correct way to do so.
The proper way would be to squash all the frames into the batch dimension, apply VGG, unsquash, and then sum them up.
What does it mean to squash them?
I pass a tensor of size [batch_size, 3, 64, 64] to VGG16, so I think it squashes them into a batch dimension (?).
Since you are working with video, I was expecting something like:
[batch, frames, 3, 64, 64] --> [batch*frames, 3, 64, 64]
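A minimal sketch of that reshaping, assuming the frame tensors have shape [batch, frames, 3, H, W] and that the per-frame loss (your dist_frame / VGG16 perceptual loss) accepts a batched input and returns one value per sample. The dist_per_video name and the loss passed in are illustrative, not from your code:

```python
import torch

def dist_per_video(frames_real, frames_fake, dist_frame):
    # frames_*: [batch, frames, 3, H, W]
    b, f, c, h, w = frames_real.shape
    # squash the frame axis into the batch axis: [batch*frames, 3, H, W]
    real = frames_real.reshape(b * f, c, h, w)
    fake = frames_fake.reshape(b * f, c, h, w)
    # single forward pass through the per-frame loss; assumed to
    # return one scalar per sample, i.e. shape [batch*frames]
    per_frame = dist_frame(fake, real)
    # unsquash and sum over frames -> one loss value per video
    return per_frame.reshape(b, f).sum(dim=1)

# usage with a stand-in L1 loss in place of the VGG perceptual loss
real = torch.rand(2, 31, 3, 64, 64)
fake = torch.rand(2, 31, 3, 64, 64)
l1 = lambda x, y: (x - y).abs().mean(dim=(1, 2, 3))
loss = dist_per_video(real, fake, l1)  # shape [2], one value per video
```

This gives the same sum as the per-frame loop but runs VGG once on a bigger batch, which is usually faster on GPU; just watch the memory cost when batch*frames gets large.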