How to process variable-length sequences of images with a CNN

The idea behind this is to avoid useless convolutions on padded frames, right? How do you ensure the convolutions are applied correctly to packed_images.data?
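For reference, here is a minimal sketch of how I understand the approach, so the question is concrete. The shapes, the tiny `cnn`, and all variable names other than `packed_images` are my own assumptions for illustration, not taken from anyone's actual code. As far as I can tell, `packed_images.data` is just a flat tensor of shape `(sum(lengths), C, H, W)` holding only the real frames, so a 2D CNN treats it as an ordinary batch of independent images. The sketch checks this by comparing the result against running the CNN on each real frame individually.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import (
    PackedSequence,
    pack_padded_sequence,
    pad_packed_sequence,
)

torch.manual_seed(0)

# Assumed shapes: 3 clips padded to 5 frames, each frame 3x16x16.
batch, max_len, C, H, W = 3, 5, 3, 16, 16
lengths = torch.tensor([5, 3, 2])  # true number of frames per clip
images = torch.randn(batch, max_len, C, H, W)

# Hypothetical per-frame CNN producing an 8-dim feature vector.
cnn = nn.Sequential(
    nn.Conv2d(C, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
cnn.eval()

# Packing drops the padded frames; trailing dims (C, H, W) are kept as-is.
packed_images = pack_padded_sequence(
    images, lengths, batch_first=True, enforce_sorted=False
)

with torch.no_grad():
    # Each row of .data is one real frame, so the CNN sees a plain image batch.
    features = cnn(packed_images.data)  # (sum(lengths), 8)

# Re-wrap the features with the original packing metadata so they can be
# fed to an RNN or padded back to (batch, max_len, 8).
packed_features = PackedSequence(
    features,
    packed_images.batch_sizes,
    packed_images.sorted_indices,
    packed_images.unsorted_indices,
)
padded, out_lengths = pad_packed_sequence(packed_features, batch_first=True)

# Sanity check: every real frame matches the CNN applied to it on its own.
with torch.no_grad():
    for i, n in enumerate(lengths.tolist()):
        for t in range(n):
            expected = cnn(images[i, t].unsqueeze(0))[0]
            assert torch.allclose(padded[i, t], expected, atol=1e-5)
```

One caveat with this sketch: the equivalence only holds when the CNN processes frames independently. A layer such as BatchNorm in training mode mixes statistics across the batch, so the per-frame outputs would then depend on which other frames are in `packed_images.data`.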