I want to use a unet-vgg16 model (a U-Net with a VGG16 encoder) with weights pretrained on ImageNet to perform a semantic segmentation task on the Cityscapes dataset. Since the input images in Cityscapes are 1024×2048, I was wondering whether there is a way to use the pretrained model and adapt it to the Cityscapes image size?
P.S. I’m also considering a sliding-window approach: run the pretrained unet-vgg16 model on the fly over each Cityscapes image patch by patch, then stitch the patch predictions back together into the final segmentation map. However, I don’t know how to implement this.
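To make the sliding-window idea concrete, here is a minimal numpy sketch of the stitching logic I have in mind. This is only an illustration, not a working pipeline: `segment_fn` is a stand-in for the per-patch unet-vgg16 forward pass, and the patch size, stride, and 19-class count (Cityscapes train IDs) are my assumptions.

```python
import numpy as np

def segment_patchwise(image, segment_fn, patch=512, stride=256, n_classes=19):
    """Run a patch-level segmenter over a large image with overlapping
    windows and average per-class scores where the windows overlap.

    image      : (H, W, C) array, e.g. a 1024x2048 Cityscapes frame
    segment_fn : maps a (patch, patch, C) crop to (patch, patch, n_classes) scores
    """
    H, W = image.shape[:2]
    scores = np.zeros((H, W, n_classes), dtype=np.float32)
    counts = np.zeros((H, W, 1), dtype=np.float32)
    # Slide the window; clamp the last window so it ends exactly at the border.
    ys = list(range(0, H - patch, stride)) + [H - patch]
    xs = list(range(0, W - patch, stride)) + [W - patch]
    for y in ys:
        for x in xs:
            crop = image[y:y + patch, x:x + patch]
            scores[y:y + patch, x:x + patch] += segment_fn(crop)
            counts[y:y + patch, x:x + patch] += 1.0
    scores /= counts                  # average the overlapping predictions
    return scores.argmax(axis=-1)     # (H, W) label map

# Hypothetical stand-in "model" so the sketch runs end to end.
def dummy_segmenter(crop, n_classes=19):
    out = np.zeros(crop.shape[:2] + (n_classes,), dtype=np.float32)
    out[..., int(crop.mean()) % n_classes] = 1.0
    return out

labels = segment_patchwise(np.zeros((1024, 2048, 3)), dummy_segmenter)
print(labels.shape)  # (1024, 2048)
```

Averaging the class scores in the overlap regions (rather than hard-assigning each pixel from one patch) is what I imagine would smooth the seams between patches, but I'd appreciate confirmation that this is the usual approach.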
Any help would be greatly appreciated; thanks in advance.