My model consists of convolution operations with skip connections. I trained the model on 128x128 inputs. During testing I feed it images of dimension 256x256. The results differ between the two cases.
I think it’s expected to see different results if the spatial input size was changed, as e.g. the conv kernels were trained on a specific resolution.
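As a rough illustration, here is a minimal sketch of a fully convolutional model with one skip connection. `TinyResNet` and all of its layer sizes are hypothetical and are not the model from the question. Since it contains no linear layers, it accepts both resolutions and the output size simply follows the input size:

```python
import torch
import torch.nn as nn

# Hypothetical toy model: convolutions plus one skip connection, no linear layers.
class TinyResNet(nn.Module):
    def __init__(self, channels=8):
        super().__init__()
        self.head = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.tail = nn.Conv2d(channels, 3, kernel_size=3, padding=1)

    def forward(self, x):
        x = self.head(x)
        x = x + self.body(x)  # skip connection
        return self.tail(x)

model = TinyResNet().eval()
with torch.no_grad():
    y_128 = model(torch.randn(1, 3, 128, 128))
    y_256 = model(torch.randn(1, 3, 256, 256))

print(y_128.shape)
print(y_256.shape)
```

The forward pass works for both sizes, but each 3x3 kernel still covers the same number of pixels. In a 256x256 image the same object spans more pixels than it did at 128x128, so the kernels see content at a different scale than during training, which is one plausible reason for the differing results.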
This post says it is possible to use a different input dimension, but I am still confused. Will the result be the same or different?
If the result is different, is there any trick, like changing the size of the trained conv kernels (repeating values or something else) according to the spatial input size during testing?
It’s possible to increase the spatial size if you are using e.g. adaptive pooling layers, or if you make sure to create the expected flattened activation tensor before feeding it into the first linear layer (assuming you are working on a “standard” CNN containing a feature extractor and a classifier).
However, you cannot expect to get the same results.
I don’t know if there are any (well tested) techniques to manipulate the model to achieve a comparable performance using larger inputs. Fine-tuning should work however.
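A minimal sketch of the adaptive pooling approach, assuming a small hypothetical classifier (`TinyCNN` and its layer sizes are made up for illustration). `nn.AdaptiveAvgPool2d(1)` reduces any spatial size to 1x1, so the flattened activation always has 16 features and the linear layer works for both resolutions:

```python
import torch
import torch.nn as nn

# Hypothetical toy classifier; the layer sizes are illustrative only.
class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Pools any spatial size down to 1x1, so the flattened size is always 16.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x)
        x = torch.flatten(x, 1)  # [batch, 16] regardless of the input resolution
        return self.classifier(x)

model = TinyCNN().eval()
with torch.no_grad():
    out_128 = model(torch.randn(1, 3, 128, 128))
    out_256 = model(torch.randn(1, 3, 256, 256))

print(out_128.shape, out_256.shape)
```

This only guarantees that the shapes match. The predictions for the larger input can still differ from what the model would produce at the training resolution.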
Do you mean flattening the input before feeding it to the first convolution layer? Could you please simplify?
(assuming you are working on a “standard” CNN containing a feature extractor and a classifier).
Does this mean it is useful for classification problems, but not for regression where the output is also an image?
No, I was referring to flattening the activation before feeding it to a linear layer, e.g. as seen in the ResNet example.
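To illustrate why the flattened activation size matters, here is a small sketch in plain Python. It assumes a hypothetical feature extractor made of one 3x3 conv (padding 1), one 2x2 max pool, and 16 output channels, with no adaptive pooling. The helper names are made up; the formula is the usual conv/pool output size formula without dilation:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def flattened_features(size, channels=16):
    """Feature count after flattening for the hypothetical extractor above."""
    size = conv_out(size, kernel=3, padding=1)  # 3x3 conv keeps the size
    size = conv_out(size, kernel=2, stride=2)   # 2x2 max pool halves it
    return channels * size * size

feat_128 = flattened_features(128)
feat_256 = flattened_features(256)
print(feat_128)  # 65536
print(feat_256)  # 262144
```

A linear layer created with `in_features=65536` for 128x128 inputs would raise a shape mismatch error on the 262144 features coming from a 256x256 input, unless the activation is first pooled to a fixed size.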
Yes, I think linear layers at “the end” of the model are commonly used for classification/regression use cases where either class logits or floating point regression values should be created.
If you are trying to create an output image you might want to check model architectures such as e.g.
Thanks for the clarification.