Are the pixels expected to be in the 0-1 range? 0-255 range? Something else?
The ConvNext models all reuse ImageClassification, as seen here, which accepts a PIL.Image, scales it to [0, 1] first, and then normalizes it as described here. So I assume the input can be a plain [0, 255] image, provided you are using the predefined transformation.
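For illustration, here is a minimal NumPy sketch of the "scale to [0, 1], then normalize" steps described above. It is not the torchvision implementation: the helper name `scale_and_normalize` is hypothetical, the resize and crop steps of the preset are left out, and the mean/std values are the usual ImageNet statistics, which I am assuming are the constants the preset uses.

```python
import numpy as np

# Usual ImageNet channel statistics (assumed to match the preset's constants).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def scale_and_normalize(image_uint8):
    """Hypothetical helper: HWC uint8 image in [0, 255] -> CHW float32, normalized."""
    # Step 1: scale the raw pixel values from [0, 255] to [0, 1].
    scaled = image_uint8.astype(np.float32) / 255.0
    # Step 2: subtract the per-channel mean and divide by the per-channel std.
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD
    # Step 3: move channels first, the layout PyTorch models expect.
    return normalized.transpose(2, 0, 1)


# Random stand-in for a decoded RGB image.
image = np.random.default_rng(0).integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
out = scale_and_normalize(image)
print(out.shape, out.dtype)
```

In torchvision itself, the preset attached to a set of pretrained weights can be obtained with `weights.transforms()`, so you would normally call that rather than re-implementing the steps by hand.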
Fantastic, thank you. I had seen your first link, but didn’t mentally parse the partial call over to ImageClassification. Much appreciated!
Let me know if this works or if you are seeing unexpected results, as I’ve only checked the source code and haven’t verified it.
My use case is applying the ConvNext encoder for segmentation. I have been providing input in the [-1, 1] range and it works quite well, but now that you’ve pointed me to the normalization constants, I’ll try the usual scale-to-[0, 1]-and-then-normalize approach and let you know if I get better results.
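To make the difference between the two input conventions concrete, here is a small sketch comparing the value ranges they produce. The function names are hypothetical, and the mean/std values are the usual ImageNet statistics, which I am assuming are the "normalization constants" in question.

```python
import numpy as np

# Usual ImageNet channel statistics (an assumption, see above).
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])


def to_minus_one_one(x01):
    """Map values in [0, 1] to [-1, 1] (same as mean=0.5, std=0.5)."""
    return x01 * 2.0 - 1.0


def imagenet_normalize(x01):
    """Per-channel normalization of values already scaled to [0, 1]."""
    return (x01 - mean) / std


# Darkest and brightest possible RGB pixels, already scaled to [0, 1].
extremes = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])

minus_one_one = to_minus_one_one(extremes)
imagenet = imagenet_normalize(extremes)
print("[-1, 1] convention:", minus_one_one)
print("ImageNet normalization:", imagenet)
```

With these constants, the normalized values span roughly -2.1 to 2.6 depending on the channel, so a pretrained encoder fed [-1, 1] inputs sees a narrower and differently centred distribution than it was trained on.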
It will take quite a while before I can comment on the final metrics, but the training and validation losses are now decreasing faster than they were before. So it seems that identifying the right normalization range has been useful. Thanks again.