I built myself a U-Net according to the original U-Net paper, with a 564x564 RGB input and a smaller output segmentation map: GitHub - CaipiDE/unet_mirror
I also implemented the mirroring strategy from the paper. Since I wanted the segmentation mask to be as big as my input, my mirror_extrapolate algorithm mirrors each side by 98px. As a result I now feed in a 756x756 input (the image with mirrored sides), which yields a 564x564 segmentation map. Perfect.
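For anyone wondering what that mirroring looks like in code: here is a minimal sketch using `torch.nn.functional.pad` with reflect mode (not the repo's actual implementation, just the idea). One arithmetic note: 564 + 2×96 = 756, while 98px per side would give 760x760, so the sketch below uses a pad of 96 to hit the 756x756 size mentioned above.

```python
import torch
import torch.nn.functional as F

def mirror_extrapolate(img, pad):
    """Mirror-extrapolate an image batch by `pad` pixels on every side.

    img: tensor of shape (N, C, H, W); reflect padding mirrors the
    border content outward, as in the original U-Net overlap-tile strategy.
    """
    return F.pad(img, (pad, pad, pad, pad), mode="reflect")

x = torch.rand(1, 3, 564, 564)
y = mirror_extrapolate(x, 96)  # 564 + 2*96 = 756
print(y.shape)  # torch.Size([1, 3, 756, 756])
```

Reflect mode requires the pad to be smaller than the corresponding input dimension, which easily holds here (96 < 564).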
I tested it on the Inria satellite imagery labeling dataset for building detection and was surprised how good it looked after just the first epoch!
But with increasing epochs, the map just literally disappears? The segmented objects have no sharp edges and are just some wobbly floating random points (not completely random, but you get the point)…
Has anyone else faced this behaviour with U-Net, or is it typical of a mistake I made?
PS: here is the U-Net structure again and how my mirroring works:
Hi, does the dice score keep increasing?
Also, I noticed the masks have thin edges rather than ‘blobs’
The center image is the mask, while the right one is the prediction (I used your model). So I believe it is now drawing only the edges (which is why the prediction looks like it has disappeared?).
The ground truth masks are actually filled in my dataset:
Also, the Dice score is doing whatever the flip it wants… Further, here is my loss; despite being negative, the curve doesn't look too bad, I guess:
Another U-Net implementation was done by Aladdin Persson (you might have heard of him on YouTube); he implemented U-Net like this: Machine-Learning-Collection/ML/Pytorch/image_segmentation/semantic_segmentation_unet at master · aladdinpersson/Machine-Learning-Collection · GitHub
At first glance I think he cropped the tensors for the skip connections the wrong way (he cropped those coming from the up-convolution instead of the skip ones), and he has no mirroring… How does his net outperform mine this heavily?
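For reference, in the original U-Net paper it is the encoder (skip) feature map that gets center-cropped down to the smaller decoder feature map before concatenation, since valid convolutions shrink the decoder path. A minimal sketch of that crop (illustrative helper names, not code from either repo):

```python
import torch

def center_crop(skip, target):
    """Center-crop the encoder feature map `skip` to the spatial size
    of the decoder feature map `target`, as in the original U-Net."""
    _, _, h, w = target.shape
    _, _, H, W = skip.shape
    top = (H - h) // 2
    left = (W - w) // 2
    return skip[:, :, top:top + h, left:left + w]

skip = torch.rand(1, 64, 136, 136)   # larger encoder feature map
up = torch.rand(1, 64, 104, 104)     # smaller upsampled decoder feature map
cropped = center_crop(skip, up)
merged = torch.cat([cropped, up], dim=1)
print(merged.shape)  # torch.Size([1, 128, 104, 104])
```

Cropping the up-conv output instead would throw away decoder context and misalign the maps, so the direction of the crop does matter.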
Dice loss should not usually go below 0. I think you need to verify the loss.
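To see why: a standard soft Dice loss is bounded in [0, 1], because sigmoid probabilities keep both the intersection and the union non-negative. A minimal sketch for binary segmentation (not your exact loss code, just the textbook formulation to compare against):

```python
import torch

def dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss for binary segmentation; always in [0, 1]."""
    probs = torch.sigmoid(logits)
    intersection = (probs * target).sum()
    union = probs.sum() + target.sum()
    dice = (2 * intersection + eps) / (union + eps)
    return 1 - dice

target = torch.ones(1, 1, 8, 8)
perfect = torch.full((1, 1, 8, 8), 10.0)  # sigmoid(10) ~ 1 everywhere
print(dice_loss(perfect, target).item())  # very close to 0
```

If your loss goes negative, a common culprit is feeding raw logits (which can exceed 1) into the Dice formula without a sigmoid, or combining it with another term that flips the sign.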
The loss is fixed now. I guess you're not supposed to mix torchvision transforms with albumentations transforms. Anyway, the loss is now slightly positive but never decreases, and I think the following is the problem now:
On the left is the input data (756x756), on the right the prediction in the first iteration (564x564), first image. The prediction is still mirrored on all sides (most visible at the top)… But how is this possible, given that my model never resizes but always crops and uses no padding? Of course the ground truth is not mirrored but has the same size as the prediction…
EDIT: Wait, I'm talking nonsense, sorry. I am resizing in the model; that must be the cause… I will crop now and update you guys.
Fixed it, my mistake. Lesson learned: do not mix albumentations with torchvision transforms, for whatever reason…