Did you make sure to apply the same transformations to both the input images and the segmentation masks?
To do so, I would recommend using the functional transformation API, as it allows you to reuse the same “random” parameters from each transformation for both the image and the mask, as described here.