I believe it works like this: each time an image passes through the transform, there's a 50% probability that it will be flipped horizontally. In other words, a single image goes in and a single image comes out, with varying outcomes (50% flipped, 50% unchanged).
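A minimal sketch of that per-pass behavior, assuming torchvision-style `RandomHorizontalFlip(p=0.5)` semantics but using a plain Python list as a stand-in for an image row so it runs without any dependencies (the function name here is my own, not torchvision's):

```python
import random

def random_horizontal_flip(image_row, p=0.5, rng=random):
    # Each call draws a fresh Bernoulli(p) sample: with probability p the
    # "image" (here just a list of pixel values) is flipped left-to-right,
    # otherwise it is returned unchanged. A new draw happens on every call,
    # i.e. on every pass through the pipeline.
    if rng.random() < p:
        return image_row[::-1]
    return image_row

rng = random.Random(0)
row = [1, 2, 3, 4]
samples = [random_horizontal_flip(row, p=0.5, rng=rng) for _ in range(10_000)]
flipped_fraction = sum(s == [4, 3, 2, 1] for s in samples) / len(samples)
print(round(flipped_fraction, 1))  # → 0.5
```

So one input maps to one output per pass; only the outcome of the coin flip varies.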
That is per epoch. So if I train for one epoch with p=0.5, my model will see roughly 50% of the images flipped and 50% unflipped. That would count as a transformation.
However, if I train for more epochs, I guess almost every image will eventually be seen in both its original and its flipped form. So it would act as an augmentation.
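That intuition can be checked with a quick simulation: since each epoch is an independent coin flip, the chance an image is seen in only one form over E epochs is 2 · 0.5^E, which vanishes quickly. A self-contained sketch (function name and numbers are mine, just for illustration):

```python
import random

def seen_both_forms_fraction(num_images, epochs, p=0.5, seed=0):
    # For each image, simulate one independent flip decision per epoch and
    # check whether the image was seen both flipped and unflipped at least
    # once across all epochs.
    rng = random.Random(seed)
    both = 0
    for _ in range(num_images):
        outcomes = {rng.random() < p for _ in range(epochs)}
        both += (outcomes == {True, False})
    return both / num_images

for epochs in (1, 5, 20):
    # With 1 epoch no image can appear in both forms; by 20 epochs
    # essentially every image has appeared both ways.
    print(epochs, seen_both_forms_fraction(10_000, epochs))
```

With 1 epoch the fraction is exactly 0; by around 20 epochs it is essentially 1, which is why the flip behaves like augmentation over a full training run.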
Augmentation would not increase the number of images, i.e. Case A is applicable …
the main benefit of augmentation here lies in the fact that in the next epoch a transformed image (maybe, depending on the probability) is passed … hence the model is effectively trained on a new image