You would see a periodic peak in the data loading time whenever a training iteration takes less time than loading the next batch, since the training loop then has to wait for the data.
Yes, the transformations would cost some time, but they are not necessarily the bottleneck of the data loading pipeline, so you would have to profile it to find out.
For example, loading the data could be slow if it is stored on an HDD instead of an SSD.
This post explains potential bottlenecks and some workarounds.
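As a rough way to check whether the data loading or the training step dominates, you could time both parts of the loop separately. The snippet below is only a minimal sketch: `profile_loop` and `slow_loader` are hypothetical names, and `slow_loader` is a stand-in that simulates a slow `DataLoader` with `time.sleep`, so the numbers are illustrative and not taken from a real model.

```python
import time


def profile_loop(loader, train_step):
    """Return (data_time, step_time) in seconds, summed over one pass."""
    data_time = 0.0
    step_time = 0.0
    end = time.perf_counter()
    for batch in loader:
        start = time.perf_counter()
        data_time += start - end  # time spent waiting for the next batch
        train_step(batch)
        end = time.perf_counter()
        step_time += end - start  # time spent in the training step
    return data_time, step_time


def slow_loader(num_batches=5, delay=0.02):
    # Hypothetical stand-in for a DataLoader: each batch takes `delay` seconds.
    for i in range(num_batches):
        time.sleep(delay)
        yield i


# Simulated training step that is faster than the data loading.
data_time, step_time = profile_loop(slow_loader(), lambda batch: time.sleep(0.005))
print(f"data loading: {data_time:.3f}s, training step: {step_time:.3f}s")
```

If `data_time` is clearly larger than `step_time`, the loop is waiting on the data. When the training step runs on the GPU, you would need to call `torch.cuda.synchronize()` before reading the timer, since CUDA operations are executed asynchronously.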