Will it be faster if I convert the jpg datasets into pt datasets?

You wouldn’t have to profile the entire dataset; a subset is enough to estimate the expected performance. Storing tensors could save some processing, e.g. the resizing, but you should consider whether you want to use static resizing during training or a random crop followed by resizing, since tensors stored at a fixed size mainly help in the first case. Also, the size difference can be large when comparing JPEG-encoded images against raw tensors, as described here.
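
As a rough sketch of both points, the snippet below times an arbitrary loading function on a random subset and computes the uncompressed size of an image tensor. The names `profile_loader` and `raw_tensor_bytes` are hypothetical helpers written for this illustration, not part of PyTorch, and the loader you pass in is assumed to take a single item (e.g. a file path).

```python
import random
import time


def raw_tensor_bytes(height, width, channels=3, bytes_per_element=1):
    """Size in bytes of an uncompressed image tensor (uint8 by default)."""
    return height * width * channels * bytes_per_element


def profile_loader(load_fn, items, sample_size=100, seed=0):
    """Return the mean seconds per item for load_fn over a random subset."""
    items = list(items)
    if not items:
        raise ValueError("items must not be empty")
    # Sample a fixed-size subset so the estimate is cheap and repeatable.
    rng = random.Random(seed)
    subset = rng.sample(items, min(sample_size, len(items)))
    start = time.perf_counter()
    for item in subset:
        load_fn(item)
    return (time.perf_counter() - start) / len(subset)


# Uncompressed size of one 224x224 RGB image.
uint8_size = raw_tensor_bytes(224, 224)            # 150528 bytes
float32_size = raw_tensor_bytes(224, 224, 3, 4)    # 602112 bytes
```

To compare the two approaches you could call `profile_loader` once with a function that opens and decodes a JPEG (for example via `PIL.Image.open`) and once with a function that calls `torch.load` on the matching `.pt` file, using the same subset size. Keep in mind that a single pass can be skewed by disk caching, so repeating the measurement is advisable.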