You could profile the code and check the bottlenecks.
Once you see where the bottleneck is (e.g. data loading, processing, or the model forward/backward pass), you can look into why this particular part is slow.
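As a rough starting point, you could time each stage of the training loop by hand. This is only a minimal sketch: `profile_stages` and the stage names are made up for illustration, and it assumes each batch is an `(inputs, targets)` pair. The `torch.cuda.synchronize()` calls are there because CUDA kernels run asynchronously, so the timings would be misleading without them.

```python
import time
from collections import defaultdict

import torch


def _sync():
    # CUDA ops are asynchronous; wait for them so the timers are meaningful.
    if torch.cuda.is_available():
        torch.cuda.synchronize()


def profile_stages(model, batches, optimizer, loss_fn, device="cpu"):
    """Hypothetical helper: accumulate seconds spent per training stage."""
    totals = defaultdict(float)
    batch_iter = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            inputs, targets = next(batch_iter)  # data loading happens here
        except StopIteration:
            break
        inputs, targets = inputs.to(device), targets.to(device)
        _sync()
        t1 = time.perf_counter()

        loss = loss_fn(model(inputs), targets)  # forward pass
        _sync()
        t2 = time.perf_counter()

        optimizer.zero_grad()
        loss.backward()  # backward pass + parameter update
        optimizer.step()
        _sync()
        t3 = time.perf_counter()

        totals["data"] += t1 - t0
        totals["forward"] += t2 - t1
        totals["backward"] += t3 - t2
    return dict(totals)
```

You would pass your own model, DataLoader, optimizer and criterion, and compare the three totals. If you need a finer breakdown, a real profiler such as `torch.profiler` gives per-operator timings.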
E.g. if your data loading is too slow, make sure to load the data from a local SSD and use multiple workers in a DataLoader.
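To check whether more workers actually help, you could time one pass over the data with different `num_workers` values. The snippet below is just a sketch under assumptions: `SlowDataset` is a made-up dataset that fakes slow loading with `time.sleep`, and `time_one_epoch` is a hypothetical helper, not something from your code.

```python
import time

import torch
from torch.utils.data import DataLoader, Dataset


class SlowDataset(Dataset):
    """Hypothetical dataset that simulates slow per-sample loading."""

    def __init__(self, n_samples=64, delay=0.0):
        self.n_samples = n_samples
        self.delay = delay

    def __len__(self):
        return self.n_samples

    def __getitem__(self, idx):
        time.sleep(self.delay)  # stand-in for disk reads / decoding
        return torch.full((3,), float(idx)), idx % 2


def time_one_epoch(dataset, num_workers, batch_size=8):
    """Iterate once over the dataset and return (samples seen, seconds)."""
    loader = DataLoader(
        dataset,
        batch_size=batch_size,
        num_workers=num_workers,  # 0 = load in the main process
        pin_memory=torch.cuda.is_available(),  # speeds up host-to-GPU copies
    )
    start = time.perf_counter()
    n_seen = 0
    for inputs, _targets in loader:
        n_seen += inputs.shape[0]
    return n_seen, time.perf_counter() - start
```

Comparing e.g. `time_one_epoch(SlowDataset(delay=0.01), num_workers=0)` against `num_workers=4` should show whether loading is the limiting factor. On platforms that start workers via spawn (Windows, macOS), run this inside an `if __name__ == "__main__":` guard.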
On the other hand, your model might contain inefficient code, e.g. Python for loops that could be vectorized.
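As an illustration of what vectorizing a loop can look like, here is a small sketch that computes pairwise squared distances twice: once with nested Python loops and once with broadcasting. The function names are made up for this example, and your own loops may need a different rewrite.

```python
import torch


def pairwise_sq_dist_loop(x, y):
    """Slow version: one tiny tensor op per (i, j) pair."""
    out = torch.empty(x.shape[0], y.shape[0])
    for i in range(x.shape[0]):
        for j in range(y.shape[0]):
            out[i, j] = ((x[i] - y[j]) ** 2).sum()
    return out


def pairwise_sq_dist_vectorized(x, y):
    """Same result via broadcasting: (n, 1, d) - (1, m, d) -> (n, m, d)."""
    diff = x[:, None, :] - y[None, :, :]
    return (diff ** 2).sum(dim=-1)
```

Both functions return an `(n, m)` tensor for inputs of shape `(n, d)` and `(m, d)`. The vectorized one launches a handful of large operations instead of `n * m` small ones, which usually matters a lot on the GPU, at the cost of materializing the intermediate `(n, m, d)` tensor.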
It’s hard to tell how to optimize something without profiling it first, as the bottleneck might come from many different parts.