Slow Training of Segmentation Model

Dear PyTorch Community

I’m trying to port this segmentation model into a PyTorch code base:

Problem: it’s taking around 6 hours to train a single epoch on Mapillary Vistas from scratch (on a single GTX 1080). Is this normal?

My feeling is that it shouldn’t take this long, since other models (e.g. U-Net) take well under an hour per epoch (about 15 min). With normal BN, training seems to stall when sending tensors to the GPU. With InPlaceABN, the stall moves to the backward call instead. Either way, it also stalls when printing the loss or calling .item() on it.

Can anyone suggest where it might be going wrong or how I could figure this kind of problem out? I’ve tried using Mapillary’s SingleGPU class but it didn’t seem to help…

It depends on the size of the images and the length of your epoch. In semantic segmentation it is not rare for an epoch to take a few hours.
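As a rough sanity check, you can convert the epoch time into time per image. This is only a back-of-the-envelope sketch: the 18,000 figure is an assumption about the size of the Mapillary Vistas training split (use the real length of your dataset), the comparison only makes sense if the 15 min U-Net epoch was measured on the same data, and `seconds_per_image` is a helper name made up for this example.

```python
def seconds_per_image(epoch_hours, num_images):
    """Average wall-clock seconds spent per training image in one epoch."""
    return epoch_hours * 3600.0 / num_images


# Assumption: roughly 18,000 training images. Replace with len(your_dataset).
NUM_TRAIN_IMAGES = 18_000

slow = seconds_per_image(6.0, NUM_TRAIN_IMAGES)   # the 6 hour epoch
fast = seconds_per_image(0.25, NUM_TRAIN_IMAGES)  # the 15 minute epoch

print(f"{slow:.2f} s/image vs {fast:.2f} s/image ({slow / fast:.0f}x slower)")
```

If the per-image time is on the order of a second at full resolution on a GTX 1080, a heavy backbone alone could plausibly explain it. If it is far above what the forward and backward pass should cost, the time is probably going somewhere else, such as data loading.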

Also, this question is better suited to that model’s repository; I believe it has little to do with PyTorch itself.
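On the lag moving between the GPU transfer, backward and .item(): CUDA calls in PyTorch are queued asynchronously, so the Python line that appears to stall is often just the first one that has to wait for earlier GPU work to finish. Calling .item() or printing the loss forces that wait. To see where the time really goes, you can synchronize before and after each stage you time. Below is a minimal sketch, assuming nothing about your training loop: `timed` and the stage names are hypothetical, and the `torch` import is guarded so the helper also runs on a machine without PyTorch or without a GPU.

```python
import time
from contextlib import contextmanager

try:
    import torch
    _HAS_CUDA = torch.cuda.is_available()
except ImportError:  # lets the helper run without PyTorch installed
    torch = None
    _HAS_CUDA = False


def _sync():
    """Wait for all queued GPU work so wall-clock timings are meaningful."""
    if _HAS_CUDA:
        torch.cuda.synchronize()


@contextmanager
def timed(name, totals):
    """Add the wall-clock time of the enclosed block to totals[name]."""
    _sync()
    start = time.perf_counter()
    try:
        yield
    finally:
        _sync()
        totals[name] = totals.get(name, 0.0) + time.perf_counter() - start


# Stand-in workload so the sketch runs on its own.
totals = {}
with timed("demo_stage", totals):
    time.sleep(0.01)
print(totals)

# Hypothetical use inside a training loop (names are placeholders):
#   with timed("to_gpu", totals):   images = images.cuda()
#   with timed("forward", totals):  loss = criterion(model(images), targets)
#   with timed("backward", totals): loss.backward()
```

It is also worth timing how long each iteration waits for the next batch from the data loader. With large images, decoding and augmentation on the CPU can dominate, and that would look the same regardless of which BN variant you use.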