Torchvision's inception_v3 takes much longer to load than other models

If I try to load resnet50 after downloading the pretrained weights, with the code

import time
import torchvision

start = time.time()
resnet = torchvision.models.resnet50(pretrained=True)
print('resnet50 loaded in {} seconds'.format(time.time() - start))

I get

resnet50 loaded in 0.41980624198913574 seconds

and similar times for vgg, densenet, etc. However, for inception_v3, I get

inception_v3 loaded in 184.83905911445618 seconds

Is there a reason inception takes so much longer to load?


We recently had the same issue here.
Do you see the same loading time when you use pretrained=False?

Yes, the times are identical with and without pretrained. I’m using the latest PyPI versions of torch (1.4.0) and torchvision (0.5.0), and it happens on multiple machines.

I did a tiny bit of digging, and the slowdown is 99% due to a single line in Inception3’s initialization:

values = torch.as_tensor(X.rvs(m.weight.numel()), dtype=m.weight.dtype)

Specifically, it's drawing the truncated normal samples via X.rvs(m.weight.numel()) that is slow. Could it be that a change to scipy.stats slowed this down? Also, is it possible to perform the same truncated-normal initialization with torch's built-in tensor operations?
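
For what it's worth, here is a rough sketch of a torch-only truncated normal fill via rejection sampling. The name truncated_normal_ is just something I made up for illustration; it's not torchvision's actual implementation (and I believe newer PyTorch releases later added nn.init.trunc_normal_, which isn't in 1.4.0):

import torch

def truncated_normal_(tensor, mean=0.0, std=1.0, a=-2.0, b=2.0):
    # Fill `tensor` in-place with N(mean, std) samples truncated to
    # [mean + a * std, mean + b * std] by resampling out-of-range values.
    with torch.no_grad():
        tensor.normal_(mean, std)
        while True:
            mask = (tensor < mean + a * std) | (tensor > mean + b * std)
            if not mask.any():
                return tensor
            tensor[mask] = torch.empty(int(mask.sum()), dtype=tensor.dtype).normal_(mean, std)

Something like truncated_normal_(m.weight, std=stddev) could then stand in for the scipy call in the init loop.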


Do you also see the long execution time when you call this single line of code in isolation?

If I change the weight initialization to something like

        for m in self.modules():
            if isinstance(m, nn.Conv2d) or isinstance(m, nn.Linear):
                import scipy.stats as stats
                stddev = m.stddev if hasattr(m, 'stddev') else 0.1
                X = stats.truncnorm(-2, 2, scale=stddev)
                #values = torch.as_tensor(X.rvs(m.weight.numel()), dtype=m.weight.dtype)
                #values = values.view(m.weight.size())
                foo = X.rvs(m.weight.numel())  # keep the scipy sampling just to measure its cost
                values = torch.zeros_like(m.weight)  # dummy values in place of the truncnorm samples
                with torch.no_grad():
                    m.weight.copy_(values)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

Then I still see the slowdown, which disappears if I remove the X.rvs() call. A single call to X.rvs() doesn’t take a particularly long time, but the loop iterates over ~300 layers.
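
A quick way to see the cumulative cost in isolation (a rough sketch; the sample size per call is just a placeholder, the real loop uses m.weight.numel()):

import time
import scipy.stats as stats

X = stats.truncnorm(-2, 2, scale=0.1)
start = time.time()
for _ in range(300):      # roughly the number of Conv2d/Linear modules mentioned above
    X.rvs(10000)          # placeholder sample size per layer
print('300 rvs() calls took {:.2f} seconds'.format(time.time() - start))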


Thanks for this information!
Could you post the scipy version so that I can reproduce it?
Also, thanks for narrowing down this issue. 🙂

I’m using the latest PyPI version of scipy, 1.4.1.

It might be related to this scipy.stats issue.
CC @fmassa


That scipy.stats issue definitely looks like the same one we're seeing here. In the meantime, I've just removed the weight initialization, since I always use the pretrained weights anyway.
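
For anyone who doesn't want to patch torchvision by hand: later torchvision releases expose an init_weights flag that skips this initialization when you only need the pretrained weights. This is hedged on your torchvision version actually supporting the argument, so check the signature first:

import torchvision

# Assumption: your torchvision version accepts init_weights on inception_v3 / Inception3
# (not every release does); it skips the scipy-based truncated-normal init.
model = torchvision.models.inception_v3(pretrained=True, init_weights=False)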


I am also experiencing the same issue in a tensorflow docker container. @ptrblck you can reproduce it with the following three steps:

  1. docker pull tensorflow/tensorflow:2.1.0-gpu-py3
  2. create a docker container from the image and run pip install torch torchvision inside the container.
  3. model = torchvision.models.inception_v3()

I think tensorflow is irrelevant to this issue, but the versions of the other packages in the container may help you find the root cause.
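
Inside the container, timing the constructor alone is enough to reproduce it (a minimal sketch of step 3; pretrained weights are not needed to see the slowdown):

import time
import torchvision

start = time.time()
model = torchvision.models.inception_v3()
print('inception_v3 constructed in {:.1f} seconds'.format(time.time() - start))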

The root cause was already isolated by @daveboat and linked in my last post: a performance regression in rvs calls in scipy.


Downgrading scipy from 1.4 to 1.3.3 fixed the issue for me; the loading time is much faster now. I just had to run pip install --upgrade scipy==1.3.3.
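
To double-check which scipy version is actually being picked up after the downgrade (a trivial sketch, in case several environments are installed):

import scipy
print(scipy.__version__)  # should print 1.3.3 after the downgrade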


@jitesh Thanks! That works.

No problem. Glad I could help!

I used scipy==1.2.0 to solve it.

Yeah, I guess that works as well.

Thanks @jitesh and @ptrblck for engaging in this discussion. What a relief.