Local "slowdown" with Neural Style Transfer Tutorial

Hello, I’m taking my first steps with the official PyTorch NST tutorial. It runs great on Colab, also after moving to Gatys’ layers, with Adam and L1 loss.
The moment I try to run it locally on Jupyter/Win10/GTX 1650 Ti (4 GB), it seems to freeze at “Optimizing…” without any error. run[0] does increase in the closure loop if I print it.
Any ideas what is happening here? Has anyone encountered this before? I don’t understand why it happens. Thanks in advance!
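For context, the progress print I mean sits inside the LBFGS closure, roughly like this (a simplified sketch with a dummy loss standing in for the tutorial’s style/content losses):

import torch

# Simplified version of the tutorial's optimisation loop: LBFGS steps over a
# closure, and run[0] counts how often the closure has been evaluated.
input_img = torch.randn(1, 3, 128, 128, requires_grad=True)
optimizer = torch.optim.LBFGS([input_img])

run = [0]
while run[0] <= 300:
    def closure():
        optimizer.zero_grad()
        loss = (input_img ** 2).mean()   # placeholder for the style + content loss
        loss.backward()
        run[0] += 1
        if run[0] % 50 == 0:
            print("run {}: loss {:.4f}".format(run[0], loss.item()))
        return loss
    optimizer.step(closure)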

OK, my bad, it actually IS running, just really slowly. Is my card that bad? Could it be a memory shortage (it uses only 2.5 GB of the 4…)? Any other read/write management you would advise?
Thanks for any suggestions! BR Andy

Without seeing the code that you use, it is hard to diagnose what is wrong.


Hi, thank you for looking into it. Basically, the original code from the PyTorch NST tutorial runs around 5 times slower locally (Colab + GPU vs. local 1650 Ti), and I don’t get why.
The official code is here: Neural Transfer Using PyTorch — PyTorch Tutorials 1.9.0+cu102 documentation
Thanks, br

Did you try printing out device to verify that your local code is actually running on the GPU?
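For example, a minimal check (device here is the variable the tutorial defines near the top; the last line assumes at least one CUDA device is present):

import torch

print(torch.cuda.is_available())      # should be True
print(torch.cuda.current_device())    # active device index
print(torch.cuda.get_device_name(0))  # e.g. "GeForce GTX 1650 Ti"

# The tutorial builds `device` like this; every model and tensor should be
# moved to it, otherwise the run silently falls back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)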


Yes, checked: CUDA is available, the active device number and its name are correct, and I checked in the system manager that the card’s memory gets used (2.5 GB of four). A Google search for performance suggests, if I read the benchmark scores correctly, that the 1650 Ti should be 3–4 times faster than the Tesla on Colab…
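For reference, the same memory numbers can also be read from PyTorch itself, roughly like this (a minimal sketch, assuming the default CUDA device is the 1650 Ti):

import torch

dev = torch.device("cuda")
props = torch.cuda.get_device_properties(dev)

# Memory actually held by tensors vs. memory reserved by the caching allocator.
allocated_gb = torch.cuda.memory_allocated(dev) / 1024**3
reserved_gb = torch.cuda.memory_reserved(dev) / 1024**3
total_gb = props.total_memory / 1024**3

print("{}: {:.2f} GB allocated, {:.2f} GB reserved, {:.2f} GB total".format(
    props.name, allocated_gb, reserved_gb, total_gb))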

You could try finding the step that is taking the most time, using code similar to the following:

from time import time

start_time = time()
# Code suspected of being slow goes here
print('elapsed time: {:.2f}s'.format(time() - start_time))

Once you narrow the slowdown to a specific line, you can investigate it better.
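One caveat if the slow part runs on the GPU: CUDA calls are asynchronous, so for honest numbers it can help to synchronize before reading the clock, for example:

import torch
from time import time

torch.cuda.synchronize()   # wait for pending GPU work before starting the clock
start_time = time()
# GPU code suspected of being slow goes here
torch.cuda.synchronize()   # make sure the GPU has actually finished
print('elapsed time: {:.2f}s'.format(time() - start_time))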


Hello, I had a lot of other tasks, so it took quite a while. Thanks again for looking into it.
Having measured it and cleaned up as much as possible, it seems not as bad as it was.
Thanks again. BR.

Timings, FYI, on the same task (10000 × NST, 512×512 image):
Tesla K80, Colab: 3524 s (will depend on shared usage…)
GTX 1650 Ti, 4 GB, notebook: 2230 s
GTX 1080 Ti, 11 GB, as eGPU over Thunderbolt 3: 580 s
