The recently released torch 2.0 works great; torch.compile() makes models run faster. I have tried a few experiments and observed that the first run takes time for compilation, and from the second run onward the model speeds up.
What is the reason behind this? Even if we instantiate the model again and run it from the start, the second time takes less time. Does torch.compile() store something in the backend?
If you are saying that "compiling" the model again takes less time, then I believe this is expected, as the compiled kernels are cached: PT 2.0 - Are compiled models savable - #2 by smth.
Is there a way to access this cache, and where can we do this from?
I would check if, e.g., torch._dynamo — PyTorch 2.0 documentation does what you are looking for.
The link above shows how to reset the cache; is there any way to save this cache and reuse it later, e.g., by pickling it?