When using torch.compile twice on the same model on the same machine, is there a cache of optimized operations?

I’m compiling a torch model with torch.compile, using hidet as the backend:

import torch
import hidet

# map_location already places the weights on self.device
self.model = torch.load(saved_model_path, map_location=self.device)
self.model.eval()
self.model.half()

# Configure hidet to use tensor cores and widen the kernel tuning search space
hidet.torch.dynamo_config.use_tensor_core(True)
hidet.torch.dynamo_config.search_space(2)
self.model = torch.compile(self.model, backend="hidet")
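
For context, torch.compile itself returns quickly; the first inference call is what actually triggers the tracing and kernel tuning. This is roughly how I time it (the input shape here is hypothetical, just for illustration):

import time

# hypothetical input; replace with the shape your model actually expects
x = torch.randn(1, 3, 16, 112, 112, device=self.device, dtype=torch.half)

start = time.perf_counter()
with torch.no_grad():
    self.model(x)  # first call: dynamo tracing + hidet kernel tuning
print(f"first inference took {time.perf_counter() - start:.1f}s")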

The first time I ran this on a remote machine, many operations were skipped because the hidet backend doesn’t support them yet:

[2024-02-03 12:48:24,408] torch._dynamo.convert_frame: [WARNING]     raise NotImplementedError("\n".join(lines))
[2024-02-03 12:48:24,408] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='hidet' raised:
[2024-02-03 12:48:24,408] torch._dynamo.convert_frame: [WARNING] NotImplementedError: The following modules/functions are not supported by hidet yet:
[2024-02-03 12:48:24,408] torch._dynamo.convert_frame: [WARNING]   torch.nn.AvgPool3d

but many others were compiled successfully. After a while, the optimization finished and inference ran.

Then I ran it again on the same machine, and this time I only saw the warnings about the skipped operations; the lengthy optimization step didn’t happen. Why is that? Is there some sort of cache, so that the optimizations aren’t performed again for the same model?
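
In case it helps others reproduce this: here is a minimal sketch of how one might check whether hidet keeps compiled kernels on disk between runs. It assumes hidet.option.get_cache_dir() and hidet.option.cache_dir() behave as in hidet’s option API; I haven’t verified this against every hidet version:

import os
import hidet

# Directory where hidet is assumed to store tuned kernels between runs
cache_dir = hidet.option.get_cache_dir()
print("hidet cache dir:", cache_dir, "exists:", os.path.isdir(cache_dir))

# Pointing hidet at a fresh, empty directory before compiling should
# force re-tuning if an on-disk cache is indeed what makes the second
# run fast:
# hidet.option.cache_dir("/tmp/hidet_cache_test")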