Cpp_extension load for distributed environment

shaibagon · August 24, 2021, 11:03am

Hi,
I am running in a DistributedDataParallel environment, where each worker imports a JIT module and therefore calls torch.utils.cpp_extension.load(...). As a result, each worker compiles the JIT module from scratch. This takes ages. Is there a way to cache the compiled module so that the workers can share a single build?
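
To make the setup concrete, here is a rough sketch of the kind of pattern I have in mind: rank 0 loads (and compiles) first while the other ranks wait at a barrier, and only then do the others call load. This assumes that a later load call can reuse the build that rank 0 left in the extension build directory, which I have not verified for my setup. The helper name load_rank0_first and the extension name and source file in the usage comment are made up for illustration.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def load_rank0_first(
    rank: int,
    load_fn: Callable[[], T],
    barrier_fn: Callable[[], None],
) -> T:
    """Run load_fn on rank 0 first; all other ranks run it after the barrier.

    load_fn    -- zero-argument callable that loads (and, if needed, builds) the module
    barrier_fn -- zero-argument callable that blocks until every rank has reached it
    """
    module = None
    if rank == 0:
        # Rank 0 pays the compilation cost.
        module = load_fn()
    # Every rank waits here, so nobody else starts loading before rank 0 is done.
    barrier_fn()
    if rank != 0:
        # Assumption: this call finds rank 0's build and skips recompiling.
        module = load_fn()
    return module


# Hypothetical usage inside an already initialized process group
# ("my_ext" and "my_ext.cpp" are placeholder names):
#
#   import torch.distributed as dist
#   from torch.utils.cpp_extension import load
#
#   ext = load_rank0_first(
#       dist.get_rank(),
#       lambda: load(name="my_ext", sources=["my_ext.cpp"]),
#       dist.barrier,
#   )
```

On a multi-node job the build directory is probably local to each node, so the check would likely need the local rank on each node rather than the global rank.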