I’m part of a larger team of researchers and I’m trying to establish how we can share a repository of pre-trained model weights. In general, we do not have unrestricted internet access.
As things currently stand, the PyTorch hub checks one of two possible locations for model weights: either $TORCH_HOME/hub/checkpoints, or the path provided to an earlier call to torch.hub.set_dir().
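For reference, the default location can be reproduced from the documented environment variables without importing torch at all; the helper name below is my own, but the resolution order mirrors what torch.hub.get_dir() does:

```python
import os

def default_hub_checkpoints_dir():
    # torch.hub.get_dir() resolves to $TORCH_HOME/hub, where TORCH_HOME
    # defaults to $XDG_CACHE_HOME/torch (itself defaulting to ~/.cache/torch).
    torch_home = os.environ.get(
        "TORCH_HOME",
        os.path.join(
            os.environ.get("XDG_CACHE_HOME", os.path.expanduser("~/.cache")),
            "torch",
        ),
    )
    return os.path.join(torch_home, "hub", "checkpoints")
```

So pointing TORCH_HOME at a shared read-only area almost works, except that every download then also targets that area.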
What I’m hoping to achieve is a system whereby “standard” models (such as those provided by torchvision) can be manually downloaded once into a shared area (which will generally be read-only for most users), and torch.hub.load_state_dict_from_url() would read them from that location when present. Additionally, individual developers may want to install models from third-party sources and have those stored in their personal user area (which has read/write permissions) for experimentation.
Currently each researcher has to make their own copy of the shared model weights, because there is no way to separate the models shared among researchers from those in their local sandbox.
One possible solution would be a colon-separated search path, configurable via environment variable, which would be searched sequentially for cached weights. Downloaded weights would be saved into the first such path which has the appropriate access rights.
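As a sketch of what I have in mind, the lookup side could be a thin wrapper around torch.hub. The environment variable name (TORCH_WEIGHTS_PATH) and all helper names here are my own invention, not anything torch currently supports:

```python
import os
from pathlib import Path

# Hypothetical variable: colon-separated list of cache directories,
# e.g. "/home/me/.torch-weights:/shared/torch-weights".
SEARCH_PATH_VAR = "TORCH_WEIGHTS_PATH"

def _search_dirs():
    """Split the colon-separated search path into candidate directories."""
    raw = os.environ.get(SEARCH_PATH_VAR, "")
    return [Path(p) for p in raw.split(":") if p]

def resolve_cached(file_name):
    """Return the first existing copy of file_name along the search path."""
    for d in _search_dirs():
        candidate = d / file_name
        if candidate.is_file():
            return candidate
    return None

def writable_cache_dir():
    """First directory on the search path that we may download into."""
    for d in _search_dirs():
        if d.is_dir() and os.access(d, os.W_OK):
            return d
    return None

def load_state_dict_from_url(url, **kwargs):
    """Drop-in replacement: consult the search path before torch.hub."""
    import torch  # deferred so the path logic above is usable without torch
    file_name = kwargs.get("file_name") or url.split("/")[-1]
    cached = resolve_cached(file_name)
    if cached is not None:
        return torch.load(cached, map_location=kwargs.pop("map_location", None))
    target = writable_cache_dir()
    return torch.hub.load_state_dict_from_url(
        url, model_dir=str(target) if target else None, **kwargs)
```

The read side satisfies the read-only shared area, and writable_cache_dir() sends new downloads to the first writable entry (typically the user’s sandbox, listed first). The main gap is that torchvision et al. call torch.hub.load_state_dict_from_url() directly, so this only helps code I control, which is why a mechanism in core would be preferable.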
Has anybody looked into setting things up this way, or are there any plans to introduce such a mechanism into PyTorch core in the future? As things stand it looks like I’d have to re-implement parts of torch.hub myself, but I imagine others have run into similar requirements at some point.