I don’t exactly understand what would be a good reason not to use mmap. It rather sounds like I should just always set this to True?
mmap loads the data from disk only when you slice the tensor (or use it in an operation).
It’s good or bad depending on the context.
Good if you are bounded by memory (you can lazily initialize the tensors). Note, though, that you cannot overwrite their values.
Bad if you have enough memory to host all the data you need, since then the lazy loading only adds overhead.
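The behavior described above can be sketched with Python’s standard `mmap` module, as a rough stand-in for what `torch.load(..., mmap=True)` does with the checkpoint file (the file name and size here are made up): a read-only mapping pulls pages from disk lazily on first access, and rejects writes.

```python
import mmap
import os
import tempfile

# Create a dummy "checkpoint" file (placeholder for a real weights file).
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(bytes(range(256)) * 16)  # 4 KiB of dummy data

with open(path, "rb") as f:
    # ACCESS_READ: pages are faulted in from disk on first access,
    # not all at once, and the mapping is read-only.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    print(mm[0])        # reading works (loads the containing page)
    try:
        mm[0] = 1       # writing fails: the mapping is read-only
    except TypeError as e:
        print("write rejected:", e)
    mm.close()
```

This is only an analogy for the underlying mechanism; torch’s own mmap-backed storages add more machinery on top.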
You mean I cannot use load_state_dict(..., assign=True) and then train that model on CPU, as the values would be read-only this way? I had not considered this.
With the default (assign=False), this is not a problem though, right?
And also, if the tensors are then moved to GPU anyway (model.to("cuda")), this also would not be a problem?
And also, if I did torch.load(..., map_location="cuda", mmap=True), this would also not apply?
Why? mmap would also load them into memory once they are needed.
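The distinction the questions above turn on can be illustrated with the same stdlib `mmap` stand-in (the torch calls themselves are not run here): copying data out of a read-only mapping, which is roughly what assign=False or model.to("cuda") does with the mapped storages, yields ordinary writable memory, while the mapping itself stays read-only.

```python
import mmap
import os
import tempfile

# Dummy "parameter" bytes standing in for a mapped checkpoint.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 64)

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Copying out of the read-only mapping (analogous to assign=False
    # copying into the model's own parameters, or .to("cuda") copying
    # to device memory) gives an independent, writable buffer.
    buf = bytearray(mm[:])
    mm.close()

buf[0] = 42          # fine: the copy is ordinary writable memory
print(buf[0])        # → 42
```

With assign=True the model keeps the mapped (read-only) storages themselves, which is why in-place updates on CPU would be the problematic case.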