Hi,
I’m working on a prototype change to address a previously raised issue: "How to support heterogeneous memories" (hardware-backends, PyTorch Developer Mailing List).
To do this, I am taking the approach mentioned in this post: adding an additional default argument to the tensor constructors (e.g. zeros()). I can make the changes to the APIs, native_functions.yaml, etc., but I end up hitting several issues.
I looked for previous work and found the commit that added pin_memory to TensorOptions, but a fair amount seems to have changed since then.
My approach so far has been to update all the TensorFactories methods, native_functions.yaml, TensorOptions and the torchgen code that parses/builds TensorOptions, and derivatives.yaml, plus various smaller changes to glue everything together.
Some of the issues I have hit are:
- Some of the methods I want to add this new argument to are listed in torchgen/aoti/fallback_ops.py. This triggers the error raised at torchgen/gen.py#L2424, and I am unsure how to address it. It seems my case would be (2), but I’m not clear on what the listed steps actually entail (e.g. how to add the shim, and what and where the version number to bump is).
- If I bypass the first issue by omitting these methods for now, I still hit other issues at the code generation stage of the build. For example, gen_python_functions.py hits an error where it finds the empty.out overload but can’t find a matching empty function. This seems to be because it doesn’t expect empty.out to have my extra argument, and can only find empty with the extra argument. But I’m not sure why this is occurring.
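As a minimal sketch of what the second issue looks like, here is a toy model of pairing an out overload with its functional variant by comparing argument lists. This is not torchgen code: the `Schema` class, `pairing_key`, `unmatched_out_variants`, and the argument name `memory_kind` are all hypothetical, and the assumption that pairing works on the argument list is only a guess at the behaviour described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Toy stand-in for an operator schema (hypothetical, not a torchgen type)."""
    name: str              # base operator name, e.g. "empty"
    overload: str          # overload name, "" for the functional variant
    args: tuple            # argument names that take part in matching
    is_out: bool = False   # True for the out= overload


def pairing_key(schema):
    # Assumption: variants are paired when the base name and the
    # argument names (excluding the out tensor itself) are identical.
    return (schema.name, schema.args)


def unmatched_out_variants(schemas):
    """Return the out overloads that have no functional variant to pair with."""
    bases = {pairing_key(s) for s in schemas if not s.is_out}
    return [s for s in schemas if s.is_out and pairing_key(s) not in bases]


# Both variants agree on their arguments, so they pair up.
before = [
    Schema("empty", "", ("size", "memory_format")),
    Schema("empty", "out", ("size", "memory_format"), is_out=True),
]

# Only the functional variant gained the hypothetical `memory_kind` argument,
# so the out overload no longer has a partner with the same argument list.
after = [
    Schema("empty", "", ("size", "memory_format", "memory_kind")),
    Schema("empty", "out", ("size", "memory_format"), is_out=True),
]

print(unmatched_out_variants(before))
print([s.overload for s in unmatched_out_variants(after)])
```

If the real generator pairs variants in a similar way, the new argument would have to appear consistently in whatever both signatures are compared on, but that is an assumption to check against the torchgen source rather than a confirmed diagnosis.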
Is there any advice or info on how to go about changing these fairly fundamental APIs? The code generation seems to complicate such changes a lot, even when the change is just adding a new default argument.
Please feel free to ask for any extra info that would be useful. I have tried not to include too much extraneous detail, for readability.