How to make my models agnostic to FX

Alexander_Soare · July 14, 2021, 12:55pm

I’m working on an FX transform (or set of transforms) that should work across a whole library of CNN- and transformer-based models, for the purpose of feature pyramid extraction.

As such, I want to minimize (not necessarily eliminate) the overhead of changing existing models and of ensuring that future models are symbolically traceable.

I’m aware of the limitations of symbolic tracing and just wanted to see if there are ways I can get around them without touching the models (much).

For instance:

Tensor constructors with dynamic args. Many models might expect a fixed input size, so I tried using the concrete_args argument, but I found that this “concreteness” of the input gets lost after it passes through a non-custom module.
If I can get concrete_args working, could I just rerun the trace with multiple input sizes (whatever I might expect for a particular model) and thereby build up the various dynamic flow control paths that I would expect into the final IR graph?
Could I very specifically customize the tracer class for each model to just do whatever I need it to do at a particular problematic node?
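
To make the "rerun the trace" idea concrete, here is a minimal sketch of what I have in mind. It specialises on a plain Python argument rather than on a tensor shape, since that is the case I am sure concrete_args handles. The Head module, its downsample argument and the variants dict are all hypothetical names made up for illustration.

```python
import torch
import torch.fx
from torch import nn


class Head(nn.Module):
    """Hypothetical model whose forward branches on a plain Python argument."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, 3, padding=1)

    def forward(self, x, downsample):
        x = self.conv(x)
        # This branch cannot be traced symbolically unless `downsample`
        # is given a concrete value at trace time.
        if downsample:
            x = torch.nn.functional.avg_pool2d(x, 2)
        return x


model = Head().eval()

# Trace once per expected value of the non-tensor argument and keep one
# specialised GraphModule per value (an assumption about how the results
# would be organised, not an FX feature).
variants = {
    flag: torch.fx.symbolic_trace(model, concrete_args={"downsample": flag})
    for flag in (True, False)
}
```

Each entry in variants is a separate graph with one branch baked in, so this gives a lookup table of specialised graphs rather than a single IR graph containing both paths. Whether those could be merged is part of what I am asking.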

These are examples but overall my questions are: What types of things can I do to get around symbolic tracing limitations without modifying my models? Is it worth it or would I be better off putting more responsibility on the models? Will these answers change dramatically in the near future given developments to the FX library?
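
As one example of the "without modifying my models" route, here is a rough sketch of a custom tracer that treats a problematic submodule as a leaf, so FX records a single call_module node for it instead of tracing into its forward. DynamicCrop, Net and LeafTracer are hypothetical names for illustration; only torch.fx.Tracer, its is_leaf_module hook and torch.fx.GraphModule are real FX APIs.

```python
import torch
import torch.fx
from torch import nn


class DynamicCrop(nn.Module):
    """Hypothetical block with input-dependent control flow."""

    def forward(self, x):
        # Branching on a traced shape fails under default symbolic tracing.
        if x.shape[-1] > 4:
            return x[..., :4]
        return x


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.crop = DynamicCrop()

    def forward(self, x):
        return self.crop(self.conv(x))


class LeafTracer(torch.fx.Tracer):
    """Tracer that does not trace into the given module types."""

    def __init__(self, leaf_types):
        super().__init__()
        self.leaf_types = tuple(leaf_types)

    def is_leaf_module(self, m, module_qualified_name):
        if isinstance(m, self.leaf_types):
            return True
        # Fall back to the default rule (torch.nn modules are leaves).
        return super().is_leaf_module(m, module_qualified_name)


model = Net().eval()
graph = LeafTracer([DynamicCrop]).trace(model)
gm = torch.fx.GraphModule(model, graph)
```

The trade-off is that the leaf module stays opaque in the graph, so any feature I want to extract from inside it would not be reachable by the transform.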