Dynamo cache limit and too many graph breaks

Previously I had gotten `torch.compile` with inductor max-autotune to work with good results. When I recently tried to reproduce those results, I kept hitting the cache size limit of 64. Is this related to too many "graph breaks"?
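For context, this is the knob I'm hitting. A minimal sketch of raising it, assuming a recent PyTorch 2.x build where `cache_size_limit` is the relevant Dynamo config setting (raising it only hides the symptom if the real problem is excessive recompilation):

```python
import torch
import torch._dynamo

# After this many recompiles of a single frame, Dynamo gives up
# and falls back to eager for that frame (default 64 in recent releases).
torch._dynamo.config.cache_size_limit = 256

@torch.compile(mode="max-autotune")
def scale(x):
    # Trivial compiled function just to show where the limit applies
    return x * 2
```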

I found a way to debug graph breaks by having Dynamo dump details about them. One break that I get in large numbers is a case where it complains about `torch.float16` being returned from a function. Note: this isn't a value of that dtype, but the dtype object itself.

What would it take for `_dynamo/variables/builder.py:wrap_fx_proxy_cls()` to be able to handle a simple constant that is just a type?

Also, there seem to be a lot of graph breaks due to `call_function`. I probably don't understand things well enough to ask the right question, other than to ask whether this could be supported.

Check out `torch._dynamo.explain()` to diagnose graph breaks: PyTorch 2.0 Troubleshooting — PyTorch master documentation

If you have a repro I can take a look