Hey there,
I keep hitting this warning, followed by a long set of compiler messages:
[2023-07-10 15:24:52,962] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64) function: 'forward' (/usr/local/lib/python3.10/dist-packages/transformers/models/t5/modeling_t5.py:452) reasons: ___check_obj_id(L['self'], 139679724322528) to diagnose recompilation issues, see https://pytorch.org/docs/master/compile/troubleshooting.html.
I checked the PyTorch docs:
https://pytorch.org/docs/stable/dynamo/troubleshooting.html#excessive-recompilation
and saw this:
torch._dynamo.config.cache_size_limit = <your desired cache limit>
Is there any guidance on how to choose the cache limit? Also, I'm not sure how to set this attribute, since I can't seem to reference _dynamo directly after just importing torch.
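For reference, here's what I've been trying in my training script (assuming the private module just needs an explicit import; the 128 is an arbitrary guess, since I don't know how to size it):

import torch
import torch._dynamo  # plain "import torch" didn't seem to expose torch._dynamo for me

# bump the recompile cache above the default of 64; 128 is an arbitrary guess
torch._dynamo.config.cache_size_limit = 128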
Not sure if it matters, but the model being trained was a t5-base via Hugging Face's Trainer on an NVIDIA A6000 Ada card (48 GB). The installation is straight from this Docker image: nvcr.io/nvidia/pytorch:23.06-py3
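In case it helps, here's a minimal sketch of the training setup (heavily simplified with a toy dataset; my real data pipeline is different, and I'm assuming torch_compile=True in TrainingArguments is what turns dynamo on inside the Trainer):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

# toy dataset so the sketch runs end to end; my real data is different
src = tokenizer(["translate English to German: Hello world."] * 16, padding=True)
tgt = tokenizer(["Hallo Welt."] * 16, padding=True)
train_dataset = [
    {"input_ids": i, "attention_mask": m, "labels": l}
    for i, m, l in zip(src["input_ids"], src["attention_mask"], tgt["input_ids"])
]

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    torch_compile=True,  # this is how I'm enabling torch.compile through the Trainer
)

Trainer(model=model, args=args, train_dataset=train_dataset).train()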
Thanks!