Why is autograd on by default?

Sorry if this is an FAQ or completely irrelevant in 2022, but I’m wondering why autograd is enabled by default. Wouldn’t it be better to have it off by default and enabled only when needed? In that case, any time you called backward() without having enabled it, you would get a very clear error message telling you what you need to do. Yes, a lot of people would hit it, but there would be essentially zero confusion on the issue, and people would learn to do the right thing very fast.

Today, by contrast, I feel like I’m always chasing down memory leaks caused by autograd being enabled when I don’t want it. When anybody accidentally leaves autograd on during an evaluation, their workstation runs out of memory and becomes unresponsive, needing a hard reboot. In the big code bases we work in, it seems like the only way I can be sure there aren’t memory leaks is to turn off autograd when the program first loads, and explicitly turn it on when I need it.
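For concreteness, the workaround I mean looks roughly like the sketch below. It is only a minimal illustration: the `Linear` model and the random input are hypothetical stand-ins, not code from our actual projects. It uses `torch.set_grad_enabled(False)` once at startup and `torch.enable_grad()` around the code that really needs gradients.

```python
import torch

# Turn gradient tracking off as early as possible in the program.
torch.set_grad_enabled(False)

# Hypothetical stand-ins for a real model and a real batch.
model = torch.nn.Linear(4, 1)
x = torch.randn(8, 4)

# Evaluation: no graph is recorded, so intermediate activations
# are not kept alive waiting for a backward pass.
eval_out = model(x)

# Training: opt in explicitly, only for this block.
with torch.enable_grad():
    loss = model(x).pow(2).mean()
    loss.backward()

# Calling backward() on something computed with autograd off fails loudly.
try:
    eval_out.sum().backward()
    backward_failed = False
except RuntimeError:
    backward_failed = True
```

As far as I understand, the grad mode is thread-local, so code running in other threads may still need its own call to disable it.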

I realize it would be a big breaking change at this point. But I’m wondering if there are good reasons I don’t see for having it on by default? I see the genuine pain this causes today, and it really seems the other option would be much easier to manage.

Cheers