The Problem
Debugging torch.compile is painful. When compilation is slow or produces unexpected results, users face cryptic logs, scattered information, and no clear path forward. Graph breaks and recompilations are especially hard to diagnose.
Proposed Solution
I'd like to propose a lightweight diagnostic tool that automatically captures compilation events (graph breaks, recompilations, timing) and presents them in a structured, actionable format, either as a summary printout or as an interactive HTML report.
The goal is to answer: “Why did my model compile this way, and how do I fix it?”
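To make the "structured, actionable format" a bit more concrete, here is a minimal sketch of what the summary printout could be built on. Everything in it is an assumption for illustration: `CompileEvent`, its fields, and `summarize` are hypothetical names rather than existing PyTorch APIs, and the events are made up instead of being captured from a real compilation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CompileEvent:
    """One captured compilation event (hypothetical schema, not a PyTorch type)."""
    kind: str             # "compile", "graph_break", or "recompile"
    location: str         # user code location, e.g. "model.py:42"
    reason: str           # human-readable cause
    seconds: float = 0.0  # time attributed to this event, if known


def summarize(events):
    """Count events by kind and rank graph-break reasons by frequency."""
    counts = Counter(e.kind for e in events)
    total_seconds = sum(e.seconds for e in events)
    break_reasons = Counter(e.reason for e in events if e.kind == "graph_break")

    lines = [
        f"compiles: {counts['compile']}, graph breaks: {counts['graph_break']}, "
        f"recompiles: {counts['recompile']}",
        f"total time: {total_seconds:.2f}s",
    ]
    # Most frequent graph-break reasons first, so the biggest win is at the top.
    for reason, n in break_reasons.most_common():
        lines.append(f"  graph break x{n}: {reason}")
    return "\n".join(lines)


# Made-up events, for illustration only.
events = [
    CompileEvent("compile", "model.py:10", "initial compile", 1.5),
    CompileEvent("graph_break", "model.py:42", "data-dependent control flow"),
    CompileEvent("graph_break", "model.py:57", "data-dependent control flow"),
    CompileEvent("graph_break", "model.py:80", "unsupported builtin call"),
    CompileEvent("recompile", "model.py:10", "input shape changed", 0.75),
]

print(summarize(events))
```

In a real tool the events would come from the compiler itself rather than a hand-written list; grouping by reason is just one possible way to point users at the fix that removes the most graph breaks.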
Questions for Maintainers
- Is there interest in this kind of tooling?
- Should this live in torch._dynamo.diagnostics or elsewhere?
- Are there existing efforts I should align with?
Happy to share a detailed implementation plan if there’s interest. This would be my first PyTorch contribution.