Topic | Replies | Views | Activity
About the torch.compile category | 0 | 1098 | January 9, 2023
A node type in export IR graph | 1 | 6 | November 28, 2024
Scaled_dot_product_attention higher head num cost much more memory | 1 | 10 | November 28, 2024
CUDA memory allocation for result tensor | 0 | 7 | November 26, 2024
Compile and vmap in custom op with quantile | 0 | 14 | November 25, 2024
Compiling vmapped custom op | 5 | 19 | November 25, 2024
Closures are being gc'd and causing failures to compile | 1 | 16 | November 24, 2024
Why does the inductor reduction Triton Codegen use the Welford algorithm instead of the Naive? | 1 | 18 | November 20, 2024
Image_process.postprocess slow after torch.compile | 0 | 17 | November 18, 2024
Compiling a method other than forward | 2 | 21 | November 19, 2024
Error module torchvision in CUDA 11.4 | 2 | 16 | November 19, 2024
Increased memory footprint with custom kernel and all reduce | 2 | 22 | November 18, 2024
The forward graphs captured by torch.export and aot_export_module are different | 2 | 31 | November 17, 2024
Dynamic slicing torch.export | 2 | 23 | November 16, 2024
Discrepancies Between Compiled and Non-Compiled Models with Convolutional Layers in PyTorch | 1 | 38 | November 16, 2024
Any chance to preserve some ops while decomposing PT2E model? | 1 | 13 | November 16, 2024
Multiple compiled versions of the same model | 2 | 33 | November 16, 2024
Torch.compile - what is the best scope of compilation? | 7 | 2235 | November 16, 2024
Dynamo Trace with Parameter Lifting | 1 | 22 | November 16, 2024
AOTInductor autograd support | 1 | 13 | November 16, 2024
Is it possible to ignore part of the code for torch compile | 4 | 31 | November 16, 2024
Inconsistent Results with torch.compile on Identical Environments and GPUs | 3 | 54 | November 8, 2024
Torch compile with forward-mode automatic differentiation | 2 | 15 | November 6, 2024
Profiling torch.compile CUDA code | 3 | 29 | November 5, 2024
Handling LSTM states as model's inputs/outputs using fx.symbolic_trace | 0 | 9 | November 5, 2024
How to limit torch.compile to CPU only? | 3 | 78 | November 1, 2024
Does max-autotune is useful for A5000? | 0 | 11 | October 31, 2024
What caused matrix inputs to no longer be transposed in PyTorch 2.5? | 0 | 14 | October 29, 2024
Torch.fx.symbolic_trace with multiple GPUs | 1 | 21 | October 29, 2024
CUDA 12.6 and torch.cuda.is_available return false | 7 | 533 | October 28, 2024