Topic | Replies | Views | Activity
Bug? scaled_dot_product_attention slower than manual multiplication? | 1 | 3 | August 15, 2025
Using DistributedDataParallel with dataloader num_workers > 0 | 2 | 3788 | August 15, 2025
Make Image Classification faster | 0 | 1 | August 15, 2025
Performance of repeated torch::jit::load() calls | 1 | 38 | August 15, 2025
Avoidable error due to multiple links to libiomp5md.dll (during runtime by numpy and torch) | 0 | 4 | August 15, 2025
PPO with Categorical Action... help | 11 | 51 | August 15, 2025
CosTrader Env from scratch... and transform problem | 3 | 13 | August 15, 2025
Blace.ai - new C++ Inference SDK & Model Hub | 0 | 9 | August 14, 2025
Bug? RMSNorm slower than LayerNorm | 2 | 28 | August 14, 2025
Add more feature in last layer | 3 | 35 | August 14, 2025
Help with RTX 5090 | 11 | 2445 | August 14, 2025
Performance regression: torch.jit.trace() significantly slower on RTX 5090 than RTX 4060 (cu128 nightly) | 4 | 50 | August 14, 2025
Installation via Conda | 1 | 24 | August 14, 2025
I successfully built PyTorch from source on Windows 11 using CUDA 12.9 for the RTX 5070. Still working through a small runtime issue, but the full build process is documented here on GitHub: [https://github.com/DMorford/TorchinDesiree] | 1 | 74 | August 13, 2025
vLLM + openai/gpt-oss-20b on 3× RTX 3090 (CUDA 12.8) — FlashAttention Error | 0 | 17 | August 13, 2025
Retrieve per-class gradients without create_graph=True and retain_graph=True | 4 | 37 | August 13, 2025
Why pytorch is getting killed during training on larger dataset on AWS EC2 instances | 6 | 30 | August 13, 2025
[BUG] why does the C++ Libtorch performance slower than pytorch? (show the full code) | 0 | 15 | August 13, 2025
Help getting a project that requires torch==2.6.0 torchvision==0.21.0 on Blackwell | 1 | 14 | August 13, 2025
In multi-processing, when one process exits unexpectedly, how to get others out of hang? | 0 | 9 | August 13, 2025
What are liquid neural networks? | 1 | 25 | August 13, 2025
TrivialAugmentWide on segmentation | 2 | 355 | August 13, 2025
Forward autodiff : Multiplying by python float changes the dual dtype in some situations | 4 | 58 | August 13, 2025
Compiling from source a static PyTorch library | 4 | 48 | August 13, 2025
Cleanup for multiprocess DataLoader | 0 | 11 | August 13, 2025
"No kernel image is available for execution on this device" | 1 | 9 | August 12, 2025
Pytorch İnference | 2 | 20 | August 12, 2025
Numerical mismatch between flash_attn_qkvpacked_func and flex_attention with token-level mask | 1 | 10 | August 12, 2025
RNN isn't learning, unsure what I'm doing wrong | 10 | 63 | August 12, 2025
Request to Share the Survey with your Software Developers and Code Reviewers | 0 | 8 | August 12, 2025