Tackling Low GPU Kernel Occupancy During Loss Function Computation
|
|
1
|
69
|
April 7, 2025
|
Memory used by `autograd` when `torch.scatter` is involved
|
|
9
|
107
|
April 7, 2025
|
Image segmentation advice
|
|
4
|
107
|
April 7, 2025
|
Model() uses GPU but backwards() doesn't
|
|
2
|
141
|
April 7, 2025
|
Tuning a network, subset of data, one non-frozen layer
|
|
0
|
18
|
April 7, 2025
|
Question about communicator of P2P
|
|
0
|
35
|
April 7, 2025
|
How to save/load PyTorch model with custom Storage pickling
|
|
0
|
28
|
April 7, 2025
|
Problem with FSDP, custom gradient
|
|
0
|
84
|
April 6, 2025
|
ERROR: Could not find a version that satisfies the requirement pytorch-triton==2.3.1+958fccea74
|
|
0
|
204
|
April 6, 2025
|
unstable WGAN-GP gradients
|
|
1
|
78
|
April 6, 2025
|
How to use transformers and accelerate library to improve model performance
|
|
7
|
172
|
April 6, 2025
|
How to set the default device to torch::kCUDA?
|
|
5
|
1069
|
April 6, 2025
|
ExecuTorch: Variable-Length Inputs for Export and F.maxpool1d
|
|
0
|
18
|
April 6, 2025
|
The code aims to collect data about SiLU (Sigmoid Linear Unit) activation layers in a quantized YOLOv5 model. Specifically, it: Creates a custom SiLUDataCollector to replace SiLU layers Captures quantization parameters (scale and zero point) Saves quanti
|
|
1
|
66
|
April 6, 2025
|
How to customize a quantizer using fx
|
|
1
|
40
|
April 6, 2025
|
Question on quantize_per_channel and dequantize
|
|
5
|
128
|
April 6, 2025
|
Quantized LLM inference vs quantized matrix multiplication speed in CPU
|
|
2
|
107
|
April 6, 2025
|
What is the common base class for my module (holders) in libtorch?
|
|
2
|
58
|
April 5, 2025
|
AMP mixed precision in custom module: RuntimeError: hook '<lambda>' has changed the type of value
|
|
0
|
17
|
April 5, 2025
|
CUDA Error: an illegal memory access was encountered with ProArt 4080 Super
|
|
1
|
70
|
April 4, 2025
|
torch.cuda.device_count() returns 1 but torch.cuda.is_available() is False
|
|
1
|
143
|
April 4, 2025
|
Export PyTorch documentation into PDF form
|
|
5
|
2555
|
April 4, 2025
|
PyTorch with CUDA 12.6 and Ubuntu 24.04
|
|
3
|
2117
|
April 4, 2025
|
BatchMemoryManager with Opacus in Lightning
|
|
4
|
880
|
April 4, 2025
|
Cross_entropy_loss error in customised YOLO for predicting objects and attributes
|
|
0
|
30
|
April 4, 2025
|
Is there a way to visualize the gradient path of the back propagation of the entire network
|
|
7
|
15414
|
April 4, 2025
|
RuntimeError: CUDA error: device kernel image is invalid
|
|
3
|
646
|
April 4, 2025
|
Parity mismatch between torch and tensorRT for ultralytics yolo-v11 model
|
|
0
|
67
|
April 4, 2025
|
Forward Mode AD Example on MNIST
|
|
2
|
70
|
April 3, 2025
|
Reshaping tensors while using model parallelism
|
|
0
|
50
|
April 3, 2025
|