Topic · Replies · Views · Last activity

- About the autograd category · 0 · 4036 · May 13, 2017
- How to get the version numbers of a Module's Parameters? · 5 · 845 · March 13, 2026
- Requires_grad becomes false after some operation · 3 · 51 · March 3, 2026
- Problem of freeze metrics after first epoch · 1 · 44 · February 28, 2026
- Function 'Scaled Dot Product Efficient Attention Backward0' returned nan values in its 0th output · 13 · 2127 · February 9, 2026
- RNN memory management: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation · 2 · 39 · February 6, 2026
- About the sign of gradients from token probability w.r.t. intermediate activations during inference · 1 · 32 · January 12, 2026
- Using `forward_pre_hook` to attribute CUDA OOMs to module execution context · 2 · 54 · January 8, 2026
- PyTorch CPU RAM Usage Grows Rapidly When Assembling Forces from CNN Output—How to Prevent Memory Leak? · 0 · 43 · December 11, 2025
- Simple extension of autograd saved tensor hook mechanism · 4 · 50 · December 10, 2025
- Does scaled_dot_product_attention's backward support reproduce · 0 · 33 · December 10, 2025
- How to make a manually changed loss work in backpropagation · 1 · 41 · December 3, 2025
- torch.autograd.Function and free function · 2 · 51 · December 3, 2025
- Autograd and dead-code elimination · 2 · 97 · November 18, 2025
- Does PyTorch muon optimizer supports 4D weights? · 1 · 111 · November 17, 2025
- Gradient ascent on some parameters while descent on others in a single model · 3 · 77 · November 11, 2025
- Batchnorm and back-propagation · 8 · 4075 · November 3, 2025
- How to debug origin of nans in gradient of custom module · 3 · 81 · October 29, 2025
- Optimizing a mask instead of weights · 2 · 72 · October 27, 2025
- PyTorch AD with non-python functions · 1 · 47 · October 22, 2025
- Brenier maps in Pytorch · 1 · 65 · October 18, 2025
- Updating Selected parameters in each epoch · 1 · 43 · October 17, 2025
- No grad & Autocast not working together · 1 · 58 · October 14, 2025
- How to do back propagation with loss = ||f_{\Theta + \Delta P}(X) - Y||^2 + ||\Delta P||^2 instead of the usual loss = || f_{\Theta + \Delta}(X) - Y ||^2? · 2 · 45 · October 14, 2025
- "RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [64, 1]], which is output 0 of AsStridedBackward0, is at version 3; expected version 2 instead. Hint: the backtrace further a…" · 10 · 33326 · October 9, 2025
- Defining loss that maximizes separation · 0 · 45 · October 1, 2025
- Ablation Hook I created seems to affect the calculations of the gradients on later layers · 5 · 83 · September 30, 2025
- How do I set the order of pytorch hooks? · 5 · 90 · September 29, 2025
- Updating tensors that are used in backpropagation but are not network parameters · 2 · 97 · September 20, 2025
- Making autograd saved tensors hooks specific to certain arguments · 8 · 304 · September 17, 2025