It’s the C++ equivalent of PyTorch’s `with torch.no_grad():`. Layer initialization and grad guards are discussed further in this thread: Initialization in-place and `tensor.data`, and it’s also explained in the documentation here: Autograd mechanics — PyTorch 2.1 documentation
The implementations in torch.nn.init also rely on no-grad mode when initializing the parameters, so that the in-place updates to the parameters are not tracked by autograd.
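A minimal Python sketch of why the guard is needed: an in-place write to a leaf tensor that requires grad raises an error outside no-grad mode, but succeeds (untracked) inside it, which is the pattern torch.nn.init uses internally.

```python
import torch

w = torch.empty(3, requires_grad=True)

# Outside no-grad mode, an in-place write to a leaf that requires grad fails:
# w.normal_()  # RuntimeError: a leaf Variable that requires grad ...

# Inside no-grad mode the write is allowed and not recorded by autograd --
# the same trick torch.nn.init applies when filling parameters in-place.
with torch.no_grad():
    w.normal_(mean=0.0, std=0.02)

print(w.requires_grad)  # True: the flag is untouched, only tracking was paused
```

The C++ side achieves the same effect by putting a `torch::NoGradGuard` on the stack for the duration of the initialization.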