How do modules like `nn.Conv2d` allocate memory?

I’m working through the tutorials and want to understand how nn.Conv2d is created. It looks like we can pass a device param to the init function, and I’ve checked that this param creates the tensor on the GPU (rather than on the CPU, which is the default). What I’m wondering is: if I wanted to create an empty tensor for the layer instead of one filled with random values, how would I do that? It seems that by default the tensor is set up with random values.

A follow-up question: if I use the load_state_dict method to load a model into a GPU-allocated tensor, I’m assuming the method would just overwrite the values that were randomly set up?

PyTorch layers first allocate their parameters as empty (uninitialized) tensors and then call self.reset_parameters() to initialize them; for the conv layers, the weights are drawn from a Kaiming uniform distribution. See the init for _ConvNd in the torch.nn.modules.conv source page of the PyTorch 2.1 documentation.
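As a rough illustration of that allocate-then-initialize pattern, here is a simplified sketch. `TinyConv` is a hypothetical stand-in, not the real `_ConvNd` (it has no bias, stride, padding, or groups), and the `a=math.sqrt(5)` argument is my reading of the conv source rather than something stated above.

```python
import math

import torch
from torch import nn


class TinyConv(nn.Module):
    """Hypothetical, stripped-down layer that mimics the init pattern."""

    def __init__(self, in_ch, out_ch, k, device=None):
        super().__init__()
        # Step 1: allocate uninitialized memory on the requested device.
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, k, k, device=device))
        # Step 2: fill that memory with the default initialization.
        self.reset_parameters()

    def reset_parameters(self):
        # Kaiming uniform; with a=sqrt(5) the bound works out to 1/sqrt(fan_in).
        nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))


layer = TinyConv(3, 8, 3)
fan_in = 3 * 3 * 3  # in_ch * k * k
bound = 1 / math.sqrt(fan_in)
within_bound = bool(layer.weight.abs().max() <= bound + 1e-6)
```

So the random values you see are written in step 2; the allocation itself in step 1 does not initialize anything.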

As to your second question, that is correct. load_state_dict overwrites the model's existing parameter values with the values from the saved state.
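A minimal sketch of that overwrite, assuming a single conv layer as the "model" and an in-memory state dict standing in for a checkpoint loaded from disk (the `saved` and `model` names are just for illustration). It falls back to the CPU when no GPU is available.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for a trained model; in practice the state dict would
# usually come from torch.load(...) on a checkpoint file.
saved = nn.Conv2d(3, 16, kernel_size=3)
state = saved.state_dict()

# Freshly constructed layer: its parameters start out randomly initialized.
model = nn.Conv2d(3, 16, kernel_size=3, device=device)
differs_before = not torch.equal(model.weight.detach().cpu(), saved.weight.detach())

# Loading replaces the random values with the saved ones.
model.load_state_dict(state)
matches_after = torch.equal(model.weight.detach().cpu(), saved.weight.detach())
```

After the load, the parameters still live on the device the layer was created on; only their values have changed.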

To skip the parameter initialization, and so avoid wasting compute on values you will overwrite anyway, you could use torch.nn.utils.skip_init.
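Something along these lines should work as a sketch; the layer sizes are arbitrary, and the `checkpoint` here is faked from a second layer rather than read from a file.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Construct the layer without running its parameter initialization.
# The positional and keyword arguments after the class are forwarded
# to nn.Conv2d; the weights contain arbitrary uninitialized memory.
conv = nn.utils.skip_init(nn.Conv2d, 3, 16, kernel_size=3, device=device)

# Placeholder for a real checkpoint (normally torch.load(...)).
checkpoint = nn.Conv2d(3, 16, kernel_size=3).state_dict()

# Only now do the parameters receive meaningful values.
conv.load_state_dict(checkpoint)
```

Until load_state_dict runs, don't use the layer for anything: the values in its parameters are whatever happened to be in memory.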