Does model.parameters() return the parameters in topologically sorted order?

Goal: to list the model parameters in the order of their execution during the forward pass, i.e. from the input layer to the output layer (their topological order in the computation graph).

Does the following guarantee that the parameters are traversed in the topologically sorted order of their execution?

for name, param in model.named_parameters():
    print(name, param.shape)

No, this will print the parameters in the order in which they were registered:

import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # registration order (fc3, fc2, fc1) deliberately differs
        # from the execution order in forward (fc1, fc2, fc3)
        self.fc3 = nn.Linear(1, 1)
        self.fc2 = nn.Linear(1, 1)
        self.fc1 = nn.Linear(1, 1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

model = MyModel()
for name, param in model.named_parameters():
    print(name)

fc3.weight
fc3.bias
fc2.weight
fc2.bias
fc1.weight
fc1.bias

You could try to register the layers in the same order in which they are executed in the forward method (if that’s possible for your model), e.g. as sketched below.
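For the toy model above, such a reordering could look like this (the class name is just for illustration):

import torch.nn as nn
import torch.nn.functional as F

class MyModelOrdered(nn.Module):
    def __init__(self):
        super().__init__()
        # registered in execution order, so named_parameters()
        # yields fc1, fc2, fc3
        self.fc1 = nn.Linear(1, 1)
        self.fc2 = nn.Linear(1, 1)
        self.fc3 = nn.Linear(1, 1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)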


Is there any guaranteed way to traverse the parameters in topological order? Any methods?

I’m really unsure so I’ll just post some ideas.

Since the computation graph is created during the forward pass, it might of course be different for each pass (e.g. if you are using data-dependent conditions).
You could try to use the output’s grad_fn and call into grad_fn.next_functions to crawl the graph.
However, this would yield the operations, not necessarily the layers, and I’m not sure how hard it would be to create the mapping.
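A minimal sketch of such a crawl (the helper name is made up; note it visits autograd nodes, not modules):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
out = model(torch.randn(1, 4))

def crawl(fn, seen=None):
    # walk backwards from the output's grad_fn through next_functions
    seen = set() if seen is None else seen
    if fn is None or fn in seen:
        return
    seen.add(fn)
    print(type(fn).__name__)  # operation name, e.g. AddmmBackward0
    for next_fn, _ in fn.next_functions:
        crawl(next_fn, seen)

crawl(out.grad_fn)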

Hi, I’m facing the same issue. I need to apply the forward pass manually (layer by layer) without calling the forward method; I only have access to the class instance of the net. Is it possible to do so?
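For a purely sequential model I imagine something like this sketch could work (it assumes the registration order matches the execution order and that forward uses no functional ops):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
x = torch.randn(1, 4)

# apply the layers one by one instead of calling model(x)
for layer in model.children():
    x = layer(x)
print(x.shape)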

@ptrblck Can we assume that the order returned by parameters() is always fixed every time we call it?

I think so, as the internal _parameters attribute uses an OrderedDict, as seen here.
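A quick sanity check along those lines (just a sketch):

names_first = [name for name, _ in model.named_parameters()]
names_second = [name for name, _ in model.named_parameters()]
assert names_first == names_second  # iteration order is deterministic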


@ptrblck Could you please assist me in understanding how the sum of parameters is calculated? My model is:

self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1)
self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1)
self.conv3 = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
self.fc1 = nn.Linear(2048, 128)
self.fc2 = nn.Linear(128, 10)

It has 356234 parameters; I got this number using numel(). How is that calculated?
Also, how can we get the weights of a CNN model into a list? Do we need to access each layer’s value separately, like models[i].conv1.weight.data?

The number of parameters can be calculated by iterating over all parameters and accumulating their number of elements, which seems to be the approach you have used. PyTorch does not provide a built-in method for this, so I don’t know what exactly you are running.
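For reference, the count can be reproduced by hand from the standard Conv2d and Linear shapes (out_channels * in_channels * kH * kW + out_channels for a conv, out_features * in_features + out_features for a linear layer):

conv1 = 32 * 1 * 3 * 3 + 32      # 320
conv2 = 64 * 32 * 3 * 3 + 64     # 18496
conv3 = 128 * 64 * 3 * 3 + 128   # 73856
fc1 = 128 * 2048 + 128           # 262272
fc2 = 10 * 128 + 10              # 1290
print(conv1 + conv2 + conv3 + fc1 + fc2)  # 356234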

You could either access them directly, as seen in your code snippet, or iterate named_parameters() and check if the name contains the string "weight".
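A minimal sketch of the latter approach:

weights = [param.detach() for name, param in model.named_parameters()
           if "weight" in name]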

Thanks @ptrblck for your quick reply. I was adding the parameters through:

total_params = sum(param.numel() for param in model.parameters())

Weights are part of the parameters, right? I mean, does that mean there are 356234 weights in the model?

Yes, .parameters() includes the weight and bias parameters of the model, which are either registered directly or in submodules.

It means there are 356234 trainable values in your model. The naming might be ambiguous here, but you can also say your model contains 356234 trainable parameters.

But then why, if I print the length of models[i].conv1.weight.data, does it return 32, which is actually the number of neurons of that particular layer?

len() returns the size of the first dimension only (here the 32 output channels), not the number of elements. The sum of all parameters is ~356k, while each layer separately has fewer weights.
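To illustrate the difference, using the conv1 layer from above:

print(len(model.conv1.weight))     # 32  -> size of dim 0 (out_channels)
print(model.conv1.weight.numel())  # 288 -> 32 * 1 * 3 * 3 elements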


Thanks @ptrblck for the information. I have been trying to fetch the weights and biases of a model separately, roughly like this (pseudocode):

for k in local_model.keys():
    # print("client_models[i].state_dict()", client_models[i].state_dict()[k])
    if "weight" in k:
        # detach, flatten(), convert to numpy, append to local_dict
        pass
    if "bias" in k:
        # detach, flatten(), convert to numpy, append to local_dict1
        pass
# hstack local_dict and local_dict1 to get all weights and biases

Now I am passing the result to neighbors_fit = neighbors.fit(local_dict2) to get the epsilon value for DBSCAN clustering, but it fails with:

TypeError: only size-1 arrays can be converted to Python scalars
ValueError: setting an array element with a sequence.

How can I stack the weights and biases and fit them to neighbors.fit() without getting this error?
My model is as follows:
Net1(
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (convblock1): Sequential(
    (0): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Dropout(p=0.025, inplace=False)
  )
  (convblock2): Sequential(
    (0): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Dropout(p=0.025, inplace=False)
  )
  (trans1): Sequential(
    (0): Conv2d(16, 4, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (1): BatchNorm2d(4, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Dropout(p=0.025, inplace=False)
  )
  (convblock3): Sequential(
    (0): Conv2d(4, 8, kernel_size=(3, 3), stride=(1, 1), bias=False)
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Dropout(p=0.025, inplace=False)
    (4): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), bias=False)
    (5): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (6): ReLU()
    (7): Dropout(p=0.025, inplace=False)
    (8): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), bias=False)
    (9): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (10): ReLU()
    (11): Dropout(p=0.025, inplace=False)
  )
  (gap): Sequential(
    (0): AvgPool2d(kernel_size=8, stride=8, padding=0)
  )
  (convblock5): Sequential(
    (0): Conv2d(16, 10, kernel_size=(1, 1), stride=(1, 1), bias=False)
  )
)

Your code is not formatted properly and quite unreadable. Could you post a minimal and executable code snippet reproducing the issue?

It will be hard for me to share the entire snippet… However, could you guide me on how to reshape the extracted weights and biases so that they don’t throw the error in:

local_dict2 = np.hstack((local_dict, local_dict1))

where local_dict holds the extracted weights and local_dict1 the extracted biases.

TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  line 436, in
    neighbors_fit = neighbors.fit(local_dict2)
  File "/home/geetanjli/.local/lib/python3.9/site-packages/sklearn/neighbors/_unsupervised.py", line 175, in fit
    return self._fit(X)
  File "/home/geetanjli/.local/lib/python3.9/site-packages/sklearn/neighbors/_base.py", line 444, in _fit
    X = self._validate_data(X, accept_sparse="csr", order="C")
  File "/home/geetanjli/.local/lib/python3.9/site-packages/sklearn/base.py", line 577, in _validate_data
    X = check_array(X, input_name="X", **check_params)
  File "/home/geetanjli/.local/lib/python3.9/site-packages/sklearn/utils/validation.py", line 856, in check_array
    array = np.asarray(array, order=order, dtype=dtype)
ValueError: setting an array element with a sequence.
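For reference, sklearn’s fit expects a rectangular 2-D array of shape (n_samples, n_features); a ragged list of differently sized arrays triggers exactly this "setting an array element with a sequence" error. A minimal sketch of one way to build such an input (one row per model; the names are hypothetical):

import numpy as np

def flatten_params(state_dict):
    # flatten every parameter tensor into one 1-D vector so that
    # all models yield rows of identical length
    return np.concatenate([v.detach().cpu().numpy().ravel()
                           for v in state_dict.values()])

# X = np.stack([flatten_params(m.state_dict()) for m in client_models])
# neighbors_fit = neighbors.fit(X)  # X.shape == (n_models, n_params)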