When I create a PyTorch model, how do I print the number of trainable parameters? Keras has a feature for this, but I don’t know how to do it in PyTorch.
for parameter in model.parameters():
    print(parameter)
How does one make sure model.parameters() is not empty? It seems to be empty for me.
Can I see what you are trying to do? The parameters should not be empty unless you have something like:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

model = Model()

The above model has no parameters.
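For contrast, a minimal sketch of a model that does have parameters; assigning an nn.Module (or nn.Parameter) as an attribute in __init__ is what registers it:

import torch
from torch import nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.fc = nn.Linear(10, 2)  # registered, so parameters() is non-empty

model = Model()
print(len(list(model.parameters())))  # 2 (the weight and bias of self.fc)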
you can see it here: How does one make sure that the custom NN has parameters?
class NN(torch.nn.Module):
    def __init__(self, D_layers, act, w_inits, b_inits, bias=True):
        super(type(self), self).__init__()
        # activation func
        self.act = act
        # create linear layers
        self.linear_layers = [None]
        for d in range(1, len(D_layers)):
            linear_layer = torch.nn.Linear(D_layers[d-1], D_layers[d], bias=bias)
            self.linear_layers.append(linear_layer)
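In case it helps: the reason model.parameters() comes back empty here is most likely that the layers are kept in a plain Python list, so PyTorch never registers them as submodules. A minimal sketch of one way to fix it (assuming a simplified constructor, without the act/w_inits/b_inits arguments) is to use torch.nn.ModuleList:

import torch

class NN(torch.nn.Module):
    def __init__(self, D_layers, bias=True):
        super().__init__()
        # ModuleList registers every layer as a submodule, so
        # model.parameters() will include their weights and biases.
        self.linear_layers = torch.nn.ModuleList(
            torch.nn.Linear(D_layers[d - 1], D_layers[d], bias=bias)
            for d in range(1, len(D_layers))
        )

model = NN([3, 5, 2])
print(sum(p.numel() for p in model.parameters()))  # 32, no longer empty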
I posted my response on your original question!
def get_n_params(model):
    pp = 0
    for p in list(model.parameters()):
        nn = 1
        for s in list(p.size()):
            nn = nn * s
        pp += nn
    return pp
To compute the number of trainable parameters:
import numpy as np

model_parameters = filter(lambda p: p.requires_grad, model.parameters())
params = sum([np.prod(p.size()) for p in model_parameters])
I like this solution!
To add my 50 cents, I would use numel() instead of np.prod() and compress the expression into one line:
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
Even though the models should be similar in Keras and PyTorch, the number of trainable parameters returned differs between the two.
import torch
import torchvision
from torch import nn
from torchvision import models

a = models.resnet50(pretrained=False)
a.fc = nn.Linear(512, 2)
count = count_parameters(a)
print(count)

23509058
Now in Keras:

import keras.applications.resnet50 as resnet

model = resnet.ResNet50(include_top=True, weights=None, input_tensor=None, input_shape=None, pooling=None, classes=2)
model.summary()
Total params: 23,591,810
Trainable params: 23,538,690
Non-trainable params: 53,120
Any reason why this difference in numbers pops up?
Hi Alex, well spotted. I never did this comparison.
One easy check is to compare the layers one by one (Linear, Conv2d, BatchNorm, etc.) and see if there’s any difference in the number of params; there is a rough sketch of that below.
However, I don’t think there will be any difference, provided that you pay attention to the sneaky default parameters.
After that, you can patiently compare the graphs layer by layer and see if you spot any difference. Maybe it’s a matter of omitted/shared biases in some of the layers.
Btw, the first test is also a good check for the count_parameters() function, let us know if you discover some unexpected behavior.
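For what it’s worth, a minimal sketch of that per-layer check (my own helper, not anything built into PyTorch) could look like this; it prints the trainable parameter count of every leaf module so the output can be lined up against the Keras summary rows:

from torchvision import models

def per_layer_param_counts(model):
    # Walk all leaf modules (those without children) and print their
    # trainable parameter counts, roughly one line per Keras summary row.
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:
            n = sum(p.numel() for p in module.parameters() if p.requires_grad)
            if n > 0:
                print(f"{name:<30} {type(module).__name__:<12} {n}")

per_layer_param_counts(models.resnet50(pretrained=False))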
Have you checked if they are the bias weights?
I guess this counts shared parameters multiple times, doesn’t it?
import torch
from models.modelparts import count_parameters

class tstModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.p = torch.nn.Parameter(
            torch.randn(1, 1, 1, requires_grad=True)
            .expand(-1, 5, -1)
        )

print(count_parameters(tstModel()))

prints 5
If I understand correctly, expand just creates a tensor with 5 views of the same parameter, so the right answer should be 1. But I don’t know how to fix that.
did anyone figure out a solution for shared parameters?
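Not a full solution, but one heuristic is to deduplicate parameters by their underlying storage, so views/expanded tensors that alias the same memory are only counted once. A rough sketch (assumes a fairly recent PyTorch with Tensor.untyped_storage()):

import torch

def count_unique_parameters(model):
    # Sum the element counts of the *unique* underlying storages of all
    # trainable parameters, so aliased/expanded tensors are counted once.
    seen = set()
    total = 0
    for p in model.parameters():
        if not p.requires_grad:
            continue
        storage = p.untyped_storage()
        if storage.data_ptr() in seen:
            continue
        seen.add(storage.data_ptr())
        total += storage.nbytes() // p.element_size()
    return total

On the tstModel above this should return 1 instead of 5, but treat it as a heuristic: it counts whole storages, so a parameter that is only a small view into a larger buffer will pull in the full buffer size.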
So I get that, by default, Conv2d includes the bias. But I’m unclear as to why the biases are being included in requires_grad.
In [1]: conv_3 = nn.Conv2d(512, 256, kernel_size=3, bias=True)
In [2]: sum(p.numel() for p in conv_3.parameters())
Out[2]: 1179904
In [3]: sum(p.numel() for p in conv_3.parameters() if p.requires_grad)
Out[3]: 1179904
The bias is a trainable parameter, which requires gradients and is optimized in the same way as the weight parameter.
Do you have a use case where the bias is fixed to a specific value?
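If the goal were a fixed bias, one common approach (as far as I know) is to set its value and turn off its gradient, at which point the requires_grad filter in count_parameters() excludes it:

import torch
from torch import nn

conv_3 = nn.Conv2d(512, 256, kernel_size=3, bias=True)
with torch.no_grad():
    conv_3.bias.fill_(0.1)          # fix the bias to a constant value
conv_3.bias.requires_grad_(False)   # exclude it from training and from the count

print(sum(p.numel() for p in conv_3.parameters() if p.requires_grad))
# 1179648 - only the weights; the 256 bias terms are no longer counted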
Ah sorry, it was a conceptual error on my part: I had confused the idea of the bias being a constant term added alongside the weights with the bias being a fixed, constant value. Thanks for the clarification.
Just out of curiosity, is there an np.prod for PyTorch?
Well, there’s torch.prod, but unlike NumPy it accepts only tensors and does not accept tuples, lists, etc.
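For example, to reproduce np.prod(p.size()) you first have to wrap the shape in a tensor:

import numpy as np
import torch

p = torch.nn.Linear(3, 5).weight

print(np.prod(p.size()))                          # 15 - np.prod accepts the size tuple
print(torch.prod(torch.tensor(p.size())).item())  # 15 - torch.prod needs a tensor first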
You may find this useful: https://pypi.org/project/pytorch-model-summary/