When I create a PyTorch model, how do I print the number of trainable parameters? Keras has such a feature, but I don’t know how to do it in PyTorch.
for parameter in model.parameters():
    print(parameter)
How does one make sure model.parameters() is not empty? It seems to be empty for me.
Can I see what you are trying to do? The parameters should not be empty unless you have something like:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

model = Model()
The above model has no parameters.
you can see it here: How does one make sure that the custom NN has parameters?
class NN(torch.nn.Module):
    def __init__(self, D_layers, act, w_inits, b_inits, bias=True):
        super(type(self), self).__init__()
        # activation func
        self.act = act
        # create linear layers
        self.linear_layers = [None]
        for d in range(1, len(D_layers)):
            linear_layer = torch.nn.Linear(D_layers[d-1], D_layers[d], bias=bias)
            self.linear_layers.append(linear_layer)
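Side note (not part of the original reply): in the snippet above the linear layers are kept in a plain Python list, so nn.Module never registers them and model.parameters() comes back empty. A minimal sketch of the usual fix, using torch.nn.ModuleList, might look like this:

import torch

class NN(torch.nn.Module):
    def __init__(self, D_layers, act, w_inits, b_inits, bias=True):
        super().__init__()
        self.act = act
        # nn.ModuleList registers each Linear layer as a submodule,
        # so its weights show up in model.parameters()
        self.linear_layers = torch.nn.ModuleList(
            torch.nn.Linear(D_layers[d - 1], D_layers[d], bias=bias)
            for d in range(1, len(D_layers))
        )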
I posted my response on your original question!
def get_n_params(model):
    pp = 0
    for p in list(model.parameters()):
        nn = 1
        for s in list(p.size()):
            nn = nn * s
        pp += nn
    return pp
To compute the number of trainable parameters:
import numpy as np

model_parameters = filter(lambda p: p.requires_grad, model.parameters())
params = sum([np.prod(p.size()) for p in model_parameters])
I like this solution!
To add my 50 cents, I would use numel() instead of np.prod() and compress the expression into one line:
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
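A quick sanity check of count_parameters() on a toy model (the layer sizes here are made up for illustration):

import torch.nn as nn

# two Linear layers: (10*5 + 5) + (5*1 + 1) = 61 trainable parameters
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 1))
print(count_parameters(model))  # 61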
Why does the sum of the model’s parameters increase after each iteration while training the model?
Provided the models are similar in Keras and PyTorch, the numbers of trainable parameters returned by PyTorch and Keras are different.
from torch import nn
from torchvision import models

# (the line that creates `a` is missing from the original post; presumably a torchvision ResNet)
a.fc = nn.Linear(512, 2)
count = count_parameters(a)
Now in Keras:
import keras.applications.resnet50 as resnet

model = resnet.ResNet50(include_top=True, weights=None, input_tensor=None,
                        input_shape=None, pooling=None, classes=2)
Total params: 23,591,810
Trainable params: 23,538,690
Non-trainable params: 53,120
Any reason why this difference in numbers pops up?
Hi Alex, well spotted. I never did this comparison.
One easy check is to compare the layers one by one (Linear, Conv2d, BatchNorm, etc.) and see if there’s any difference in the number of params; a rough sketch of that follows below.
However, I don’t think there will be any difference, provided that you pay attention to the sneaky default parameters.
After that, you can patiently compare the graphs layer by layer and see if you spot any difference. Maybe it’s a matter of omitted/shared biases in some of the layers.
Btw, the first test is also a good check for the count_parameters() function; let us know if you discover some unexpected behavior.
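For reference, here is a rough sketch (my own, not from the thread) of that layer-by-layer comparison on the PyTorch side, printing the parameter count each submodule owns so it can be lined up against the Keras model.summary() output:

def params_per_layer(model):
    # print how many parameters each module owns directly (recurse=False),
    # so per-layer counts can be compared against Keras' summary
    for name, module in model.named_modules():
        n = sum(p.numel() for p in module.parameters(recurse=False))
        if n > 0:
            print(f"{name or type(module).__name__}: {n}")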
Have you checked if they are the bias weights?
I guess this counts shared parameters multiple times, doesn’t it?
import torch
from models.modelparts import count_parameters

class tstModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.p = torch.nn.Parameter(
            torch.randn(1, 1, 1, requires_grad=True)
            .expand(-1, 5, -1)
        )

print(count_parameters(tstModel()))
If I understand correctly, expand just creates a tensor with 5 views of the same parameter, so the right answer should be 1, but I don’t know how to fix that.
Did anyone figure out a solution for shared parameters?
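Not an authoritative fix, but one workaround is to deduplicate parameters by their underlying data pointer, so a Parameter that is reused (tied) in several places is only counted once. Note that it would still report 5 for the expand() example above, because the expanded tensor’s numel() is 5 even though its storage holds a single value; handling views properly would require inspecting the storage itself.

def count_unique_parameters(model):
    # count each underlying tensor buffer only once, even if it is
    # shared between several parameters/modules (weight tying)
    seen = set()
    total = 0
    for p in model.parameters():
        if p.requires_grad and p.data_ptr() not in seen:
            seen.add(p.data_ptr())
            total += p.numel()
    return total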