I couldn’t find any info on this online, but what are the default filters used by Conv2d?

I think filters are initialized with random values and they are updated via backpropagation.

Wait, the filter values are updated, meaning that they likely change every epoch?

As per my intuition, they should update every time we call loss.backward().

Am I in the right direction, experts?

`loss.backward()` calculates all gradients in the current computation graph (including the filter weight gradients). The weights are updated if you call `optimizer.step()` (having passed these parameters to the optimizer beforehand) or if you update the weights manually using the gradients.
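As a minimal sketch of that two-step flow (the layer shape, input size, and learning rate here are arbitrary placeholders): `loss.backward()` only fills in the `.grad` attributes, while `optimizer.step()` is what actually changes the filter values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(1, 2, kernel_size=3)
optimizer = torch.optim.SGD(conv.parameters(), lr=0.1)

x = torch.randn(1, 1, 8, 8)
before = conv.weight.clone()

loss = conv(x).pow(2).mean()
loss.backward()                          # gradients are computed here
print(conv.weight.grad is not None)      # True: filter gradients now exist

optimizer.step()                         # weights are updated here
print(torch.equal(conv.weight, before))  # False: the filters changed
```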

@modeler In fact, the weights change in every iteration if the gradients are non-zero.

What is the reasoning for a randomized initialization as opposed to some non-zero constant initialization?

I think symmetry breaking might be one reason, although the problem of equal outputs won’t be as serious as in linear layers.

E.g. if you initialize a linear layer with some constant weight, each output will have the same value. Later, in the backward pass, this would create the same weight updates for each parameter. Each weight thus cannot learn anything “new”, and you would end up with a whole layer of cloned neurons.

Random initialization breaks this symmetry.
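A small sketch of that symmetry argument (the sizes are arbitrary): with a constant initialization, every output unit of a linear layer computes the same value and receives the same gradient, so the rows of the weight matrix can never differentiate.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lin = nn.Linear(4, 3)
nn.init.constant_(lin.weight, 0.5)  # every weight identical
nn.init.constant_(lin.bias, 0.0)

x = torch.randn(2, 4)
out = lin(x)
# all output units compute the same value per sample
print(torch.allclose(out[:, 0], out[:, 1]))  # True

out.sum().backward()
# every row of the gradient is identical -> identical updates forever
print(torch.allclose(lin.weight.grad[0], lin.weight.grad[1]))  # True
```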

Also, I think another reason is that constant values might bias your model towards a particular solution, which could be useful if you know what you are doing.

In addition to Peter’s spot-on comments about symmetry breaking, there is the lottery ticket hypothesis, roughly speaking the theory that NNs (overparametrised by traditional standards) are “looking in many places of the parameter landscape, thereby picking up some useful ones”.

Weight initialization in particular is something that has been identified as fairly important, and I can recommend spending thought on it. PyTorch inherits the initializations mostly from Torch, and they might not always reflect the latest advice on how to do it. Most stock modules have a method `reset_parameters` that applies the default (e.g. run `?? torch.nn.Conv2d.reset_parameters` to see the source in IPython/Jupyter).

In contrast to the weight, the bias can, in my experience, often just be zeroed.
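To make this concrete, here is a small sketch: `reset_parameters()` re-applies the module’s default initialization, and you can override it afterwards with your own scheme. The Kaiming-normal choice below is just an example, and the exact default scheme depends on your PyTorch version, so do check the source.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(3, 8, kernel_size=3)
conv.reset_parameters()          # re-applies the default initialization

# Override with your own scheme afterwards, e.g. Kaiming-normal weights
# and a zeroed bias (as suggested above):
nn.init.kaiming_normal_(conv.weight, nonlinearity='relu')
nn.init.zeros_(conv.bias)
print(conv.bias.abs().sum().item())  # 0.0
```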

Best regards

Thomas

Hello ptrblck

I hope you are well. Sorry, I have a question about initializing the filters in PyTorch. How can I specify them randomly from a Gaussian distribution? Are they drawn from a Gaussian distribution by default?

The filters in `nn.Conv2d` are initialized by `reset_parameters`, as @tom mentioned. To initialize them with a Gaussian distribution, you could use `torch.nn.init.normal_`.

I used Conv3d. Where can I add this command in my code?

```
class ConvNetRedo1(nn.Module):
    def __init__(self, numf1, numf2, fz1, fz2, nn2, nn3):
        # numf1/numf2: number of filters in the first/second conv layer,
        # fz1/fz2: kernel sizes, nn2/nn3: hidden layer sizes
        super(ConvNetRedo1, self).__init__()
        self.numf1 = numf1
        self.numf2 = numf2
        self.fz1 = fz1
        self.fz2 = fz2
        self.nn2 = nn2
        self.nn3 = nn3
        self.layer1 = nn.Sequential(
            nn.Conv3d(1, self.numf1, kernel_size=self.fz1, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv3d(self.numf1, self.numf2, kernel_size=self.fz2, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        self.fc1 = nn.Linear(3072, self.nn2)
        self.drop_out1 = nn.Dropout(0.5)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(self.nn2, self.nn3)
        self.drop_out2 = nn.Dropout(0.5)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(self.nn3, 1)
```

You could write a `weights_init` function and call `model.apply` with it:

```
def weights_init(m):
    with torch.no_grad():
        if isinstance(m, nn.Conv3d):
            torch.nn.init.normal_(m.weight)
            torch.nn.init.normal_(m.bias)

net.apply(weights_init)
```

Is it correct to use it like this?

```
def weights_init(m):
    with torch.no_grad():
        if isinstance(m, nn.Conv3d):
            torch.nn.init.normal_(m.weight)
            torch.nn.init.normal_(m.bias)

class ConvNetRedo1(nn.Module):
    def __init__(self, numf1, numf2, fz1, fz2, nn2, nn3):
        super(ConvNetRedo1, self).__init__()
        self.numf1 = numf1
        self.numf2 = numf2
        self.fz1 = fz1
        self.fz2 = fz2
        self.nn2 = nn2
        self.nn3 = nn3
        self.layer1 = nn.Sequential(
            nn.Conv3d(1, self.numf1, kernel_size=self.fz1, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        net.apply(weights_init)
        self.layer2 = nn.Sequential(
            nn.Conv3d(self.numf1, self.numf2, kernel_size=self.fz2, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        net.apply(weights_init)
        self.fc1 = nn.Linear(3072, self.nn2)
        self.drop_out1 = nn.Dropout(0.5)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(self.nn2, self.nn3)
        self.drop_out2 = nn.Dropout(0.5)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(self.nn3, 1)

    def forward(self, x):
        x = x.unsqueeze(1).float()
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        out = self.drop_out1(out)
        out = self.relu1(out)
        out = self.fc2(out)
        out = self.drop_out2(out)
        out = self.relu2(out)
        out = self.fc3(out)
        return out
```

Your format is a bit broken, but it seems you are trying to call `net.apply` inside the model definition?

PS: you can add code snippets by wrapping them in three backticks (```).

Yes, I use `net.apply(weights_init)` after each CNN layer.

```
def weights_init(m):
    with torch.no_grad():
        if isinstance(m, nn.Conv3d):
            torch.nn.init.normal_(m.weight)
            torch.nn.init.normal_(m.bias)

class ConvNetRedo1(nn.Module):
    def __init__(self, numf1, numf2, fz1, fz2, nn2, nn3):
        # numf1/numf2: number of filters in the first/second conv layer,
        # fz1/fz2: kernel sizes, nn2/nn3: hidden layer sizes
        super(ConvNetRedo1, self).__init__()
        self.numf1 = numf1
        self.numf2 = numf2
        self.fz1 = fz1
        self.fz2 = fz2
        self.nn2 = nn2
        self.nn3 = nn3
        self.layer1 = nn.Sequential(
            nn.Conv3d(1, self.numf1, kernel_size=self.fz1, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        net.apply(weights_init)
        self.layer2 = nn.Sequential(
            nn.Conv3d(self.numf1, self.numf2, kernel_size=self.fz2, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        net.apply(weights_init)
        self.fc1 = nn.Linear(3072, self.nn2)
        self.drop_out1 = nn.Dropout(0.5)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(self.nn2, self.nn3)
        self.drop_out2 = nn.Dropout(0.5)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(self.nn3, 1)
```

Is it correct?

Ah OK, you don’t need to call this method after each layer. Just initialize your model and call it once via `model.apply(weights_init)` as shown in my example.

`model.apply` will recursively pass each module (and submodule, …) to the passed function.
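For illustration, a tiny sketch of that recursion (the layers are arbitrary): `apply` visits the children first and the container itself last, so an `isinstance` check inside the function is enough to target only, e.g., `nn.Conv3d` modules.

```python
import torch.nn as nn

visited = []

def record(m):
    # called once for every module and submodule
    visited.append(type(m).__name__)

model = nn.Sequential(nn.Conv3d(1, 4, 3), nn.ReLU(), nn.Linear(8, 2))
model.apply(record)
print(visited)  # ['Conv3d', 'ReLU', 'Linear', 'Sequential']
```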

Something like that?

```
def weights_init(m):
    with torch.no_grad():
        if isinstance(m, nn.Conv3d):
            torch.nn.init.normal_(m.weight)
            torch.nn.init.normal_(m.bias)

model = ConvNetRedo1(32, 64, (7, 7, 5), (5, 5, 3), 500, 100)
model.apply(weights_init)
model = model.cuda

for epoch in range(num_epochs):
    for i, data in enumerate(trainloader, 0):
        images, labels = data
        optimizer.zero_grad()
        outputs = model(images)
```

Yes, that looks generally right.

You should call `model = model.cuda()` as a method (with parentheses), but the initialization workflow looks correct.

PS: It looks like you are wrapping each line in single backticks. Just add ``` before and after the complete code block (or use the “Preformatted text” button).

I really appreciate your help

Excuse me, what is `net.apply(weights_init)` in your code? I did not use it. Is it correct, or did I miss it?

```
def weights_init(m):
    with torch.no_grad():
        if isinstance(m, nn.Conv3d):
            torch.nn.init.normal_(m.weight)
            torch.nn.init.normal_(m.bias)

net.apply(weights_init)
```