Hello,

I’m a bit confused about weight initialization. In my neural network I use: BatchNorm1d, Conv1d, ELU, MaxPool1d, Linear, Dropout and Flatten.

Now I think only Conv1d, Linear and ELU have weights, right? In particular:

Conv1d: has weights for the weighted sum it computes (the convolution kernel).

ELU: has alpha as a weight.

Linear: the weights are basically the transformation matrix.
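To double-check which of these layers actually carry parameters, I printed them (just a quick inspection snippet; the layer sizes are made up). Interestingly, BatchNorm1d shows up with a weight and bias too, and ELU with none, so maybe my list above is off?

```python
import torch.nn as nn

for layer in [nn.BatchNorm1d(8), nn.Conv1d(1, 8, 3), nn.ELU(),
              nn.MaxPool1d(2), nn.Linear(16, 4), nn.Dropout(), nn.Flatten()]:
    names = [name for name, _ in layer.named_parameters()]
    print(type(layer).__name__, names)
# BatchNorm1d ['weight', 'bias']
# Conv1d      ['weight', 'bias']
# ELU         []
# MaxPool1d   []
# Linear      ['weight', 'bias']
# Dropout     []
# Flatten     []
```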

**Question 1:**

Now all those weights need to be set to something in the beginning. I know that for symmetric activation functions (like tanh) one uses Xavier, and for things like ReLU (and I guess ELU) one uses Kaiming to set these weights. Correct?
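For concreteness, this is how I would apply the two schemes with `torch.nn.init` if my understanding is right (a sketch; the layer sizes are made up):

```python
import torch.nn as nn

# Xavier for a layer feeding a symmetric activation (e.g. tanh) ...
fc_tanh = nn.Linear(128, 64)
nn.init.xavier_uniform_(fc_tanh.weight, gain=nn.init.calculate_gain('tanh'))

# ... Kaiming for a layer feeding a ReLU-family activation.
fc_relu = nn.Linear(128, 64)
nn.init.kaiming_uniform_(fc_relu.weight, nonlinearity='relu')
```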

**Question 2:**

What initialization is used for the weights of Linear?

**Question 3:**

What initialization is used for Conv1d weights?

**Question 4:**

What are the default weights set by PyTorch? I guess they are:

- Linear: U(-sqrt(k), sqrt(k)) with k = 1 / in_features
- Conv1d: U(-sqrt(k), sqrt(k)) with k = groups / (C_in * kernel_size), where groups = 1 by default
- ELU: alpha = 1.0

Correct?
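For what it's worth, this is how I tried to sanity-check the defaults myself (the bound formulas are just my reading of the docs). The ELU part also suggests alpha is a plain attribute, not a learnable parameter:

```python
import math
import torch.nn as nn

# nn.Linear: docs say weights ~ U(-sqrt(k), sqrt(k)) with k = 1 / in_features.
lin = nn.Linear(100, 10)
bound_lin = math.sqrt(1 / 100)
assert lin.weight.min().item() >= -bound_lin
assert lin.weight.max().item() <= bound_lin

# nn.Conv1d: k = groups / (in_channels * kernel_size); groups = 1 here.
conv = nn.Conv1d(in_channels=8, out_channels=16, kernel_size=3)
bound_conv = math.sqrt(1 / (8 * 3))
assert conv.weight.min().item() >= -bound_conv
assert conv.weight.max().item() <= bound_conv

# nn.ELU: alpha defaults to 1.0, but it is not registered as a parameter.
elu = nn.ELU()
print(elu.alpha)               # 1.0
print(list(elu.parameters()))  # []
```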

**Question 5:**

Do people set the weights only at the beginning, or are there use cases where one re-initializes them during training?

**Question 6:**

What is the correct way to initialize weights?
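What I am currently doing is the `Module.apply` pattern below. This is a sketch under my own assumptions (Kaiming because of the ELUs, biases set to zero, a toy version of my architecture with made-up sizes), not a recipe I know to be right:

```python
import torch
import torch.nn as nn

def init_weights(m):
    # Kaiming for layers feeding into ELU (ReLU-family), biases to zero.
    # This is my current guess, not a known-good recipe.
    if isinstance(m, (nn.Conv1d, nn.Linear)):
        nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# Toy version of my architecture, sized for a length-64 input.
model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=3),  # -> (8, 62)
    nn.ELU(),
    nn.MaxPool1d(2),                 # -> (8, 31)
    nn.Flatten(),
    nn.Linear(8 * 31, 10),
)
model.apply(init_weights)            # recursively visits every submodule
```

Is this the recommended way, or should initialization happen inside the module's `__init__`?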

Thanks in advance