Hello,
I’m a bit confused about weight initialization. In my neural network I use BatchNorm1d, Conv1d, ELU, MaxPool1d, Linear, Dropout, and Flatten.
Now I think only Conv1d, Linear, and ELU have weights, right? In particular:
Conv1d: Has weights (the convolution kernels) for the weighted sums it computes.
ELU: Has alpha as a weight (or is alpha just a fixed hyperparameter?)
Linear: Weights are basically the transformation matrix (plus a bias vector).
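To sanity-check my assumption, I tried listing which of these layer types actually register learnable parameters (the layer sizes below are just arbitrary examples):

```python
import torch.nn as nn

# Arbitrary example sizes, just to construct one instance of each layer type
layers = {
    "BatchNorm1d": nn.BatchNorm1d(16),
    "Conv1d": nn.Conv1d(16, 32, kernel_size=3),
    "ELU": nn.ELU(),
    "MaxPool1d": nn.MaxPool1d(2),
    "Linear": nn.Linear(32, 10),
    "Dropout": nn.Dropout(0.5),
    "Flatten": nn.Flatten(),
}

for name, layer in layers.items():
    # named_parameters() yields only learnable (registered) parameters
    param_names = [p_name for p_name, _ in layer.named_parameters()]
    print(name, param_names)
# ELU, MaxPool1d, Dropout, and Flatten print empty lists
```

Interestingly, this prints an empty list for ELU, so maybe alpha isn’t learnable after all?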
Question 1:
Now all those weights need to be set to something in the beginning. I know that for symmetric activation functions (e.g. tanh) one uses Xavier initialization, and for things like ReLU (and I guess ELU) one uses Kaiming. Correct?
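For reference, this is how I would apply the two schemes by hand with `torch.nn.init` (the layer size is just an example; I’m not sure the `gain`/`nonlinearity` choices below are the recommended ones):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lin = nn.Linear(128, 64)  # example layer

# Xavier/Glorot: variance scaled for symmetric activations like tanh
nn.init.xavier_uniform_(lin.weight, gain=nn.init.calculate_gain("tanh"))

# Kaiming/He: compensates for ReLU-like activations zeroing half the inputs
nn.init.kaiming_uniform_(lin.weight, nonlinearity="relu")
nn.init.zeros_(lin.bias)
```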
Question 2:
What initialization is used for the weights of Linear?
Question 3:
What is used for Conv1d weights?
Question 4:
What are the default initializations used by PyTorch? I guess they are:
- Linear: U(-sqrt(k), sqrt(k)) with k = 1 / in_features
- Conv1d: U(-sqrt(k), sqrt(k)) with k = groups / (C_in * kernel_size), where groups = 1 by default
- ELU: alpha = 1.0
Correct?
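To check my guess, I tried reading the bounds off freshly constructed layers (the channel/feature counts are arbitrary; I’m assuming the documented formula U(-sqrt(k), sqrt(k)) for the defaults):

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

conv = nn.Conv1d(in_channels=16, out_channels=32, kernel_size=3)  # groups=1
lin = nn.Linear(in_features=64, out_features=10)

# Documented default bounds: U(-sqrt(k), sqrt(k))
k_conv = 1 / (16 * 3)  # k = groups / (C_in * kernel_size)
k_lin = 1 / 64         # k = 1 / in_features

print(conv.weight.abs().max().item() <= math.sqrt(k_conv))  # True
print(lin.weight.abs().max().item() <= math.sqrt(k_lin))    # True
print(nn.ELU().alpha)                                       # 1.0
```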
Question 5:
Do people set the weights only at the beginning, or are there use cases where one re-initializes them during training?
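One case I could imagine is re-initializing a single layer mid-training (e.g. resetting a classification head); I’m assuming `reset_parameters()` is the intended way to do that:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
head = nn.Linear(32, 4)  # hypothetical classification head
before = head.weight.clone()

# ... imagine some training steps happening here ...

# Re-draw the weights from the layer's default distribution
head.reset_parameters()
assert not torch.equal(before, head.weight)
```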
Question 6:
What is the correct way to initialize weights?
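This is the pattern I’ve been using so far, with `model.apply` and Kaiming for the conv/linear layers (since the net uses ELU) — no idea whether it’s “the” correct way; the model below is just a toy example:

```python
import torch.nn as nn

def init_weights(m):
    # Initialize only the weight-bearing layers; leave the rest alone
    if isinstance(m, (nn.Conv1d, nn.Linear)):
        nn.init.kaiming_uniform_(m.weight, nonlinearity="relu")
        if m.bias is not None:
            nn.init.zeros_(m.bias)

model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=3),
    nn.ELU(),
    nn.Flatten(),
    nn.Linear(8 * 30, 10),  # assumes an input length of 32
)
model.apply(init_weights)  # applies init_weights recursively to every submodule
```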
Thanks in advance