In some Caffe prototxt files, I see convolution layers initialized like this:
layer {
  bottom: "conv1_1"
  top: "conv1_2"
  name: "conv1_2"
  type: "Convolution"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    weight_filler {
      type: "msra"
    }
    bias_filler {
      type: "constant"
    }
    num_output: 64
    pad: 1
    kernel_size: 3
  }
}
PyTorch, however, provides both kaiming_normal and kaiming_uniform. Which one, and with which arguments, initializes a convolution layer the same way as the Caffe definition above?
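
For reference, here is a minimal sketch of what I believe the closest PyTorch equivalent looks like. It rests on a few assumptions: that Caffe's "msra" filler with its default settings draws weights from a Gaussian with std = sqrt(2 / fan_in), that the "constant" filler defaults to 0 when no value is given, and that the input has 64 channels (the prototxt does not say how many channels conv1_1 produces, so `in_channels=64` is a placeholder). The in-place spellings `kaiming_normal_` and `constant_` are the current names in `torch.nn.init`.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the sketch reproducible

# in_channels=64 is an assumption; it must match the output of conv1_1.
conv = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, padding=1)

# Assumed counterpart of weight_filler { type: "msra" }:
# zero-mean Gaussian with std = sqrt(2 / fan_in).
nn.init.kaiming_normal_(conv.weight, mode="fan_in", nonlinearity="relu")

# Assumed counterpart of bias_filler { type: "constant" } with the default value 0.
nn.init.constant_(conv.bias, 0.0)

# Sanity check: compare the empirical std with the one expected from fan_in.
fan_in = conv.in_channels * conv.kernel_size[0] * conv.kernel_size[1]
expected_std = math.sqrt(2.0 / fan_in)
print(f"expected std {expected_std:.4f}, empirical std {conv.weight.std().item():.4f}")
```

The `lr_mult` and `decay_mult` entries control the per-parameter learning rate and weight decay rather than initialization, so they are left out of this sketch.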