I have read a few threads and understand that most layers, including Conv2d and Linear, use kaiming_uniform_ initialization by default. With this default initialization, my VGG trains well.
I tried to reset these layers' initialization with my own initialize_weights function, which also uses nn.init.kaiming_uniform_. But when I train my VGG with the same hyperparameters, the loss goes to NaN.
Is there anything wrong with my code? Thanks for your help!
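import torch.nn as nn

def initialize_weights(layer):
    # Re-initialize Conv2d layers only; other layer types keep their defaults.
    if isinstance(layer, nn.Conv2d):
        nn.init.kaiming_uniform_(layer.weight, mode='fan_in', nonlinearity='relu')
        if layer.bias is not None:
            nn.init.constant_(layer.bias, 0)

# model is the VGG instance; apply() calls the function on every submodule.
model.apply(initialize_weights)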
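
One way to check whether the two setups really match is to compare the range the weights are drawn from. The sketch below is only arithmetic and does not call PyTorch. It assumes that kaiming_uniform_ samples from U(-b, b) with b = sqrt(3) * gain / sqrt(fan_in), and that the default Conv2d and Linear initialization passes a=sqrt(5), which selects the leaky_relu gain. Both assumptions should be verified against the installed PyTorch version. The helper names and the example layer shape are hypothetical.

```python
import math

def leaky_relu_gain(negative_slope):
    # Gain formula assumed for nonlinearity='leaky_relu' with slope a.
    return math.sqrt(2.0 / (1.0 + negative_slope ** 2))

def kaiming_uniform_bound(fan_in, gain):
    # Weights are assumed to be drawn from U(-bound, bound).
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std

# Hypothetical layer: 3x3 convolution with 64 input channels.
fan_in = 64 * 3 * 3

# Assumed default: kaiming_uniform_(weight, a=sqrt(5)).
default_bound = kaiming_uniform_bound(fan_in, leaky_relu_gain(math.sqrt(5)))

# The custom function above: nonlinearity='relu', so gain = sqrt(2).
custom_bound = kaiming_uniform_bound(fan_in, math.sqrt(2.0))

ratio = custom_bound / default_bound
print(default_bound, custom_bound, ratio)
```

If these assumptions hold, the custom bound is larger than the default one by a factor of sqrt(6), regardless of fan_in. The two initializations would then not be equivalent even though both call kaiming_uniform_, which may be worth ruling out before looking elsewhere for the NaN.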