It could explain an initially different loss value, but I'm not sure it would also explain why the loss starts out the same and only diverges later.
You can use the torch.nn.init methods via model.apply, e.g. as shown here.
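A minimal sketch of that pattern (the model architecture and the choice of Xavier init are just placeholders for illustration):

```python
import torch.nn as nn

# Hypothetical model, only for demonstration
model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 2),
)

def init_weights(m):
    # Initialize every Linear layer; swap in any other
    # torch.nn.init method (e.g. kaiming_uniform_) as needed
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

# model.apply calls init_weights recursively on every submodule
model.apply(init_weights)
```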