Training with batch_size = 1: all outputs are identical and the model trains poorly

I thought it might be the same issue as this thread (Outputs from a simple DNN are always the same whatever the input is), but model.state_dict() shows the weights and biases are all on a similar scale, so they don't look degenerate. @ptrblck could you lend a hand?
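For reference, this is the kind of minimal check I mean by "all outputs are the same" — two different inputs producing (nearly) identical outputs. The model below is just a placeholder MLP, not my actual network:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder model standing in for the real DNN
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
model.eval()

# Two distinct batch_size = 1 inputs
x1 = torch.randn(1, 10)
x2 = torch.randn(1, 10)

with torch.no_grad():
    y1 = model(x1)
    y2 = model(x2)

# If this prints True, the model is collapsing different inputs
# to the same output (the symptom described above)
print(torch.allclose(y1, y2))
```

With a freshly initialized model like this one, the outputs differ; in my training runs they end up identical across inputs.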