Graph of average gradient per layer

[Image: graph of the average gradient per layer during training]

This is what my model's average gradient per layer looks like during training.
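For context, here is a minimal sketch of how a per-layer average like this can be computed, assuming the gradients of each layer are available as plain lists of floats. The function name `mean_abs_gradient_per_layer` and the sample values are hypothetical illustrations, not measurements from the model in the graph.

```python
from statistics import fmean


def mean_abs_gradient_per_layer(grads_by_layer):
    """Return {layer_name: mean absolute gradient} for a dict of gradient lists."""
    return {
        name: fmean([abs(g) for g in grads])
        for name, grads in grads_by_layer.items()
    }


# Hypothetical gradient values, ordered from the first layer to the last.
example_grads = {
    "layer1": [1e-7, -2e-7, 3e-7],
    "layer2": [1e-4, -3e-4, 2e-4],
    "layer3": [1e-2, -2e-2, 3e-2],
}

averages = mean_abs_gradient_per_layer(example_grads)
for name, value in averages.items():
    print(f"{name}: {value:.2e}")
```

The sketch averages absolute values because positive and negative gradients can cancel in a signed mean and make a layer look quieter than it is. If the averages shrink by orders of magnitude toward the first layers, as in these made-up numbers, that pattern is usually what people mean by vanishing gradients.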

According to this graph, could vanishing gradients be a problem?