Backward Function Produces an Error After zero_infinity=True

Hi,
I have implemented a CRNN model based on the paper. I used the CTC loss implemented in PyTorch, and the model was working fine with one dataset. However, when I changed the dataset it produced a NaN loss. To avoid that, I tried setting zero_infinity=True. The backward function then produces the following error:
RuntimeError: The size of tensor a (0) must match the size of tensor b (93) at non-singleton dimension 2
This happens with both datasets. Does zero_infinity change the shape of the tensors? (I'm doing some tensor transformations within the model to map the CNN output to the LSTM input.) Can someone point out what I'm doing wrong here?
Thanks.
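For context, here is a minimal sketch of how I'm calling the loss (the sizes below are placeholders, not my actual model):

```python
import torch
import torch.nn as nn

# Placeholder sizes, not the real model:
# T = time steps, N = batch size, C = classes (incl. blank), S = max target length
T, N, C, S = 50, 4, 20, 10

logits = torch.randn(T, N, C, requires_grad=True)  # stands in for the CNN+LSTM output
log_probs = logits.log_softmax(2)                  # CTCLoss expects log-probs of shape (T, N, C)
targets = torch.randint(1, C, (N, S), dtype=torch.long)        # labels, no blank index
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(5, S + 1, (N,), dtype=torch.long)

crit = nn.CTCLoss(zero_infinity=True)
loss = crit(log_probs, targets, input_lengths, target_lengths)
loss.backward()
print(logits.grad.shape)  # same (T, N, C) shape regardless of zero_infinity
```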

Hi,

Where do you set zero_infinity=True?
Also, have you tried reducing the learning rate to prevent your loss from diverging?
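For what it's worth, zero_infinity does not change any tensor shapes; it only zeroes out infinite losses (and their gradients). The usual cause of an infinite CTC loss is a target sequence longer than the input sequence. A toy illustration (made-up numbers, unrelated to your model):

```python
import torch
import torch.nn as nn

# Toy case: the target (length 8) is longer than the input (T = 5),
# so no valid CTC alignment exists and the loss is infinite.
T, N, C = 5, 1, 10
logits = torch.randn(T, N, C, requires_grad=True)
log_probs = logits.log_softmax(2)
targets = torch.randint(1, C, (N, 8), dtype=torch.long)
input_lengths = torch.tensor([T])
target_lengths = torch.tensor([8])

loss_inf = nn.CTCLoss(zero_infinity=False)(log_probs, targets, input_lengths, target_lengths)
loss_zeroed = nn.CTCLoss(zero_infinity=True)(log_probs, targets, input_lengths, target_lengths)
print(loss_inf.item(), loss_zeroed.item())  # inf 0.0
```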

Hi,
At CTC loss initialization:

crit = CTCLoss(zero_infinity=True).to(device)

I haven’t tried reducing the learning rate. I will try that.
Thanks.

Interesting. Can you put together a small repro (30 lines) of this issue, please?

Hi,
I’m using the model from this repo.
https://github.com/Holmeyoung/crnn-pytorch

I have also faced this error. I've created a bug report here: https://github.com/pytorch/pytorch/issues/49046