BatchNorm and ReLU

@tom, @ptrblck, Thank you for your time.
As per the discussion, what I understand is that the square root of zero in the backward pass of std is causing the problem. I am wondering, then, why Conv1d --> BatchNorm1d --> ReLU for a TDNN does not cause any problem.

If the problem is in the computation of std, would adding eps (e.g. 1e-6) during the computation of torch.std be a correct way to handle this? I mean something like this:


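A rough sketch of what I have in mind (the function name `manual_batchnorm1d` and the eps value are just for illustration; this mimics batch norm by adding eps to the variance before the square root, so the backward pass never takes the gradient of sqrt at exactly zero):

```python
import torch

def manual_batchnorm1d(x, weight, bias, eps=1e-6):
    # Hypothetical sketch: normalize per feature, adding eps to the
    # variance so torch.sqrt is never differentiated at zero.
    mean = x.mean(dim=0, keepdim=True)
    var = x.var(dim=0, unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return weight * x_hat + bias

# Even a constant input (zero variance) now yields finite gradients.
x = torch.zeros(8, 4, requires_grad=True)
w = torch.ones(4, requires_grad=True)
b = torch.zeros(4, requires_grad=True)
out = manual_batchnorm1d(x, w, b)
out.sum().backward()
```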
I might be wrong, as I am not very good at maths, especially when it comes to implementation. :slightly_smiling_face:

You would have to do this manually, incurring a performance penalty. If you make it a torch.jit.script function, the JIT will fuse it, though.
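A sketch of what scripting such a function might look like (`scripted_norm` is a made-up name; the assumption here is only that wrapping the elementwise chain in `torch.jit.script` lets the JIT fuser combine the pointwise operations into fewer kernels):

```python
import torch

@torch.jit.script
def scripted_norm(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Manual normalization with eps inside the sqrt; as a scripted
    # function, the elementwise ops are candidates for fusion.
    mean = x.mean(dim=0, keepdim=True)
    var = x.var(dim=0, unbiased=False, keepdim=True)
    return (x - mean) / torch.sqrt(var + eps)

x = torch.randn(16, 8)
y = scripted_norm(x)
```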