Deadlock during backward

There shouldn’t be any multiprocessing kicking in… However, there might be multithreading.
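One way to check whether multithreading is involved is to pin everything to a single thread and see if the hang goes away. This is only a diagnostic sketch, not a confirmed fix: the environment variables are the standard OpenMP/MKL ones, and `limit_threads` is a hypothetical helper name.

```python
import os

# These need to be set before torch (or numpy) is imported to take effect.
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"


def limit_threads():
    """Force PyTorch's intra-op thread pool to one thread.

    Returns the resulting thread count, or None if torch is not installed.
    """
    try:
        import torch
    except ImportError:
        return None
    torch.set_num_threads(1)
    return torch.get_num_threads()


print("intra-op threads:", limit_threads())
```

If the backward pass completes with one thread but hangs with the default setting, that would point towards the threading layer rather than autograd itself.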

Is this op-dependent, i.e. what happens if you use ops other than mm and pow?

Yes, it does seem that mm alone causes trouble. At least, sum and pow don’t run into deadlocks.

Do you know which BLAS library it might be using?

Nope. How do I find out?
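A quick way to check, assuming a reasonably recent PyTorch build: `torch.__config__.show()` prints the build configuration, including the BLAS library, and `torch.backends.mkl.is_available()` reports whether MKL is linked in. The wrapper name `describe_blas_backend` below is just for illustration.

```python
def describe_blas_backend():
    """Return a text summary of the PyTorch build config and threading setup.

    Falls back to a short note if torch is not installed.
    """
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    lines = [
        torch.__config__.show(),  # build info, including the BLAS library
        f"MKL available: {torch.backends.mkl.is_available()}",
        f"intra-op threads: {torch.get_num_threads()}",
    ]
    return "\n".join(lines)


print(describe_blas_backend())
```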

How was it installed? Was it via conda?

Yes, I’m mostly certain about it.

Hmm, then it should be MKL. And this is CPU code, right?

The deadlock happens with pytorch-gpu on RHEL, but yeah, the code doesn’t use CUDA.