I am looking for a minimal PyTorch code that would cause the following error.
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling
cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)
The reason for this request is that I get this error in two out of my three machines and I am not sure what’s the best way to go about debugging it in a large code-base. Here’s the related post Vertices=torch.matmul(vertices.unsqueeze(0), rotations_init), RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemmStridedBatched in CentOS