torch.inverse CUDA error

I call “a = torch.inverse(mat_inverse)” in my code. It works fine on Google Colab but raises the error below when run on my GPU.

RuntimeError: CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

With CUDA_LAUNCH_BLOCKING=1 it shows the error below. Is it related to the memory requirement?
RuntimeError: CUDA error: invalid argument

Could you post a minimal code snippet to reproduce the issue as well as the output of python -m torch.utils.collect_env, please?

It was a memory issue, as the GPU is shared.

An invalid argument error is usually not raised if you are running out of memory.
Do you remember which operation raised it when you saw the stack trace (in case you were running with CUDA_LAUNCH_BLOCKING=1)?
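For reference, a minimal sketch of enabling blocking launches from inside a script, so that the stack trace points at the operation that actually failed. It assumes the variable is set before CUDA is initialized (here, before torch is imported); the tensor shape is borrowed from the failing case reported later in this thread, and the CPU fallback is only there so the sketch also runs on a machine without a GPU.

```python
import os

# Must be set before CUDA is initialized, so set it before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch

# Fall back to the CPU if no GPU is visible (assumption for portability).
device = "cuda:0" if torch.cuda.is_available() else "cpu"
mat = torch.rand(1, 14400, 3, 3, device=device)

try:
    inv = torch.inverse(mat)
    print("inverse succeeded:", tuple(inv.shape))
except RuntimeError as err:
    # With blocking launches, this traceback should point at the failing call.
    print("inverse failed:", err)
    raise
```

The same effect can be had by setting the variable in the shell when launching the script, as the error message suggests.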

You are right, it's not a memory issue. The function that raises the error is “torch.inverse”. Please find the details you asked for below. Let me know if you can resolve it.

PyTorch version: 1.10.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.2 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2
Libc version: glibc-2.25

Python version: 3.6.9 (default, Jan 26 2021, 15:33:00)  [GCC 8.4.0] (64-bit runtime)
Python platform: Linux-5.4.0-91-generic-x86_64-with-Ubuntu-18.04-bionic
Is CUDA available: True
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: Quadro P5000
Nvidia driver version: 450.119.03
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.4
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.4
[pip3] torch==1.10.0+cu113
[pip3] torchfile==0.1.0
[pip3] torchvision==0.11.1+cu113
[pip3] torchviz==0.0.2
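The report above lists both the CUDA version PyTorch was built with (11.3) and a separate system CUDA runtime (10.0.130). As a rough cross-check, the sketch below prints what the installed PyTorch binary itself reports at runtime. It is only a diagnostic aid under the assumption that GPU 0 is the device in use, and it does not by itself explain the error.

```python
import torch

print("PyTorch version:", torch.__version__)
# torch.version.cuda is None on CPU-only builds.
print("CUDA version PyTorch was built with:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())

if torch.cuda.is_available():
    # Device index 0 is an assumption; adjust for multi-GPU machines.
    print("GPU 0:", torch.cuda.get_device_name(0))
    print("Compute capability:", torch.cuda.get_device_capability(0))
    print("cuDNN version:", torch.backends.cudnn.version())
```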

Could you post an executable code snippet to reproduce the issue?

It's strange, but changing the tensor shape from (1, 3, 256, 256) to (1, 14400, 3, 3) in the code below triggers the error on my GPU. I verified that it works fine on Google Colab. What is the cause of the issue, and is there a solution?

import torch

# mat_inverse = torch.randn(1, 3, 256, 256).to(0)  # this shape works
mat_inverse = torch.rand(1, 14400, 3, 3).to(0)  # this shape raises the CUDA error
a = torch.inverse(mat_inverse)
print(a)
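One way to narrow this down is to compute a CPU reference for the same batch shape and compare it against the GPU result when a GPU is present. The sketch below is an illustration, not a fix: adding 3 times the identity is an assumption made here so that every 3x3 matrix is well conditioned and the numerical comparison is meaningful. If the CPU path works and the GPU path raises, computing the inverse on the CPU and moving the result back to the GPU is a possible stopgap.

```python
import torch

torch.manual_seed(0)

# Same batch shape as the failing case. Adding 3*I makes each 3x3 matrix
# strictly diagonally dominant, hence safely invertible (assumption for the demo).
mat = torch.rand(1, 14400, 3, 3) + 3.0 * torch.eye(3)

# CPU reference result.
inv_cpu = torch.inverse(mat)

# Sanity check: mat @ inv_cpu should be close to the identity for every matrix.
identity = torch.eye(3).expand_as(mat)
residual = (mat @ inv_cpu - identity).abs().max().item()
print("max CPU residual:", residual)

if torch.cuda.is_available():
    # If this line raises, the problem is specific to the CUDA code path.
    inv_gpu = torch.inverse(mat.to("cuda:0"))
    diff = (inv_gpu.cpu() - inv_cpu).abs().max().item()
    print("max CPU/GPU difference:", diff)
```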

Have you reached a conclusion?