Running multiple instances of the same program on a single GPU

I have a program with a simple dense-layer model, which takes around 2 GB of the 24 GB of GPU memory (RTX 3090). I want to run multiple instances of the same program, with different config files, on the same GPU.

I used to be able to do this on other machines, but on this one I get the following error:

RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable. CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.

Should I change anything so that the GPU is not running in exclusive compute mode? There is about 20 GB of free GPU memory, but the other instances are unable to access it.

Edit: The GPU compute mode was set to Exclusive Process, which allows only one process at a time to hold a CUDA context on the device. Changing it back to Default did the trick.
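
For anyone hitting the same error, here is a rough sketch of how the compute mode could be checked from Python by calling nvidia-smi. The helper names (parse_compute_modes, query_compute_modes, reset_command) are made up for illustration, and I am assuming nvidia-smi is on the PATH and prints the mode as "Default" or "Exclusive_Process". Changing the mode usually needs root, so the script only prints the reset command instead of running it.

```python
import shutil
import subprocess


def parse_compute_modes(csv_text):
    """Parse 'index, compute_mode' CSV lines into {gpu_index: mode}."""
    modes = {}
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, mode = [field.strip() for field in line.split(",", 1)]
        modes[int(index)] = mode
    return modes


def query_compute_modes():
    """Ask nvidia-smi for the compute mode of every visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,compute_mode", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_modes(result.stdout)


def reset_command(gpu_index):
    """Build (but do not run) the command that sets a GPU back to Default mode."""
    return ["sudo", "nvidia-smi", "-i", str(gpu_index), "-c", "DEFAULT"]


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        for index, mode in query_compute_modes().items():
            if mode == "Default":
                print(f"GPU {index}: Default (multiple processes allowed)")
            else:
                # Assumption: any non-Default mode may block extra processes.
                print(f"GPU {index}: {mode}; to reset run: "
                      + " ".join(reset_command(index)))
```

As far as I know the setting does not survive a reboot or driver reload, so if the error comes back later, the mode is probably being set again by a startup script or by an admin.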

Did you find a solution for this?