Code hangs in multiprocessing

I have a model that I serve with Flask, and it works fine. To speed up batch inputs, I use Celery with multiple workers and load the model in each worker. However, execution hangs when one process reaches “torch.layer_norm”, a function implemented in the torch package's C extension, and this also blocks the other processes.

I wonder why this happens under multiprocessing and how to solve it.

“torch.layer_norm” is declared in “torch._C._VariableFunctions.pyi”.
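One thing that may be worth trying, though I have not confirmed it is the cause here: forking worker processes after the parent has already started native (OpenMP) threads is a known way for a child to deadlock inside a native kernel. A minimal sketch of limiting each worker to a single native thread follows. The helper name `limit_native_threads` and the choice of environment variables are my own assumptions for illustration, not something taken from my actual setup.

```python
import os

# Hypothetical helper (an assumption, not part of torch or Celery).
# These environment variables are read by the OpenMP and MKL runtimes.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS")


def limit_native_threads(n=1):
    """Set the thread-count variables; call this before importing torch."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n)
    # Return what was set so the caller can log or check it.
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


limit_native_threads(1)

# In the Celery worker module this would be followed by (not run here):
#   import torch
#   torch.set_num_threads(1)   # limit intra-op parallelism to one thread
#   model = load_model()       # hypothetical loader, called inside each worker
```

If the hang disappears with a single thread per worker, that would point to the thread pool and fork interaction rather than to “torch.layer_norm” itself.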