Basic operations do not work in PyTorch 1.1.0 with uWSGI+Flask

I came across a problem where, in PyTorch 1.1.0 with uWSGI+Flask on CPU, even torch.cat does not work (everything just freezes, with no errors). I determined that the problem is in uWSGI/Flask, because in the same environment I was able to run the same operations without any issues. There was no such problem in previous versions of PyTorch. As far as I understand, I am dealing with multiprocessing artifacts…
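To make it concrete, here is a minimal sketch of the kind of handler that freezes for me (the route name is illustrative, and the model line is reduced to a comment):

import flask
import torch

app = flask.Flask(__name__)
# in my real app a model is constructed here, at import time, e.g.
# segmentator = Segmentator()

@app.route('/test')
def test():
    # under uWSGI, even this trivial call freezes and never returns
    x = torch.cat([torch.zeros(1), torch.ones(1)])
    return str(x)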
Nevertheless, I found a solution, and it was simple:

import flask

app = flask.Flask(__name__)
segmentator = None

@app.before_first_request
def load_segmentator():
    global segmentator
    # constructing the model here, inside the worker process that
    # actually serves requests, is what avoids the freeze
    segmentator = Segmentator()

where Segmentator is a class wrapping a PyTorch nn.Module that loads its weights in __init__.
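For what it's worth, I launch the app with something like this (the file name, callable, port, and worker count are just examples from my setup):

uwsgi --http :5000 --wsgi-file app.py --callable app --master --processes 4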
Hope it helps somebody.

P.S. If someone can explain what is going on here, I would be grateful.


I’m facing the same issue. I tried a similar solution, but it still doesn’t seem to work. Any more information about this would be appreciated.


To be honest, I don’t really understand why it worked and can’t say much more in terms of code.
Here is my “code” inside Segmentator:

import torch

class Segmentator:
    def __init__(self, path):
        print('loading model')
        # UNetWithResnet50Encoder is my own model class, defined elsewhere;
        # 9 is the number of output classes
        loaded_model = UNetWithResnet50Encoder(9)
        print('loading weights')
        checkpoint = torch.load(path, map_location='cpu')
        print('loading checkpoint')
        loaded_model.load_state_dict(checkpoint['model'])
        loaded_model.eval()
        print('loaded!')
        self.model = loaded_model

    def process(self, img_url):
        # loads image into RAM, preprocesses it,
        # passes through self.model, and postprocesses output
        pass
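The body of process doesn't matter for the bug, but roughly it does something like this (the PIL/torchvision pipeline here is illustrative, not my exact code):

import requests  # illustrative: I fetch the image over HTTP
import torch
import torchvision.transforms as T
from PIL import Image

def process(self, img_url):
    # fetch and preprocess the image
    img = Image.open(requests.get(img_url, stream=True).raw).convert('RGB')
    x = T.ToTensor()(img).unsqueeze(0)   # 1 x 3 x H x W
    # forward pass without autograd bookkeeping
    with torch.no_grad():
        out = self.model(x)              # 1 x 9 x H x W logits
    # postprocess to a per-pixel class map
    return out.argmax(dim=1).squeeze(0)  # H x W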

@lebionick thanks for the info. I actually have the exact same project structure as your code, my Segmentator is just named differently. 🙂 I had another solution for another app, but I changed it to be exactly like yours, with before_first_request, still with no success. I will report back if I find out more about this.


OK, it seems that load_state_dict and other operations in __init__, when run after flask.Flask(__name__), cause PyTorch operations done in requests to hang forever. I'm not sure about the cause, but maybe this info could help someone.
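In code terms, what I mean is roughly this (load_model is just a self-contained stand-in for my real torch.load / load_state_dict code):

import flask
import torch

def load_model():
    # stand-in for my real torch.load / load_state_dict calls
    m = torch.nn.Linear(4, 2)
    m.load_state_dict(m.state_dict())
    return m

app = flask.Flask(__name__)
model = load_model()  # running this after Flask(__name__), at import time,
                      # is what makes torch ops inside request handlers hang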


Try this


Aren’t you running the Flask app with uWSGI’s “preforking worker mode”, which is the default config?
Try lazy-apps mode, so that each worker loads the trained model for itself instead of sharing it with the others.
It works in my environment even when I load the model at the global level.

command:

uwsgi --lazy-apps

or ini file:

[uwsgi]
lazy-apps = true
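For completeness, a fuller ini along these lines (the file name, callable, port, and process count are just examples from my setup):

[uwsgi]
http = :5000
wsgi-file = app.py
callable = app
master = true
processes = 4
; each worker imports the app (and loads the model) for itself
lazy-apps = true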

minimum reproduction


Setting lazy-apps to true solved the problem I had with my Flask+uWSGI+PyTorch deployment. Thanks!