gunicorn is run as gunicorn app:app --preload --workers 3.
--preload is used so that resources loaded at import time (the model here) are shared among the workers.
OMP_NUM_THREADS is set to 2.
app.py contains the following code:
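Concretely, the server is started like this (a sketch, assuming a bash-style shell; the environment variable is exported before gunicorn starts so every worker inherits it):

```shell
# limit OpenMP intra-op threads per process
export OMP_NUM_THREADS=2

# load the app once in the master, then fork 3 workers
gunicorn app:app --preload --workers 3
```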
from flask import Flask, jsonify
import torch
from create_model import test
app = Flask(__name__)
model = torch.load('model.pt')
@app.route('/predict', methods=['POST', 'GET'])
def prediction():
    constant_input = torch.randn(20, 16, 50, 100)
    prediction = model(constant_input)
    # a tensor is not JSON-serializable, so convert it to a plain list first
    return jsonify(prediction.tolist())
model.pt is created using create_model.py, which contains:
import torch
import torch.nn as nn
import torch.nn.functional as F
class test(nn.Module):
    def __init__(self):
        super(test, self).__init__()
        self.conv1 = nn.Conv2d(16, 33, 3, stride=2)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return x

m = test()
input = torch.randn(20, 16, 50, 100)
print(m(input))
torch.save(m, 'model.pt')
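One detail worth noting about this save format: torch.save(m, 'model.pt') pickles the whole module by reference to its class, which is why app.py has to import test from create_model before torch.load can work. A state_dict-based round trip (a sketch below, with a hypothetical weights.pt filename) avoids that coupling, though it should not by itself change the hanging behavior:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class test(nn.Module):
    def __init__(self):
        super(test, self).__init__()
        self.conv1 = nn.Conv2d(16, 33, 3, stride=2)

    def forward(self, x):
        return F.relu(self.conv1(x))

m = test()
# save only the parameters, not the pickled class
torch.save(m.state_dict(), 'weights.pt')

# loading: rebuild the module from its class, then restore the parameters
m2 = test()
m2.load_state_dict(torch.load('weights.pt'))
m2.eval()
```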
But with this setup I am not able to get any inference response.
If, instead of calling the torch model, I run some numpy operation and just return its output, the endpoint works fine.
Also, with gunicorn app:app --preload --workers 3 --threads 2 inference works. Can anyone please explain why it only works when threads are used, and why this happens only with PyTorch?
Thanks.