Model with tensor and number operations errors in iOS

Traced a model and saved it using pytorch 1.3.1 with the following code.

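import torch

class TestModule(torch.nn.Module):

    def forward(self, W):
        # Multiply the input tensor by a Python number
        g = 2 * W
        return g

W = torch.rand(10)

# Instantiate the module before tracing it
test_model = torch.jit.trace(TestModule(), [W])
test_model.save("test_model.pt")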

and loaded the model into iOS (libtorch 1.3.1) using the following code

torch::jit::script::Module testModel = torch::jit::load(filePath.UTF8String);

torch::IValue w = torch::IValue(torch::rand({10}));

testModel.forward({w});

and it gives the following error on forward

libc++abi.dylib: terminating with uncaught exception of type c10::Error: false CHECK FAILED at /Users/distiller/project/c10/core/Backend.h (tensorTypeIdToBackend at /Users/distiller/project/c10/core/Backend.h:106)
Hi @mark_jimenez, can you add these two lines before running forward?

torch::autograd::AutoGradMode guard(false);
at::AutoNonVariableTypeMode non_var_type_mode(true);

The first one tells the engine to disable autograd; the second one is a workaround that we can get rid of in 1.4.0, which will be released soon.

Let me know if you have any questions.

Hey @xta0. It worked when I added those two lines. Thank you!

I discovered a few more issues with the current build (1.3.1). Should I post them in a new thread, or is 1.4 going to be released soon so I can check them in that version?

@mark_jimenez you can post in this thread and I’ll follow up. In 1.4.0, you still need the line torch::autograd::AutoGradMode guard(false); but at::AutoNonVariableTypeMode non_var_type_mode(true); is no longer necessary.

Thank you @xta0. I’ll ask about one issue at a time so it’s not overwhelming.

I couldn’t include libtorch in a unit test target. I filed an issue with CocoaPods and they said it’s because libnnpack.a isn’t built as a universal library.

I’ve also filed the issue here where you can find more details: https://github.com/pytorch/pytorch/issues/32040

Thank you for the help.

Dear @xta0, I have a similar error.

I have exported a pre-trained model from python to c++ (model.pt) and it works perfectly in a c++ application.
I tried to use the same exported model in my iPad, and it gives the following error on torch::jit::load() method:

libc++abi.dylib: terminating with uncaught exception of type c10::Error: false CHECK FAILED at /Users/distiller/project/c10/core/Backend.h (tensorTypeIdToBackend at /Users/distiller/project/c10/core/Backend.h:106)
(no backtrace available)

When I add the two lines (the autograd guard and the AutoNonVariableTypeMode guard) before calling torch::jit::load(), I get:

libc++abi.dylib: terminating with uncaught exception of type c10::Error: !v.defined() || v.is_variable() INTERNAL ASSERT FAILED at /Users/distiller/project/torch/csrc/jit/ir.h (t_ at /Users/distiller/project/torch/csrc/jit/ir.h:718)
(no backtrace available)

Do you have any idea how to solve this problem?

@mark_jimenez this is a known issue: NNPACK doesn’t support the iOS simulator architecture, as shown here - https://github.com/Maratyszcza/NNPACK . So for the simulator build, operators are not run via NNPACK. However, @AshkanAliabadi on our team has been actively working on XNNPACK, which will replace NNPACK in the future. Sorry for the inconvenience.

@fabricionarcizo What version of libtorch were you using? My guess is that your desktop version of PyTorch didn’t match the version of your libtorch. This affects how your TorchScript model is generated. You can verify the PyTorch version by evaluating the expression below in Python:
torch.version.__version__
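
To make that comparison less error-prone, here is a minimal sketch of a version check. The helper names release_tuple and same_release are hypothetical (not part of PyTorch), and the sketch assumes you read the LibTorch version from your Podfile by hand; it only compares the numeric major.minor.patch part and ignores suffixes such as "+cpu".

```python
import re


def release_tuple(version):
    """Return (major, minor, patch) from a string such as '1.3.1' or '1.4.0+cpu'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0 (assumption for this sketch).
    return int(major), int(minor), int(patch or 0)


def same_release(exporter_version, runtime_version):
    """True when both strings name the same major.minor.patch release."""
    return release_tuple(exporter_version) == release_tuple(runtime_version)


# The first argument would normally be the PyTorch version string from the
# exporting machine; the second is the LibTorch version pinned for the iOS app.
print(same_release("1.3.1", "1.3.1"))      # True
print(same_release("1.3.1+cpu", "1.4.0"))  # False
```

If this returns False, re-exporting the model with a matching PyTorch version is probably the first thing to try.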

@xta0 I have used the pytorch 1.3.1 and libtorch 1.3.1. I generated the model in Python, and used torch.jit.trace and torch.jit.save to save the model in a .pt file. I’m able to load the model using torch::jit::load method in the C++ code. However, the same method (torch::jit::load) doesn’t work in the Objective-C version (also 1.3.1).

@fabricionarcizo Is it OK to paste your Python code here (or somewhere I can see it) so that I can debug? From your description, I have no idea what could be going wrong.

@xta0 Thank you for the update on NNPACK. It’s alright, I understand that porting NNPACK is a big project. I’ll be patient for any updates.

BTW, congratulations on the 1.4 release. It fixed some of the problems from 1.3.1 that I was going to ask about.

I wanted to ask whether on-device training and/or a Swift and Objective-C API will be supported in the next release, or at least be available on the 1.5 branch. I see that the podspec in the pytorch GitHub repo is at 1.5 (https://github.com/pytorch/pytorch/blob/master/ios/LibTorch.podspec). Is there any way we can work with that version?

@mark_jimenez Thanks. We have teams working on enabling on-device training, but I’m not sure if that can be released in 1.5.0. As for the API wrappers, we have a proposal internally - https://github.com/pytorch/pytorch/pull/25541. We’ve been proactively collecting feedback from the community, but haven’t decided when to release it, so feel free to submit ideas, proposals, etc. The 1.5.0 version in the .podspec is just a placeholder; nothing in particular has been done on that branch.

@xta0 Thank you for the update!

We’re working on a workaround right now for training on mobile by updating the weights of the model ourselves in TorchScript. We can’t get it to work with libtorch 1.4 on iOS (not sure about Android). I think it may have something to do with how autograd is implemented on mobile.

For example, here’s the Python code compiled in PyTorch 1.4.

import torch

class TestModule(torch.nn.Module):
    def forward(self, x, y):
        z = x + y
        L = z.sum()
        L.backward()
        return x.grad, y.grad

# torch.jit.script compiles the module from source, so it takes no example inputs
model = torch.jit.script(TestModule())

And in iOS, the model is loaded like this

torch::jit::script::Module testModel = torch::jit::load(testFilePath.UTF8String);

auto options = torch::TensorOptions().requires_grad(true);
auto result = testModel.forward({torch::rand({1}, options), torch::rand({1}, options)});

I get the same error as the one you filed here: https://github.com/pytorch/pytorch/pull/30067

So reading that, I guess autograd isn’t implemented for the mobile builds? Is that still the case for libtorch 1.4?

@mark_jimenez You’re right. Autograd is not available on mobile so far.
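
Since backward() cannot be used in the mobile build, one possible direction (not confirmed anywhere in this thread, so treat it as an assumption) is to write the gradient formulas by hand and return them from forward. For the toy example above, L = sum(x + y), so dL/dx and dL/dy are all ones. The sketch below shows that arithmetic with plain Python lists rather than tensors; the function names forward_with_manual_grads and sgd_step are hypothetical, and a real model would need its own hand-derived gradients.

```python
def forward_with_manual_grads(x, y):
    """Compute L = sum(x + y) and the hand-derived gradients dL/dx and dL/dy."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    loss = sum(a + b for a, b in zip(x, y))
    # Each x_i and y_i appears exactly once in the sum, so every partial
    # derivative is 1.
    grad_x = [1.0] * len(x)
    grad_y = [1.0] * len(y)
    return loss, grad_x, grad_y


def sgd_step(params, grads, lr):
    """Plain gradient-descent update: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]


loss, grad_x, grad_y = forward_with_manual_grads([1.0], [1.0])
new_x = sgd_step([1.0], grad_x, lr=0.5)
print(loss, grad_x, grad_y, new_x)
```

The same idea could in principle be expressed with tensor operations inside a scripted module, since it only needs forward computation, but whether that is practical depends on how complex the model's gradients are.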