Incomprehensible behaviour

Hello,

I’ve tried to use a custom model on Android, but forward fails with an error I am unable to understand.

I’ve scripted my model using the TorchScript annotation method and now I am trying to perform a forward pass on mobile.

The model looks something like this:

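from typing import Dict

import torch
import torch.nn as nn

# RPN, InputClass, Instances, config and N are defined elsewhere in the project.
config = ...

class WrapRPN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rpn = RPN(config).eval().cpu()

    def forward(self, features):
        # type: (Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]
        # Placeholder input for the RPN; only the features come from the caller.
        mock_input : InputClass = InputClass(torch.rand((N, 320, 320)))
        instances = self.rpn(mock_input, features)
        # Store the proposal boxes of each instance under its index as a string key.
        output : Dict[str, torch.Tensor] = {}
        for idx in range(len(instances)):
            inst : Instances = instances[idx]
            box_tensor : torch.Tensor = inst.proposal_boxes.tensor
            output[str(idx)] = box_tensor
        return output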

It has been converted and loaded on mobile, but it fails at runtime:

E/AndroidRuntime: FATAL EXCEPTION: main
    Process: org.pytorch.helloworld, PID: 20157
    java.lang.RuntimeException: Unable to start activity ComponentInfo{org.pytorch.helloworld/org.pytorch.helloworld.MainActivity}: com.facebook.jni.CppException: forward() Expected a value of type 'Dict[str, Tensor]' for argument 'features' but instead found type 'Dict[str, Tensor]'.
    Position: 1
    Declaration: forward(ClassType<WrapRPN> self, Dict(str, Tensor) features) -> (Dict(str, Tensor)) (checkArg at ../aten/src/ATen/core/function_schema_inl.h:194)
    (no backtrace available)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3784)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3955)
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:91)
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:149)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:103)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2392)
        at android.os.Handler.dispatchMessage(Handler.java:107)
        at android.os.Looper.loop(Looper.java:213)
        at android.app.ActivityThread.main(ActivityThread.java:8147)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:513)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1100)
     Caused by: com.facebook.jni.CppException: forward() Expected a value of type 'Dict[str, Tensor]' for argument 'features' but instead found type 'Dict[str, Tensor]'.
    Position: 1
    Declaration: forward(ClassType<WrapRPN> self, Dict(str, Tensor) features) -> (Dict(str, Tensor)) (checkArg at ../aten/src/ATen/core/function_schema_inl.h:194)
    (no backtrace available)
        at org.pytorch.NativePeer.forward(Native Method)
        at org.pytorch.Module.forward(Module.java:37)
        at org.pytorch.helloworld.MainActivity.onCreate(MainActivity.java:66)
        at android.app.Activity.performCreate(Activity.java:8068)
        at android.app.Activity.performCreate(Activity.java:8056)
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1320)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3757)
        	... 11 more

The Java code is as follows:

    final Tensor inputTensor = TensorImageUtils.bitmapToFloat32Tensor(bitmap,
        TensorImageUtils.TORCHVISION_NORM_MEAN_RGB, TensorImageUtils.TORCHVISION_NORM_STD_RGB);

    Map<String, IValue> hm = new HashMap<String, IValue>();
    List<String> keys = Arrays.asList("p2", "p3", "p4", "p5", "p6");
    for (String key : keys) {
      hm.put(key, IValue.from(inputTensor));
    }
    final IValue rpn_input = IValue.dictStringKeyFrom(hm);
    module.forward(rpn_input);

What does it mean?
Expected a value of type 'Dict[str, Tensor]' for argument 'features' but instead found type 'Dict[str, Tensor]' :exploding_head:

So, I’ve spent a little time trying to understand what is going on and to find a way around this issue.

What I’ve noticed:

  1. If we use constructions such as IValue.dictStringKeyFrom(hm), where hm is a HashMap<String, IValue>, or IValue.listFrom(lst), where lst is a List<IValue>, we get the behaviour described above.
  2. But if we use IValue.from(arr), where arr is of type Tensor[], we do not hit this issue, and Java does not complain that it expected List[Tensor] but got List[Tensor].

In conclusion, there is no way to construct an analogue of (2) using dictionaries.
IValue (https://pytorch.org/docs/stable/org/pytorch/IValue.html) doesn’t have an overloaded method public static IValue dictStringKeyFrom(Map<String, T> map), where T is Tensor.
I think it might work in that case, but that does not rule out the possibility of a bug.

Thanks for this finding.

I reproduced it locally and am debugging it. List[Tensor] will be represented as a separate type on the libtorch IValue side.

It looks like we have some unexpected behavior in JNI with Dict types, which are converted to libtorch IValue{GenericDict}.


Hello @zetyquickly,

Thanks one more time for this finding.

This happened because, in JNI, the TensorType was deduced from the value of the first dictionary entry, including shape, requires_grad, etc., while the TorchScript function's declared type did not carry that information.
Dict is not covariant, so the subtype check required equal KeyType (str) and ValueType (TensorType).
The TensorTypes of the function argument and of the provided value were different, so the type check failed.
The error message did not include that additional information about the TensorType, which is why both types printed identically.
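
To illustrate why the two printed types look the same and still fail to match, here is a toy Python model of the check. It is only a sketch and not the libtorch implementation: TensorType, DictType and is_subtype are hypothetical names made up for this example, and the shape used is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TensorType:
    """Toy tensor type; None means the property is unspecified."""
    shape: Optional[Tuple[int, ...]] = None
    requires_grad: Optional[bool] = None

    def __str__(self) -> str:
        # The printed name omits the refinements, as the error message did.
        return "Tensor"


@dataclass(frozen=True)
class DictType:
    """Toy dictionary type with a key type name and a tensor value type."""
    key: str
    value: TensorType

    def __str__(self) -> str:
        return f"Dict[{self.key}, {self.value}]"


def is_subtype(actual: DictType, expected: DictType) -> bool:
    # Dict is not covariant in this model: key and value types must be equal,
    # so a refined value type does not match an unrefined one.
    return actual.key == expected.key and actual.value == expected.value


# Type declared by the scripted forward(): plain Tensor, no shape information.
declared = DictType("str", TensorType())
# Type deduced from the first dictionary entry, carrying shape and requires_grad.
deduced = DictType("str", TensorType(shape=(1, 3, 320, 320), requires_grad=False))

if not is_subtype(deduced, declared):
    message = f"Expected a value of type '{declared}' but instead found type '{deduced}'"
    print(message)
```

In this model the check fails even though both types print as Dict[str, Tensor], because the difference lives in fields that the printed form leaves out.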

This problem was fixed on android-jni level in commit:


It was merged in master recently.

Separate issue for more detailed error messages in TensorType checks.

Android nightlies (snapshots) have already been republished with this fix, so the error should not occur with them.
To use the nightlies, add the following (Gradle's --refresh-dependencies argument forces a refresh of the dependencies):

repositories {
    maven {
        url "https://oss.sonatype.org/content/repositories/snapshots"
    }
}

dependencies {
    ...
    implementation 'org.pytorch:pytorch_android:1.4.0-SNAPSHOT'
    implementation 'org.pytorch:pytorch_android_torchvision:1.4.0-SNAPSHOT'
    ...
}

Thanks a lot for the fix! Could you tell me whether it also affects the behaviour of List<IValue>? When I tried it, it also failed with this “List is not a List” error.


Yes, that problem affected both of our generic containers (Dict and List) when the element type was TensorType. After the fix, GenericList is also initialized with c10::unshapedType(firstElement), which should fix errors like ‘List[Tensor] is not List[Tensor]’.
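
As a rough picture of what initializing the container with the unshaped type of the first element changes, here is a small Python sketch. It is a toy model under stated assumptions, not the JNI code: TensorType, unshaped, element_type_before_fix and element_type_after_fix are hypothetical names, and unshaped only stands in for the idea behind c10::unshapedType.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TensorType:
    """Toy tensor type; None means the property is unspecified."""
    shape: Optional[Tuple[int, ...]] = None
    requires_grad: Optional[bool] = None


def unshaped(t: TensorType) -> TensorType:
    # Hypothetical stand-in for c10::unshapedType: drop the refinements.
    return replace(t, shape=None, requires_grad=None)


def element_type_before_fix(elements: List[TensorType]) -> TensorType:
    # Container element type taken directly from the first element.
    return elements[0]


def element_type_after_fix(elements: List[TensorType]) -> TensorType:
    # Container element type taken from the first element with refinements removed.
    return unshaped(elements[0])


# Element type declared by a scripted function that accepts List[Tensor].
declared_element = TensorType()
# Types of the tensors actually placed in the list (arbitrary example shapes).
elements = [TensorType((3, 320, 320), False), TensorType((3, 160, 160), False)]

matches_before = element_type_before_fix(elements) == declared_element
matches_after = element_type_after_fix(elements) == declared_element
print(matches_before, matches_after)
```

With the refinements stripped, the element type of the container equals the declared one, so an equality-based check like the one described above passes.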
