Bug
To Reproduce
Steps to reproduce the behavior:
I followed this tutorial and confirmed that MobileNetV2 runs correctly on my phone with the Metal backend.
This is the code I used to export a PyTorch model for the Metal backend:
import torch
import torch.nn as nn
import torch.utils.mobile_optimizer as mobile_optimizer
import torch.nn.functional as F
class Demo(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        x = F.interpolate(x, scale_factor=0.25, mode='bilinear')
        return x

model = Demo()
model = torch.quantization.convert(model)
model = torch.jit.script(model)
model = mobile_optimizer.optimize_for_mobile(model, backend='Metal')
model._save_for_lite_interpreter('model.ptl')
x = torch.rand((1, 3, 256, 256))
out = model(x)
print(out.shape)
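For reference, the shape printed by the last line can be worked out by hand. The sketch below assumes the documented rule that, when `scale_factor` is given, each spatial dimension of the output is `floor(input_size * scale_factor)`. The helper name `expected_interpolate_shape` is hypothetical and is not part of PyTorch.

```python
import math


def expected_interpolate_shape(input_shape, scale_factor):
    """Return the output shape expected for an (N, C, *spatial) input.

    Assumption: each spatial dimension becomes floor(size * scale_factor),
    while the batch and channel dimensions are left unchanged.
    """
    batch, channels, *spatial = input_shape
    scaled = [math.floor(size * scale_factor) for size in spatial]
    return (batch, channels, *scaled)


# Same input shape and scale factor as in the export script above.
print(expected_interpolate_shape((1, 3, 256, 256), 0.25))  # prints (1, 3, 64, 64)
```

So a 256x256 input scaled by 0.25 should come back as 64x64 with the batch and channel dimensions untouched.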
In Xcode I used the following code:
c10::InferenceMode mode;
at::Tensor tensor = torch::rand({1, 3, 256, 256}, at::kFloat).metal();
auto outputTensor = _impl.forward({tensor}).toTensor().cpu();
std::cout << outputTensor << std::endl;
Running it produced the following error and stack trace:
2021-08-06 11:14:27.162164+0900 HelloWorld[1919:810455] Metal GPU Frame Capture Enabled
2021-08-06 11:14:27.162400+0900 HelloWorld[1919:810455] Metal API Validation Enabled
2021-08-06 11:14:27.544492+0900 HelloWorld[1919:810455] 'memory_format' argument is incompatible with Metal tensor
Debug info for handle, -1, not found.
Exception raised from empty at /path/pytorch_metal/aten/src/ATen/native/metal/MetalAten.mm:84 (most recent call first):
frame #0: _ZN3c106detail14torchCheckFailEPKcS2_jS2_ + 92 (0x10320a224 in HelloWorld)
frame #1: _ZN2at6native5metal5emptyEN3c108ArrayRefIxEENS2_8optionalINS2_10ScalarTypeEEENS5_INS2_6LayoutEEENS5_INS2_6DeviceEEENS5_IbEENS5_INS2_12MemoryFormatEEE + 284 (0x10303e948 in HelloWorld)
frame #2: _ZN2at12_GLOBAL__N_119empty_memory_formatEN3c108ArrayRefIxEENS1_8optionalINS1_10ScalarTypeEEENS4_INS1_6LayoutEEENS4_INS1_6DeviceEEENS4_IbEENS4_INS1_12MemoryFormatEEE + 220 (0x1025f3c10 in HelloWorld)
frame #3: _ZNK3c1010Dispatcher4callIN2at6TensorEJNS_8ArrayRefIxEENS_8optionalINS_10ScalarTypeEEENS6_INS_6LayoutEEENS6_INS_6DeviceEEENS6_IbEENS6_INS_12MemoryFormatEEEEEET_RKNS_19TypedOperatorHandleIFSG_DpT0_EEESJ_ + 220 (0x1024fe25c in HelloWorld)
frame #4: _ZN2at4_ops19empty_memory_format4callEN3c108ArrayRefIxEENS2_8optionalINS2_10ScalarTypeEEENS5_INS2_6LayoutEEENS5_INS2_6DeviceEEENS5_IbEENS5_INS2_12MemoryFormatEEE + 144 (0x1023c4ab0 in HelloWorld)
frame #5: _ZN2at6native15constant_pad_ndERKNS_6TensorEN3c108ArrayRefIxEERKNS4_6ScalarE + 1148 (0x102e032cc in HelloWorld)
frame #6: _ZN3c104impl34call_functor_with_args_from_stack_INS0_6detail31WrapFunctionIntoRuntimeFunctor_IPFN2at6TensorERKS5_NS_8ArrayRefIxEERKNS_6ScalarEES5_NS_4guts8typelist8typelistIJS7_S9_SC_EEEEELb0EJLm0ELm1ELm2EEJS7_S9_SC_EEENSt3__15decayINSF_21infer_function_traitsIT_E4type11return_typeEE4typeEPNS_14OperatorKernelENS_14DispatchKeySetEPNSK_6vectorINS_6IValueENSK_9allocatorISW_EEEENSK_16integer_sequenceImJXspT1_EEEEPNSH_IJDpT2_EEE + 136 (0x102722250 in HelloWorld)
frame #7: _ZN3c104impl31make_boxed_from_unboxed_functorINS0_6detail31WrapFunctionIntoRuntimeFunctor_IPFN2at6TensorERKS5_NS_8ArrayRefIxEERKNS_6ScalarEES5_NS_4guts8typelist8typelistIJS7_S9_SC_EEEEELb0EE4callEPNS_14OperatorKernelERKNS_14OperatorHandleENS_14DispatchKeySetEPNSt3__16vectorINS_6IValueENSR_9allocatorIST_EEEE + 40 (0x102722164 in HelloWorld)
frame #8: _ZNK3c1010Dispatcher9callBoxedERKNS_14OperatorHandleEPNSt3__16vectorINS_6IValueENS4_9allocatorIS6_EEEE + 128 (0x1031092ec in HelloWorld)
frame #9: _ZN5torch3jit6mobile16InterpreterState3runERNSt3__16vectorIN3c106IValueENS3_9allocatorIS6_EEEE + 4056 (0x103115bfc in HelloWorld)
frame #10: _ZNK5torch3jit6mobile8Function3runERNSt3__16vectorIN3c106IValueENS3_9allocatorIS6_EEEE + 192 (0x103107ea8 in HelloWorld)
frame #11: _ZNK5torch3jit6mobile6Method3runERNSt3__16vectorIN3c106IValueENS3_9allocatorIS6_EEEE + 516 (0x10311a76c in HelloWorld)
frame #12: _ZNK5torch3jit6mobile6MethodclENSt3__16vectorIN3c106IValueENS3_9allocatorIS6_EEEE + 24 (0x10311b350 in HelloWorld)
frame #13: _ZN5torch3jit6mobile6Module7forwardENSt3__16vectorIN3c106IValueENS3_9allocatorIS6_EEEE + 172 (0x102378984 in HelloWorld)
frame #14: -[TorchModule + + (0x102378328 in HelloWorld)
frame #15: $s10HelloWorld14ViewControllerC11viewDidLoadyyF + 1660 (0x1023886d4 in HelloWorld)
frame #16: $s10HelloWorld14ViewControllerC11viewDidLoadyyFTo + 32 (0x1023891dc in HelloWorld)
frame #17: 186F3A78-108A-3057-A67E-800A88EBFF00 + 4611712 (0x1845b0e80 in UIKitCore)
frame #18: 186F3A78-108A-3057-A67E-800A88EBFF00 + 4629560 (0x1845b5438 in UIKitCore)
frame #19: 186F3A78-108A-3057-A67E-800A88EBFF00 + 3874620 (0x1844fcf3c in UIKitCore)
frame #20: 186F3A78-108A-3057-A67E-800A88EBFF00 + 3875400 (0x1844fd248 in UIKitCore)
frame #21: 186F3A78-108A-3057-A67E-800A88EBFF00 + 3879180 (0x1844fe10c in UIKitCore)
frame #22: 186F3A78-108A-3057-A67E-800A88EBFF00 + 3884176 (0x1844ff490 in UIKitCore)
frame #23: 186F3A78-108A-3057-A67E-800A88EBFF00 + 3765444 (0x1844e24c4 in UIKitCore)
frame #24: 186F3A78-108A-3057-A67E-800A88EBFF00 + 16971476 (0x18517a6d4 in UIKitCore)
frame #25: CC806D5A-7150-373C-9CAA-1507F0A58DF1 + 1434660 (0x1855f0424 in QuartzCore)
frame #26: CC806D5A-7150-373C-9CAA-1507F0A58DF1 + 1461164 (0x1855f6bac in QuartzCore)
frame #27: CC806D5A-7150-373C-9CAA-1507F0A58DF1 + 1507692 (0x18560216c in QuartzCore)
frame #28: CC806D5A-7150-373C-9CAA-1507F0A58DF1 + 755064 (0x18554a578 in QuartzCore)
frame #29: CC806D5A-7150-373C-9CAA-1507F0A58DF1 + 930504 (0x1855752c8 in QuartzCore)
frame #30: 186F3A78-108A-3057-A67E-800A88EBFF00 + 11851464 (0x184c986c8 in UIKitCore)
frame #31: 4D6DD6DD-22E4-3858-9A0C-3CB77C2F13D6 + 632400 (0x182356650 in CoreFoundation)
frame #32: 4D6DD6DD-22E4-3858-9A0C-3CB77C2F13D6 + 628964 (0x1823558e4 in CoreFoundation)
frame #33: 4D6DD6DD-22E4-3858-9A0C-3CB77C2F13D6 + 606324 (0x182350074 in CoreFoundation)
frame #34: CFRunLoopRunSpecific + 572 (0x18234f818 in CoreFoundation)
frame #35: GSEventRunModal + 160 (0x198a55570 in GraphicsServices)
frame #36: 186F3A78-108A-3057-A67E-800A88EBFF00 + 11731176 (0x184c7b0e8 in UIKitCore)
frame #37: UIApplicationMain + 164 (0x184c80664 in UIKitCore)
frame #38: main + 84 (0x10238d1cc in HelloWorld)
frame #39: 5FFFB964-39D6-3CCF-BD34-C6CA4A148D1A + 4416 (0x18202e140 in libdyld.dylib)
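The trace is easier to follow once the frames reported in the app binary are separated from the system frames. The following is a minimal sketch that parses lines in the `frame #N: symbol + offset (address in image)` format shown above. The function name `parse_frames` and the regular expression are illustrative assumptions, not part of any PyTorch tooling, and lines that do not fit the format are simply skipped.

```python
import re

# Matches lines such as:
#   frame #34: CFRunLoopRunSpecific + 572 (0x18234f818 in CoreFoundation)
FRAME_RE = re.compile(
    r"^frame #(?P<index>\d+): (?P<symbol>\S+) \+ (?P<offset>\d+) "
    r"\((?P<address>0x[0-9a-f]+) in (?P<image>[^)]+)\)$"
)


def parse_frames(lines):
    """Return one dict per line that matches the frame format; skip the rest."""
    frames = []
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match:
            frame = match.groupdict()
            frame["index"] = int(frame["index"])
            frame["offset"] = int(frame["offset"])
            frames.append(frame)
    return frames


# Three lines copied from the trace above.
sample = [
    "frame #5: _ZN2at6native15constant_pad_ndERKNS_6TensorEN3c108ArrayRefIxEERKNS4_6ScalarE + 1148 (0x102e032cc in HelloWorld)",
    "frame #34: CFRunLoopRunSpecific + 572 (0x18234f818 in CoreFoundation)",
    "frame #37: UIApplicationMain + 164 (0x184c80664 in UIKitCore)",
]

# Keep only frames reported in the app image, where the PyTorch symbols appear.
for frame in parse_frames(sample):
    if frame["image"] == "HelloWorld":
        print(frame["index"], frame["symbol"][:40])
```

Read this way, the frames in the HelloWorld image show `at::native::constant_pad_nd` (frame #5) calling `empty` with a `memory_format` argument (frame #4), which reaches `at::native::metal::empty` (frame #1), where the check fails.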
Expected behavior
Environment
Please copy and paste the output from our environment collection script (or fill out the checklist below manually).
You can get the script and run it with:
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
PyTorch version: 1.10.0a0+git512448a
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 11.5 (x86_64)
GCC version: Could not collect
Clang version: 12.0.5 (clang-1205.0.22.11)
CMake version: version 3.19.6
Libc version: N/A
Python version: 3.9.5 (default, May 18 2021, 12:31:01) [Clang 10.0.0 ] (64-bit runtime)
Python platform: macOS-10.16-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] torch==1.10.0a0+git512448a
[conda] blas 1.0 mkl
[conda] mkl 2021.2.0 hecd8cb5_269
[conda] mkl-include 2021.2.0 hecd8cb5_269
[conda] mkl-service 2.3.0 py39h9ed2024_1
[conda] mkl_fft 1.3.0 py39h4a7008c_2
[conda] mkl_random 1.2.1 py39hb2f4e1b_2
[conda] numpy 1.20.2 py39h4b4dc7a_0
[conda] numpy-base 1.20.2 py39he0bd621_0
[conda] torch 1.10.0a0+git512448a dev_0