I’m exporting a model to ONNX for use in OpenCV, and have had to avoid a few ops that aren’t supported.
A previous version of my model exports to ONNX successfully, but after making some changes, I am now getting the following error:
RuntimeError: Exporting the operator resolve_conj to ONNX opset version 11 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.
I’m not using the `resolve_conj()` op explicitly in my code, so I assume it’s inserted during the JIT trace, or it’s a sub-operation inside another op. My understanding is that this op resolves a lazy conjugate view into a materialized tensor, so my guess is that it’s related to one of the `permute()` ops I am using.
How can I figure out where it’s been inserted in the graph, and how can I link that back to my actual code so I can make the appropriate changes to allow ONNX export?
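For context, the conjugate-view behaviour mentioned above can be seen directly in eager mode (a minimal sketch; the tensor values here are arbitrary):

```python
import torch

t = torch.tensor([1 + 2j, 3 - 4j])
c = t.conj()               # lazy view: conjugation is not materialized yet
print(c.is_conj())         # True
r = c.resolve_conj()       # materializes the conjugate into a plain tensor
print(r.is_conj())         # False
```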
If the error is raised in the PyTorch backend (not in ONNX Runtime or similar), you might be able to use `TORCH_SHOW_CPP_STACKTRACES=1` to get a better stack trace.
Thanks for the reply, but the error doesn’t appear to be in the C++ backend. I should have been more explicit, but I tried to leave out the extraneous details.
The error is being raised in Python when the op can’t be found, after I call `torch.onnx.export()`. Here’s the abridged trace:
torch.onnx.export(opencv_encoder, image, opencv_model_output, opset_version=11, verbose=True, input_names=["input"], output_names=["patch_scores"])#, "image_score"])
opencv_model = onnx.load(opencv_model_output)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\__init__.py:316, in export(...)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\utils.py:107, in export(...)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\utils.py:724, in _export(...)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\utils.py:497, in _model_to_graph(...)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\utils.py:216, in _optimize_graph(...)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\__init__.py:373, in _run_symbolic_function(...)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\utils.py:1028, in _run_symbolic_function(...)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\utils.py:982, in _find_symbolic_in_registry(...)
File ~\.virtualenvs\pytorch-jk_rFARN\lib\site-packages\torch\onnx\symbolic_registry.py:125, in get_registered_op(...)
It looks like it’s failing because there’s a `resolve_conj()` call somewhere and exporting that op to ONNX hasn’t been implemented. Is that the right understanding here, or do you think that I’ve hit a more significant bug somewhere?
I certainly expect that some ops won’t work with ONNX, but I’m a bit stuck on how to go about finding where the op is coming from, aside from simply adding and removing ops until the export succeeds.
For posterity here’s the environment I’m currently testing in:
- Windows 10 20H2
- Python 3.9.6
- PyTorch 1.10.2+cpu installed via Pipenv 2021.5.29
No, I think your assumption is correct, and I would expect to see the failing op in the stack trace. I.e., are any function calls shown in `_run_symbolic_function` or any other failed call? Each frame should point to a line of code, and I would hope the failing operation is shown there too (with its call history).
The traces I was getting don’t have parameters in them, and in retrospect I probably should have just stepped through the symbolic function parsing in the debugger, but I was thinking there might be an obvious way to deal with this.
I’ve figured out where the problem was, though. Because I was debugging a workaround, I had thoroughly peppered my code with `print()` statements, and it appears that printing a tensor slice invokes `resolve_conj()` on the tensor.
Here’s a minimal reproducible example:
import torch
from torch import nn

class BadFirst(nn.Module):
    def forward(self, x):
        x_slice = x[:, 0]
        print(f"x_slice: {x_slice}")  # printing the slice invokes resolve_conj()
        return x

if __name__ == "__main__":
    m = BadFirst().eval()
    x = torch.rand(10, 5)

    res = m(x) # this works
    torch.onnx.export(m, x, "badfirst.onnx") # this doesn't
I assumed the JIT trace would end up trimming those dead branches from the graph (e.g. via a dead-code-elimination pass) before passing the result to the ONNX optimiser, but it seems like that isn’t the case.
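For anyone hitting the same thing, one way to keep debug prints out of the trace (a sketch; `debug_print` is just a name I’ve made up here, not a torch API) is to gate them on `torch.jit.is_tracing()`:

```python
import torch

def debug_print(t):
    # torch.jit.is_tracing() returns True while torch.onnx.export (or
    # torch.jit.trace) is tracing the model, so debug output is skipped
    # during export and nothing extra is recorded in the graph
    if not torch.jit.is_tracing():
        print(t)
```

Calling `debug_print(x_slice)` inside `forward()` then prints normally in eager mode but stays out of the exported graph.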
Anyway, thanks for the sanity check. I’ll check the repo to see if anyone has mentioned this previously and I’ll open an issue if not.