In particular, I am new to the C++ API.
Has anybody managed to create their own custom loss function with the API?
Are there predefined regularizers available (L2 or similar, better yet Lipschitz)? How are they used?
I could not find any examples of that via Google – maybe I overlooked something.
I would be happy for any advice.
Have you found a source for that?
As far as I understand, a loss function basically behaves just like a layer. Therefore, a custom loss function should be implementable by following Peter’s guide on how to write a custom layer: https://pytorch.org/tutorials/advanced/cpp_extension.html
It is necessary to define both the forward and the backward pass of that loss function, which means that you need to derive the gradient of your custom loss function for the backward pass. However, I am not entirely sure if that is true. Can someone confirm that?
So there are two cases regarding the autograd graph:
- If your operations to compute the loss are torch functions, the autograd engine can compute the backward just like in Python.
- If, on the other hand, you implement custom computation kernels (as Peter does in the tutorial), you need to provide a backward. The old-style approach (shown in the tutorial) was to export both forward and backward and wrap them in an autograd.Function on the Python side. These days, you would probably create an autograd::Function in C++, similar to what we do in TorchVision.
Another thing to keep in mind is that if you want JIT compatibility, you need to register operators (as in the custom op tutorial). Unless you need funny data types to be passed in from Python, there isn’t really a downside to using these.
Thanks Thomas for these valuable insights. Just as a general question: detectron2 was implemented the way Peter explains in his tutorial on C++ extensions, https://github.com/facebookresearch/detectron2/tree/master/detectron2/layers
Is there any advantage of the old way over the autograd::Function implementation? Implementing it via autograd::Function in C++ is JIT-compatible as well, right?
So the new way is just very new (we added this to TorchVision in September). They probably branched off the maskrcnn-benchmark or TorchVision earlier than that and didn’t get the update.
Previously, you had the problem that “JIT means CustomOp means implemented in C++”, and that meant no autograd without jumping through lots of hoops. In August or so, we gained the C++ autograd::Function, so now you can easily have CustomOps that record a custom backward.
This still isn’t the “ultimate JIT-friendly thing”, which would be more like graph-to-graph differentiation; sadly, I haven’t gotten to finish that up yet (I’ll see if adding a voting mechanism for prioritizing my PyTorch work helps…).