Creating a custom loss function with the C++ API

vogec · February 15, 2019, 1:58pm

Hello all.
I am new to the C++ API in particular.
Has anybody managed to create their own custom loss function with the API?
Are there predefined regularizers (L2 or similar, ideally Lipschitz) available? How are they used?
I could not find any examples of this via Google; maybe I overlooked something.

I am happy for any advice,
thanks guys.

goksug · October 19, 2019, 12:14am

Hi there,

Have you found a source for that?

mhubii · October 24, 2019, 4:05pm

As far as I understand, a loss function behaves basically like a layer. A custom loss function should therefore be implementable by following Peter’s guide on how to write a custom layer: https://pytorch.org/tutorials/advanced/cpp_extension.html

You need to define both the forward and the backward pass of that loss function, which means that you have to derive the gradient of your custom loss function for the backward pass. However, I am not entirely sure that this is true. Can someone confirm that?
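To make the forward/backward pairing concrete, here is a minimal sketch in plain C++. It deliberately does not use LibTorch, so it compiles on its own. The struct name `HuberLoss`, the `delta` parameter, and the method signatures are hypothetical choices for illustration and not part of any PyTorch API. The Huber loss is only an example of a "custom" loss. The sketch shows what a hand-written backward has to compute: the derivative of the scalar loss with respect to each prediction, scaled by the incoming gradient.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical custom loss: mean Huber loss with threshold `delta`.
// Plain C++ illustration only; this is not LibTorch code.
struct HuberLoss {
    double delta;

    // Forward pass: scalar loss, averaged over all elements.
    double forward(const std::vector<double>& pred,
                   const std::vector<double>& target) const {
        double sum = 0.0;
        for (std::size_t i = 0; i < pred.size(); ++i) {
            const double r = pred[i] - target[i];
            const double a = std::fabs(r);
            // Quadratic near zero, linear for large residuals.
            sum += (a <= delta) ? 0.5 * r * r : delta * (a - 0.5 * delta);
        }
        return sum / static_cast<double>(pred.size());
    }

    // Backward pass: d(loss)/d(pred[i]), scaled by the upstream gradient.
    std::vector<double> backward(const std::vector<double>& pred,
                                 const std::vector<double>& target,
                                 double grad_output = 1.0) const {
        std::vector<double> grad(pred.size());
        const double scale = grad_output / static_cast<double>(pred.size());
        for (std::size_t i = 0; i < pred.size(); ++i) {
            const double r = pred[i] - target[i];
            // Derivative is r in the quadratic region, +/- delta outside it.
            const double d =
                (std::fabs(r) <= delta) ? r : (r > 0.0 ? delta : -delta);
            grad[i] = scale * d;
        }
        return grad;
    }
};
```

With `delta = 1`, `pred = {0.5, 3.0}` and `target = {0.0, 0.0}`, `forward` returns 1.3125 and `backward` returns `{0.25, 0.5}`. Comparing the analytic gradient against a finite difference of `forward` is a cheap sanity check before wiring such a pair into a real extension.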

tom · October 27, 2019, 7:31pm

So there are two cases regarding the autograd graph:

If your operations to compute the loss are torch functions, the autograd engine can compute the backward just like in Python.
If, on the other hand, you implement custom computation kernels (as Peter does in the tutorial), you need to provide a backward. The old-style approach (shown in the tutorial) was to export both forward and backward and wrap them in a Python autograd.Function. These days, you would probably create an autograd::Function in C++, similar to what we do in TorchVision.

Another thing to keep in mind is that if you want JIT compatibility, you need to register operators (as in the custom op tutorial). Unless you need funny data types to be passed in from Python, there isn’t really a downside to using these.

Best regards

Thomas

mhubii · October 28, 2019, 10:48am

Thanks, Thomas, for these valuable insights. Just as a general question: detectron2 is implemented the way Peter explained in his tutorial on C++ extensions, https://github.com/facebookresearch/detectron2/tree/master/detectron2/layers

Is there any advantage of the old way over the autograd::Function implementation? Implementing it via autograd::Function in C++ is JIT compatible as well, right?

tom · October 28, 2019, 2:50pm

So the new way is just very new (we added this to TorchVision in September). They probably branched off the maskrcnn-benchmark or TorchVision earlier than that and didn’t get the update.
Previously, you had the problem that “JIT means CustomOp means implemented in C++”, and that meant no autograd without jumping through lots of hoops. In August or so, we gained the C++ autograd::Function, so now you can easily have CustomOps that record a custom backward.

This still isn’t the “ultimate JIT-friendly thing”, which would be more like graph-to-graph differentiation. Sadly, I haven’t gotten around to finishing that yet (I’ll see if adding a voting mechanism for prioritizing my PyTorch work helps…).

Best regards

Thomas