Hello,
Being one of the early adopters and advocates of PyTorch, I have the strange feeling that Libtorch is, to some extent, neglected. This is evident in the low level of activity and chatter in the C++ sub-forum.
I find that it is very hard to get answers to even the simplest questions, such as: How do I work in batches during inference and/or training for ImageNet-like datasets? How do I utilise multiple GPUs during training and inference of traced modules? Where can I find an example of loading an ImageNet-like dataset, NOT MNIST? And much more.
All of these are trivial in PyTorch/Python and seem almost impossible in Libtorch.
References to my questions:
Is there a formal policy that allocates more resources to PyTorch-related questions than to Libtorch ones? If not, what is the explanation for the lack of documentation and the infrequent answers in the forums?
I think the simple answer is that almost all people answering questions are doing it in their free time and answer mostly the questions they are comfortable with. And I don’t think any of the very active people here actually use libtorch.
At least this is the reason why I don’t usually answer such questions unless they concern one of the few things I have touched in libtorch.
Maybe @smth can give a higher level answer on this? But the day to day answer I’m sure is the one above.
Sure, this makes a lot of sense. However, the severe lack of documentation and examples makes it very hard to progress with projects based on C++. For instance, there is no example for such a basic flow as loading a batch of images during inference, with or without an ImageNet-style dataloader, and then processing the tensors that are returned.
Thanks for bringing up this issue. I think your characterization that LibTorch is neglected, at least in comparison to areas like the Python Frontend and TorchScript, is very fair. We certainly don’t have a policy that allocates more resources to some questions over others; as @albanD suggests, we rely on the expertise of the community and our internal developers in order to answer questions, and that naturally (but unfortunately) skews away from LibTorch currently.
Let me talk about our current investment in LibTorch / C++ API and what we plan to change. By far our largest current investment in this area is the Variable/Tensor merge in C++; you may have seen https://github.com/pytorch/pytorch/pull/17072, for example. We decided to invest in this way because: 1) it was the largest complaint we heard about the C++ API, particularly from our internal power users, and 2) it unblocks performance gains that will affect all of PyTorch (Python, C++, JIT). However, this has unfortunately crowded out other investments in LibTorch / C++, as you noted.
Our plan to remedy this is two-fold:
1) The Variable/Tensor merge project is a one-time cost, so we will be able to naturally shift attention to LibTorch / C++ more generally once it is complete (should be soon).
2) We are also ramping up more people on LibTorch / C++ internally, so we can provide better and broader coverage of issues.
Again, thanks for bringing this up, and hopefully this gives you some insight into our process and investments in this area.
I just answered your second question, on how to forward batches.
Concerning your third question, you can find an example of a custom dataset here, and I will try to include an explanation of how to load images with OpenCV.
Now here is a little classifier that classifies apples and bananas with a custom data loader. I will polish the readme soon so people will know how to use it.
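For anyone who wants a dependency-free starting point, here is a minimal sketch of just the indexing half of an ImageNet-style loader in plain C++17. It is not taken from the linked example. The names `Sample`, `index_image_folder` and `make_batches` are hypothetical, and it assumes a `root/<class_name>/<image>` folder layout with labels assigned by sorted class-directory name, similar to what torchvision's `ImageFolder` does. Decoding the images and converting them to tensors is deliberately left out.

```cpp
// Sketch only: indexes an ImageNet-style folder tree (root/<class_name>/<image>)
// and splits the index into batches. Image decoding and tensor conversion are omitted.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical record type: one image path plus its integer class label.
struct Sample {
    fs::path path;
    int label;
};

// Walk root/<class_name>/* and assign labels by sorted class-directory order.
std::vector<Sample> index_image_folder(const fs::path& root) {
    std::vector<fs::path> class_dirs;
    for (const auto& entry : fs::directory_iterator(root)) {
        if (entry.is_directory()) class_dirs.push_back(entry.path());
    }
    std::sort(class_dirs.begin(), class_dirs.end());

    std::vector<Sample> samples;
    for (std::size_t label = 0; label < class_dirs.size(); ++label) {
        std::vector<fs::path> files;
        for (const auto& entry : fs::directory_iterator(class_dirs[label])) {
            if (entry.is_regular_file()) files.push_back(entry.path());
        }
        std::sort(files.begin(), files.end());  // keep the order deterministic
        for (const auto& file : files) {
            samples.push_back({file, static_cast<int>(label)});
        }
    }
    return samples;
}

// Split the index into consecutive batches; the last batch may be smaller.
std::vector<std::vector<Sample>> make_batches(const std::vector<Sample>& samples,
                                              std::size_t batch_size) {
    assert(batch_size > 0 && "batch_size must be positive");
    std::vector<std::vector<Sample>> batches;
    for (std::size_t start = 0; start < samples.size(); start += batch_size) {
        const std::size_t end = std::min(start + batch_size, samples.size());
        batches.emplace_back(samples.begin() + static_cast<std::ptrdiff_t>(start),
                             samples.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return batches;
}
```

A caller would run `make_batches(index_image_folder(root), 32)`, decode each batch's images (for example with OpenCV), convert them to tensors, combine them with `torch::stack`, and pass the resulting batch tensor to the module's `forward`. That second half depends on your preprocessing, so it is not shown here.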
But then, we can help the C++ lib a lot by building on his (and the others’) great conceptual work and fleshing out the details, e.g. bidirectional RNNs, anyone?
Similarly the infrastructure for C++ docs is all there, so anyone can contribute.
I can’t fill in for even half a Peter, but I guess I know a thing or two about hacking on PyTorch, and I’d be happy to (lightly) mentor one or two people willing to work on things. Do tag me on an issue/PR or send mail.