Instance segmentation with torchvision

AndriPi · February 25, 2021, 2:45pm

Hi,

I would like to quickly build an instance segmentation model on a dataset I received, and I would like to try torchvision out, since it looks like the most user-friendly CV framework in PyTorch. Which models are available for the instance segmentation task in torchvision? It looks like the only one is

https://pytorch.org/vision/stable/models.html#mask-r-cnn

Is there a tutorial to use Mask-RCNN in torchvision? In particular, I need to know how to build the DataLoader. Which format must the annotations have?

KFrank · February 25, 2021, 3:32pm

Hi Andri!

The TorchVision Object Detection Finetuning Tutorial should give you
what you need.

From memory, it’s a bit imperfect – maybe a little out of date with some
minor inconsistencies – but torchvision’s maskrcnn_resnet50_fpn
does work, although you will have to do some relatively
straightforward work to adapt it to your use case.

Note that you get a pre-trained version “for free,” so you need a lot less
data than if you were training it from scratch.
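
On the annotation-format question, here is a minimal sketch of the per-image target that torchvision's Mask R-CNN expects in training mode: a dict with "boxes" (float, [N, 4] as x1, y1, x2, y2), "labels" (int64, [N]) and "masks" (uint8, [N, H, W]). The helper target_from_instance_mask, the ToyInstanceDataset class, and the assumption that your annotations arrive as one integer mask per image (0 for background, a distinct positive id per object, a single foreground class) are all hypothetical, so adapt them to whatever your dataset actually provides.

```python
import torch
from torch.utils.data import Dataset, DataLoader


def target_from_instance_mask(instance_mask, class_id=1):
    """Turn an (H, W) integer mask (0 = background, one positive id per
    object) into a Mask R-CNN style target dict. Hypothetical helper."""
    obj_ids = torch.unique(instance_mask)
    obj_ids = obj_ids[obj_ids != 0]  # drop the background id

    # One binary mask per object, shape (N, H, W).
    masks = instance_mask.unsqueeze(0) == obj_ids.view(-1, 1, 1)

    # Boxes run from the minimum to the maximum pixel coordinate of each
    # mask, so an object only one pixel wide or tall would give a
    # degenerate box and should be filtered out or padded.
    boxes = []
    for m in masks:
        ys, xs = torch.where(m)
        boxes.append([xs.min().item(), ys.min().item(),
                      xs.max().item(), ys.max().item()])

    return {
        "boxes": torch.tensor(boxes, dtype=torch.float32).reshape(len(boxes), 4),
        # Assumption: every object belongs to the same class; 0 is background.
        "labels": torch.full((len(boxes),), class_id, dtype=torch.int64),
        "masks": masks.to(torch.uint8),
    }


class ToyInstanceDataset(Dataset):
    """Hypothetical dataset holding images and instance masks in memory."""

    def __init__(self, images, instance_masks):
        self.images = images
        self.instance_masks = instance_masks

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], target_from_instance_mask(self.instance_masks[idx])


def collate_fn(batch):
    # Targets differ in size from image to image, so keep images and
    # targets as tuples instead of stacking them into one tensor.
    return tuple(zip(*batch))


# Tiny made-up example: two objects in a 6x8 image.
mask = torch.zeros((6, 8), dtype=torch.int64)
mask[1:3, 1:4] = 1
mask[3:6, 5:8] = 2
target = target_from_instance_mask(mask)

images = [torch.rand(3, 6, 8) for _ in range(2)]  # float images in [0, 1]
loader = DataLoader(ToyInstanceDataset(images, [mask, mask]),
                    batch_size=2, collate_fn=collate_fn)
batch_images, batch_targets = next(iter(loader))
```

In training mode the model is then called with a list of image tensors and a list of these dicts, for example model(list(batch_images), list(batch_targets)), and it returns a dict of losses.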

Best.

K. Frank

AndriPi · February 25, 2021, 7:01pm

Hi K. Frank!

Thanks a lot for the pointer. I saw the tutorial, but given the title, I thought it was about object detection rather than instance segmentation, so I didn’t read it. I’ll dig into the Colab version. Before I try that, however, are there any fundamentals I should learn? I have no previous experience with torchvision or PyTorch, but I have some experience with fastai, and quite a bit of experience with Keras and TensorFlow, so I’m not exactly starting from scratch. So far I have pulled the NGC Docker image for PyTorch and spun up the container. This is the beginning of my PyTorch journey.

KFrank · February 25, 2021, 7:20pm

Hi Andri!

At a general conceptual level, pytorch and keras/tensorflow do pretty
much the same thing, so that’s a big part of what you need to learn
and understand. There are, however, significant differences in the
architecture and the details of implementation.

You could just jump into the deep end with Mask-RCNN, and work
through the documentation and ask questions here when issues arise.

If you like tutorials, you could work through pytorch’s A 60 Minute Blitz
tutorial (and other tutorials and introductory documentation). Whether
or not you follow a tutorial, I would suggest that you build and train a
simple toy-problem network in pytorch – just to get a feel for the
pytorch way of doing things – before jumping straight to your real
Mask-RCNN use case.
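
As an illustration of what such a toy problem might look like, here is a minimal sketch that fits a small network to a noisy line. The data, layer sizes, learning rate, and number of steps are arbitrary choices made for this example, not recommendations. The point is only the shape of a pytorch training loop: zero the gradients, run the forward pass, compute the loss, backpropagate, and step the optimizer.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the run repeatable

# Made-up toy data: y = 3x - 1 plus a little noise.
x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 3 * x - 1 + 0.05 * torch.randn_like(x)

# A small fully connected network; the sizes are arbitrary.
model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(200):
    optimizer.zero_grad()      # clear gradients from the previous step
    pred = model(x)            # forward pass
    loss = loss_fn(pred, y)    # scalar loss
    loss.backward()            # backpropagate
    optimizer.step()           # update the weights
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Coming from Keras, the main difference to notice is that there is no fit() call here: the loop is written out by hand, which is also how the Mask-RCNN tutorial's training code is structured.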

Best.

K. Frank