How to pretrain/select CNN for Biomedical Video Analysis

Hi everyone

I am trying to train a model in the biomedical domain for a rather specialized task (flow prediction).
My input consists of video clips, and I would like to predict either a single image or a video. I have pixel-wise ground truth for every frame of each video, but the number of videos is very limited.

Architecture-wise, I am considering a CNN + RNN combination, where the CNN provides a representation of each input frame and the RNN learns the temporal relationship between frames.
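
To make the CNN + RNN idea concrete, here is a minimal PyTorch sketch. It is only an illustration under assumptions, not a recommended design: the class name `FrameEncoderRNN`, the layer sizes, the 32x32 output size, and the small stand-in CNN are all hypothetical, and a pretrained backbone could take the place of `self.cnn`.

```python
import torch
import torch.nn as nn


class FrameEncoderRNN(nn.Module):
    """Per-frame CNN encoder followed by a GRU over time (illustrative only)."""

    def __init__(self, feat_dim: int = 64, hidden_dim: int = 128, out_size: int = 32):
        super().__init__()
        self.out_size = out_size
        # Small stand-in CNN (hypothetical); a pretrained backbone could replace it.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # one feature vector per frame
        )
        self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # Map the last hidden state to a coarse single-channel output image.
        self.head = nn.Linear(hidden_dim, out_size * out_size)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, 3, height, width)
        b, t, c, h, w = clip.shape
        feats = self.cnn(clip.reshape(b * t, c, h, w))  # (b*t, feat_dim, 1, 1)
        feats = feats.reshape(b, t, -1)                 # (b, t, feat_dim)
        _, last_hidden = self.rnn(feats)                # (1, b, hidden_dim)
        out = self.head(last_hidden[-1])                # (b, out_size * out_size)
        return out.reshape(b, 1, self.out_size, self.out_size)


model = FrameEncoderRNN()
dummy_clip = torch.randn(2, 8, 3, 64, 64)  # 2 clips of 8 RGB frames each
prediction = model(dummy_clip)             # one coarse image per clip
```

Note that the global average pooling in this sketch throws away spatial layout, so it is probably too lossy for pixel-wise targets; keeping feature maps and using a convolutional recurrent cell would be one way around that.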


Now my question is: what kind of CNN should I use, and on what should I pretrain it? Since I am working with biomedical data, I would assume that ImageNet, as well as most other image datasets, does not really help, as the image content is very different. Are there datasets, tasks, or networks I could use for this purpose?

What kind of medical images are you dealing with?
Since you are dealing with flow, I would assume some kind of ultrasound (US) Doppler?
Depending on your image modality, you might find some open-source datasets.
If your use case is academic, it might be even simpler, as not many datasets are released under CC0.

How many videos / images do you have?
Maybe you could indeed start with a standard pretrained model and fine-tune all layers, if you have enough images.

The input is pure RGB video of blood vessels captured by a microscope.
In terms of data, I currently have about 50 clips of varying quality, each around a minute long.

Do you think a pretrained model would adapt to such a different domain?

Hi masus04, I have previously worked on medical image analysis and was recently able to use the ImageNet-pretrained networks available in PyTorch on ultrasound images and get really good results. I didn't freeze any layers and experimented with a few hyperparameters. Overall, I think it would be a good idea to try these pretrained models first and move to your own domain-specific pretraining only if that does not work.

Hi Mazhar
So you loaded an ImageNet-pretrained CNN and used some hidden layer's output as your representation?
I’ll definitely try it, thank you.

Hi, I can confirm what @Mazhar_Shaikh said. I worked with cell images (pretty far from ImageNet as well), and a model pretrained on ImageNet improved my results a lot. As Mazhar said, I did not freeze the layers; I just used the pretrained weights as initialization for the corresponding part of my network (the encoder of a U-Net).

Good luck
