Implement a 3D custom model

robsver · May 3, 2023, 9:46am

Hi everyone, I would like to implement a single-channel 3D PyTorch model with two 3D convolutions followed by two linear layers, one that works independently of the batch size and the shape of the input volume.

Any suggestions?

Thanks

J_Johnson · May 3, 2023, 10:27am

Something like this is close to what you’re asking for:

github.com

xmuyzz/3D-CNN-PyTorch/blob/master/models/C3DNet.py

import math
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.nn.functional as F
from torch.autograd import Variable
from functools import partial


class C3D(nn.Module):

    """
    This is the c3d implementation with batch norm.

    [1] Tran, Du, et al. "Learning spatiotemporal features with 3d convolutional networks."
    Proceedings of the IEEE international conference on computer vision. 2015.
    """

    def __init__(self, sample_size, sample_duration, num_classes=600, in_channels=1):

Except that it has more than two conv layers, plus some added pooling, dropout, and batch norm.

It should help get you on the right track.

robsver · May 3, 2023, 10:45am

Thank you. Just one question: what are sample_duration and sample_size?

J_Johnson · May 3, 2023, 11:17am

Not my code, but those arguments are used to calculate the input size of the first Linear layer from the output of the final conv layer. However, you can just use nn.LazyLinear:
https://pytorch.org/docs/stable/generated/torch.nn.LazyLinear.html

Just specify the out_features you want, and it will infer the input size during the first forward pass.
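As a minimal sketch of that behavior (the batch size and the 500 input features are arbitrary values picked for illustration), LazyLinear takes the size of the last dimension of its first input as in_features:

```python
import torch
import torch.nn as nn

# LazyLinear only needs out_features; in_features is inferred from the
# last dimension of the first input it sees.
layer = nn.LazyLinear(out_features=64)

x = torch.randn(2, 500)  # hypothetical batch: 2 samples, 500 features each
y = layer(x)             # the first forward pass materializes the weight

print(y.shape)             # torch.Size([2, 64])
print(layer.in_features)   # 500
print(layer.weight.shape)  # torch.Size([64, 500])
```

Note that only the last dimension is used for the inference; any leading dimensions are treated as batch dimensions and passed through unchanged.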

robsver · May 3, 2023, 3:46pm

Thanks a lot @J_Johnson !
The LazyLinear was really useful.

I wrote this snippet of code (batch = 2, 1 channel, 130x130x130 image) and I expected an output tensor of shape (2, 3), but I got (2, 64, 64, 64, 3) instead.

What am I missing?
Thanks a lot.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CustomNet(nn.Module):
    def __init__(self):
        super(CustomNet, self).__init__()
        self.seq1 = nn.Sequential(
            nn.Conv3d(in_channels=1, out_channels=64, kernel_size=3, stride=1),
            nn.BatchNorm3d(64),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(2,2,2))
        )
        self.l1 = nn.LazyLinear(out_features=64, bias=False)
        self.l2 = nn.Linear(64, 3, bias=False)

    def forward(self, x):
        x1 = self.seq1(x)
        x2 = self.l1(x1)
        x3 = self.l2(x2)

        return x3

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
fake_input = torch.randn((2,1,130,130,130)).to(device) # Batch = 2, 1 channel, 130x130x130 image

model = CustomNet().to(device)

outputs = model(fake_input)

print(outputs.size())

J_Johnson · May 4, 2023, 2:32am

This post has the output size calculation from the docs written as a Python function. Enter the arguments for one dimension of your Conv3d and MaxPool3d layers, along with the input size along that dimension, to get the output size you should expect from that layer.
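A rough sketch of such a function, written from the output shape formula in the Conv3d and MaxPool3d docs (this is not the function from the linked post, and the name conv_output_size is made up for illustration):

```python
def conv_output_size(in_size, kernel_size, stride=1, padding=0, dilation=1):
    """Output size along one spatial dimension of a conv or max-pool layer.

    Follows the shape formula in the PyTorch docs:
    floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
    """
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Conv3d(kernel_size=3, stride=1) applied to a size of 130
after_conv = conv_output_size(130, kernel_size=3, stride=1)

# MaxPool3d(kernel_size=2); its stride defaults to the kernel size
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2)

print(after_conv, after_pool)  # 128 64
```

With 64 output channels, the conv block in the snippet above therefore produces a tensor of shape (2, 64, 64, 64, 64). Linear layers act only on the last dimension, so the two linear layers turn it into (2, 64, 64, 64, 3) rather than (2, 3).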

robsver · May 7, 2023, 8:46pm

Thanks @J_Johnson, I really appreciate your reply. I solved it by adding a Flatten layer before the LazyLinear layer in my custom model.
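For reference, a sketch of what the model might look like with that fix. The input here is a smaller 18x18x18 volume, chosen only to keep the example light; everything else follows the snippet posted earlier:

```python
import torch
import torch.nn as nn


class CustomNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.seq1 = nn.Sequential(
            nn.Conv3d(in_channels=1, out_channels=64, kernel_size=3, stride=1),
            nn.BatchNorm3d(64),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(2, 2, 2)),
        )
        # Collapse (N, C, D, H, W) into (N, C*D*H*W) so the linear layers
        # see one feature vector per sample.
        self.flatten = nn.Flatten()
        self.l1 = nn.LazyLinear(out_features=64, bias=False)
        self.l2 = nn.Linear(64, 3, bias=False)

    def forward(self, x):
        x = self.seq1(x)
        x = self.flatten(x)
        x = self.l1(x)
        return self.l2(x)


model = CustomNet()
fake_input = torch.randn(2, 1, 18, 18, 18)  # smaller volume, for illustration only
outputs = model(fake_input)

print(outputs.size())  # torch.Size([2, 3])
```

Two caveats. With the original 130x130x130 input, the flattened vector has 64 x 64 x 64 x 64 = 16,777,216 features, so l1 alone would hold about 1.07 billion weights (roughly 4 GiB in float32). Also, once LazyLinear has inferred its input size on the first forward pass, the model only accepts volumes that flatten to that same size; one possible way around both issues, not tested here, is an nn.AdaptiveAvgPool3d layer before the Flatten.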

Have a nice day.