The CIFAR10 tutorial might be a good starter, as it’s creating a new model.
The Conv2d docs give you some information about the expected input arguments, such as
For your use case it seems that that first conv layer should be defined as:
conv = nn.Conv2d(in_channels=?, out_channels=16, kernel_size=3)
The following layers would then use
in_channels=16, since this is the number of channels in the output activation from the preceding layer.
It’s fine. Thanks. What about the residual networks?
You could reuse the approach used in
torchvision's resnet implementation as seen here.