How to add a recurrent unit to a layer of resnet50?

Hi, can anybody offer me some tutorials or suggestions about how to add a recurrent unit to a specific layer of resnet50?

Could you describe what a “recurrent unit” would be?
Do you want to reuse the output of a specific ResNet layer and feed it back as its input inside a loop?
If so, you could directly implement this behavior in the forward method of your resnet model definition.
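Something like this rough, untested sketch could be a starting point. It loops over layer3 of torchvision's resnet50; the 1x1 projection conv, the interpolation back to layer3's input size, and the number of iterations are all arbitrary choices made just for illustration (weights=None assumes a recent torchvision):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class RecurrentResNet(nn.Module):
    """Feed the output of layer3 back into its own input for a few iterations.
    The 1x1 projection, the interpolation, and num_iters are arbitrary choices."""
    def __init__(self, num_iters=3):
        super().__init__()
        resnet = models.resnet50(weights=None)
        self.stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool,
                                  resnet.layer1, resnet.layer2)
        self.layer3 = resnet.layer3
        # layer3 maps 512 -> 1024 channels, so project the feedback back to 512 channels
        self.feedback = nn.Conv2d(1024, 512, kernel_size=1)
        self.tail = nn.Sequential(resnet.layer4, resnet.avgpool)
        self.fc = resnet.fc
        self.num_iters = num_iters

    def forward(self, x):
        x = self.stem(x)
        for _ in range(self.num_iters):
            out = self.layer3(x)                              # (B, 1024, H/2, W/2)
            fb = F.interpolate(self.feedback(out), size=x.shape[-2:])
            x = x + fb                                        # feedback added to layer3's input
        x = self.tail(self.layer3(x))                         # final pass through layer3 + layer4
        return self.fc(torch.flatten(x, 1))


model = RecurrentResNet()
out = model(torch.randn(2, 3, 224, 224))                      # -> (2, 1000)
```

Note that in this sketch the feedback projection is just another module, so it would be trained jointly with the feedforward layers by the usual backward pass.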

Hi @ptrblck, yeah, that is one of my plans: to feed the output back to the previous layer. But I have no idea whether the feedback part should be trained together with the feedforward part. Do you have tutorials to share? Another plan is to combine resnet50 and an LSTM; I would appreciate any relevant tutorial. Thanks!

What do you mean by “feedback” and “previous layer”?

The backward pass and the optimizer step provide optimization “feedback” to all parameters with requires_grad=True.

Or do you mean you want to pass a state between steps in a time series? That is what recurrent neural networks (RNNs) do: provide some internal (and learnable) context of what’s happened up to that point.
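To make that concrete, here is a tiny, self-contained example; the sizes are arbitrary and only meant to show how an LSTM carries its hidden state across the steps of a sequence:

```python
import torch
import torch.nn as nn

# An LSTM carries a hidden state (h, c) from one time step to the next.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(4, 10, 8)       # (batch, time steps, features)
output, (h_n, c_n) = lstm(x)    # the state is propagated internally across the 10 steps

print(output.shape)             # torch.Size([4, 10, 16]) - one output per time step
print(h_n.shape)                # torch.Size([1, 4, 16])  - final hidden state
```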

Hi, actually I don’t have enough knowledge about recurrence. My plan is to modify a feedforward neural network with a recurrence mechanism. I have resnet50, and I wonder whether I can add something like a ‘recurrent unit’ to it. For example, the input to the next layer would also be given back to the layer that output it. If I have misunderstood recurrence, please correct me. I would prefer to start with something relatively easy as far as recurrent architectures go. Do you have some tips for me?
Another plan is to use an LSTM, which as I understand it is a special kind of recurrent architecture. I wonder whether someone can offer some useful tutorials on combining resnet50 and an LSTM. My input to resnet50 is images of size 3 * 224 * 224. After feature extraction by resnet50, I wonder whether I can use an LSTM to predict brain signals corresponding to these images. That is to say, the input to resnet50 would be image data of size (batch size * channel * height * width), and the output of the LSTM layer would be brain signals of size (image batch size * channel of brain signal * time points)?

Do you have a label at each time step? If not, you might be better off using VideoResNet, found in torchvision.models.video.

That makes use of 3D convolutions: two dimensions are spatial and one is temporal.
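As a rough sketch of how it is used (the clip size below is just an example, and weights=None assumes a recent torchvision):

```python
import torch
from torchvision.models.video import r3d_18  # one of torchvision's VideoResNet variants

model = r3d_18(weights=None)

# video models expect input shaped (batch, channels, time, height, width)
clip = torch.randn(2, 3, 16, 112, 112)   # 2 clips of 16 frames at 112x112
out = model(clip)
print(out.shape)                          # torch.Size([2, 400]) - Kinetics-400 classes by default
```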

If you have some sort of targets or labels to train on at each time step, you could change the fc layer of Resnet to something like model.fc = nn.LSTM(in_channels, out_channels, batch_first=True) and also reshape the data going into that layer in the forward pass to x = x.unsqueeze(1) and the output to x.squeeze(1).
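Something along these lines (an untested sketch, adapted so the LSTM unrolls over the brain-signal time points rather than a single unsqueezed step; the signal channel and time-point sizes are made-up placeholders, and weights=None assumes a recent torchvision):

```python
import torch
import torch.nn as nn
from torchvision import models

class ResNetLSTM(nn.Module):
    """resnet50 as a feature extractor with an LSTM head predicting a signal over time.
    signal_channels and time_points are placeholders for the brain-signal shape."""
    def __init__(self, signal_channels=64, time_points=100):
        super().__init__()
        resnet = models.resnet50(weights=None)
        in_features = resnet.fc.in_features      # 2048 for resnet50
        resnet.fc = nn.Identity()                # keep the pooled 2048-d features
        self.backbone = resnet
        self.lstm = nn.LSTM(in_features, signal_channels, batch_first=True)
        self.time_points = time_points

    def forward(self, x):
        feats = self.backbone(x)                                 # (batch, 2048)
        # repeat the image features at every time step so the LSTM can unroll over time
        feats = feats.unsqueeze(1).repeat(1, self.time_points, 1)
        out, _ = self.lstm(feats)                                # (batch, time_points, signal_channels)
        return out.permute(0, 2, 1)                              # (batch, signal_channels, time_points)


model = ResNetLSTM()
pred = model(torch.randn(8, 3, 224, 224))                        # -> (8, 64, 100)
```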

Hi, thanks for your information. Yes, I have some biological brain signals as the ground truth/labels for model training. I will first try the approach you mentioned of modifying the fc layer of resnet50. Many thanks!