Is there a PyTorch function that works like Keras' TimeDistributed?

HONGYUAN_ZHU · March 25, 2017, 10:35am

Hi! I used to be a Keras user and I want to port my functions to PyTorch. I am currently working on a video classification problem that uses an architecture similar to LRCN (http://jeffdonahue.com/lrcn/), which applies a CNN to extract features from each frame and then uses an LSTM for classification. Keras has a TimeDistributed wrapper (https://keras.io/layers/wrappers/) that applies a layer to each temporal slice. Does PyTorch have a similar implementation, or how can I achieve the same thing in this case? Is there an existing PyTorch example for it?

Thanks in advance for your patience and help!!

tom · March 26, 2017, 8:58am

Hello @HONGYUAN_ZHU,

Off the top of my head, I think the model in Sean Naren’s deepspeech.pytorch does something very similar to what you want to achieve, using its SequenceWise class:

github.com/SeanNaren/deepspeech.pytorch/blob/master/model.py

import math
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.parameter import Parameter
from torch.autograd import Variable

supported_rnns = {
    'lstm': nn.LSTM,
    'rnn': nn.RNN,
    'gru': nn.GRU
}
supported_rnns_inv = dict((v, k) for k, v in supported_rnns.items())


class SequenceWise(nn.Module):
    def __init__(self, module):
        """

(This file has been truncated.)

Best regards

Thomas

HONGYUAN_ZHU · March 28, 2017, 12:27am

Hi Tom, thanks for sharing! I’ll look into that.

Bests

HY

miguelvr · March 28, 2017, 10:24pm

Hey,

I developed a PyTorch module that mimics the TimeDistributed wrapper of Keras a few days ago:

import torch.nn as nn


class TimeDistributed(nn.Module):
    def __init__(self, module, batch_first=False):
        super(TimeDistributed, self).__init__()
        self.module = module
        self.batch_first = batch_first

    def forward(self, x):

        if len(x.size()) <= 2:
            return self.module(x)

        # Squash samples and timesteps into a single axis
        x_reshape = x.contiguous().view(-1, x.size(-1))  # (samples * timesteps, input_size)

        y = self.module(x_reshape)

        # We have to reshape Y
        if self.batch_first:
            y = y.contiguous().view(x.size(0), -1, y.size(-1))  # (samples, timesteps, output_size)
        else:
            y = y.view(-1, x.size(1), y.size(-1))  # (timesteps, samples, output_size)

        return y

HONGYUAN_ZHU · March 29, 2017, 1:44pm

Wow, cool! That’s pretty awwwwwesome!!!

Jacky_Liu · March 25, 2018, 7:42am

Could you give me an example of how to use this function to build a time-distributed CNN + LSTM?

Several images would be processed by the CNN and then fed to the LSTM together.

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        #x = F.relu(self.fc1(x))
        #x = F.dropout(x, training=self.training)
        #x = self.fc2(x)
        #return F.log_softmax(x, dim=1)
        return x


class Combine(nn.Module):
    def __init__(self):
        super(Combine, self).__init__()
        self.cnn = CNN()
        self.rnn = nn.LSTM(320, 10, 2)

    def forward(self, x):
        x = self.cnn(x)
        x = self.rnn(x)
        return F.log_softmax(x, dim=1)

Mac_Yeh · July 18, 2018, 5:51am

@miguelvr
Thanks for sharing. I was planning to loop over the function, but your implementation reminds me that we are in an object-oriented environment. Thanks a lot!

miguelvr · July 18, 2018, 7:51am

In most cases, this function is no longer needed. The Linear layer (the PyTorch counterpart of Keras' Dense), for example, now supports 3-dimensional inputs.

Mac_Yeh · July 20, 2018, 9:20pm

@miguelvr you are right, right now the linear layer supports 3 dimensional inputs; thanks

kenfehling · September 12, 2018, 9:51pm

Is putting a Dense layer after an RNN the same as applying a Dense layer to each time step though? Like in the first case don’t the time steps connect and mix together?
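
One way to check this is to perturb a single time step and see whether the outputs at the other time steps change. The sketch below assumes a batch-first tensor of shape (samples, timesteps, features); the sizes and variable names are made up for illustration. nn.Linear acts on the last dimension only, so under these assumptions the time steps should not mix inside the layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not taken from the thread).
samples, timesteps, in_features, out_features = 4, 5, 8, 3

linear = nn.Linear(in_features, out_features)
x = torch.randn(samples, timesteps, in_features)

# nn.Linear is applied over the last dimension of a 3-D input.
y_direct = linear(x)

# Explicit per-time-step application, which is what a TimeDistributed wrapper does.
y_per_step = torch.stack([linear(x[:, t]) for t in range(timesteps)], dim=1)
same_as_per_step = torch.allclose(y_direct, y_per_step, atol=1e-6)

# Perturb only time step 2 and compare the outputs before and after.
x_perturbed = x.clone()
x_perturbed[:, 2] += 1.0
y_perturbed = linear(x_perturbed)

other_steps = [t for t in range(timesteps) if t != 2]
others_unchanged = torch.allclose(
    y_direct[:, other_steps], y_perturbed[:, other_steps], atol=1e-6
)
step_changed = not torch.allclose(y_direct[:, 2], y_perturbed[:, 2])

print(same_as_per_step, others_unchanged, step_changed)
```

Any mixing across time happens in the RNN that produced the sequence, not in the Linear layer placed after it.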

Kota_Mori · October 19, 2018, 1:47pm

@miguelvr Isn’t this still useful for other layers than Linear though? For example, the input tensor is of shape [sample, frame, image], like video, and you may want to apply a convnet module for each time frame. Please kindly correct me if I get this wrong.

miguelvr · October 19, 2018, 5:57pm

Yes definitely, it still can be useful for other cases

satheesh · March 13, 2019, 10:23pm

thanks. I was looking for the timedistributed equivalent in pytorch and found your code…

Thiyagu · May 13, 2019, 7:03am

Hi Miguelvr,

We have been using the TimeDistributed layer that you developed.
I declared it as follows:
1. Declare a linear layer, then pass it to the TimeDistributed layer in the module:
class CRNN(nn.Module):
    def __init__(self):
        super(CRNN, self).__init__()
        # 1D ConvNet for learning the spectral features
        self.conv1 = nn.Conv1d(in_channels=1, out_channels=128, kernel_size=(32,))
        self.bn1 = nn.BatchNorm1d(128)
        self.maxpool1 = nn.MaxPool1d(kernel_size=1, stride=97)
        self.dropout1 = nn.Dropout(0.3)
        # LSTM for learning the temporal aggregation
        self.lstm = nn.LSTM(input_size=128, hidden_size=128, num_layers=2, dropout=0.3)
        # Fully connected layer
        #self.fc3 = nn.Linear(128, 128)
        #self.bn3 = nn.BatchNorm1d(128)
        # Get posterior probability for target event class
        self.fc4 = nn.Linear(128, 1)
        # TimeDistributed is the wrapper class posted earlier in this thread
        self.timedist = TimeDistributed(self.fc4)

My question is about what I see when I print the weight parameters of the network. The layer wrapped by TimeDistributed is printed twice, as follows:

fc4.weight torch.Size([1, 128])
fc4.bias torch.Size([1])
timedist.module.weight torch.Size([1, 128])
timedist.module.bias torch.Size([1])

Is this correct, or is there a mistake in the implementation?

Thanks

miguelvr · May 13, 2019, 7:25am

Every nn.Linear object has a weight and a bias, so that’s correct
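
Assuming the listing above came from model.state_dict(), the two pairs of names most likely refer to one and the same nn.Linear object, because the layer is assigned both to self.fc4 and to the wrapper. The sketch below illustrates this with a hypothetical stand-in class named Wrapper, which only stores the module, and a hypothetical Net class that keeps just the two relevant attributes.

```python
import torch.nn as nn


class Wrapper(nn.Module):
    """Hypothetical stand-in for the TimeDistributed wrapper: it only stores the module."""

    def __init__(self, module):
        super().__init__()
        self.module = module

    def forward(self, x):
        return self.module(x)


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc4 = nn.Linear(128, 1)
        # The same nn.Linear object is registered a second time under the wrapper.
        self.timedist = Wrapper(self.fc4)


model = Net()

# state_dict() lists the layer under both names.
state_keys = list(model.state_dict().keys())

# Both names point at the very same parameter tensor.
shared = model.fc4.weight is model.timedist.module.weight

# parameters() skips duplicates, so the layer is counted once: 128 weights + 1 bias.
n_params = sum(p.numel() for p in model.parameters())

print(state_keys)
print(shared, n_params)
```

If the duplicate names are unwanted, one option is to create the layer only inside the wrapper, for example TimeDistributed(nn.Linear(128, 1)), without also assigning it to self.fc4.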

Thiyagu · May 14, 2019, 11:54am

Thank you for your reply

hash-ir · October 10, 2019, 1:20pm

Can you provide a small working example where this works? I have an input of the shape (samples, timesteps, channels, width, height). With your code, it combines all the dimensions except the last one which becomes input size as per your x_reshape. Then, it doesn’t work with any of the layers, giving a size mismatch error.
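
The wrapper posted above keeps only the last dimension, which suits layers such as nn.Linear but not convolutional layers. A possible variant, sketched here under the assumption of batch-first input, merges only the first two axes and leaves the remaining ones untouched. The class name TimeDistributedND and all sizes are hypothetical and chosen for illustration.

```python
import torch
import torch.nn as nn


class TimeDistributedND(nn.Module):
    """Hypothetical variant: merges only the first two axes of a batch-first input."""

    def __init__(self, module):
        super().__init__()
        self.module = module

    def forward(self, x):
        # x: (samples, timesteps, *feature_dims)
        samples, timesteps = x.shape[0], x.shape[1]
        # Merge samples and timesteps, keep the remaining dims intact.
        x_flat = x.reshape(samples * timesteps, *x.shape[2:])
        y = self.module(x_flat)
        # Split the merged axis back into (samples, timesteps).
        return y.reshape(samples, timesteps, *y.shape[1:])


# Illustrative sizes (assumptions): 2 clips of 3 RGB frames, 16x16 pixels each.
conv = TimeDistributedND(nn.Conv2d(3, 8, kernel_size=3, padding=1))
video = torch.randn(2, 3, 3, 16, 16)
features = conv(video)
print(features.shape)  # torch.Size([2, 3, 8, 16, 16])
```

This sketch assumes the wrapped module returns one output row per input row, which holds for typical convolutional, pooling, and linear layers.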

akashs · December 1, 2019, 5:09am

Thanks a lot for the nice explanation. I have a beginner's question: since batch samples and timesteps are squashed together, won’t that cause a problem for the LSTM's sequential learning? That is, when the sequence is reshaped back to (samples, timesteps, output_size), does it retain the same timestep ordering for each sample as before squashing?
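
The ordering question can be checked directly on a small tensor. The sketch below assumes a batch-first layout and uses made-up sizes. It fills the tensor with distinct values, squashes the first two axes the way the wrapper does, and then reshapes back.

```python
import torch

samples, timesteps, features = 2, 3, 4  # illustrative sizes (assumptions)

# Fill the tensor with distinct values so every element is identifiable.
x = torch.arange(samples * timesteps * features, dtype=torch.float32)
x = x.view(samples, timesteps, features)

# Squash samples and timesteps into one axis, as the wrapper does.
flat = x.contiguous().view(-1, features)

# Row s * timesteps + t of the flat tensor is sample s at time step t.
s, t = 1, 2
row_matches = torch.equal(flat[s * timesteps + t], x[s, t])

# Reshaping back restores the original ordering exactly.
restored = flat.view(samples, timesteps, features)
round_trip_ok = torch.equal(restored, x)

print(row_matches, round_trip_ok)
```

Because view only reinterprets the same row-major memory, the per-sample time order should survive the round trip as long as the wrapped module processes rows independently and returns them in the same order.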

barloccia · January 22, 2020, 4:27pm

Did you work out the structure of your network in PyTorch? I am facing exactly the same problem and wonder whether you could share the network code. I have to develop a CNN+LSTM network for video sequence classification.

IliasPap · January 23, 2020, 12:15pm

A 2D CNN expects 4D (batched) inputs, so you can pass the 5D tensor (batch, timesteps, channels, height, width) as a 4D tensor:

view(batch*timesteps,c,h,w)
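
Put together with the CNN posted earlier in the thread, the reshape could look roughly like the sketch below. It is one possible arrangement, not a reference implementation. The class names FrameCNN and CNNLSTM, the hidden size, the number of classes, the 28x28 single-channel frames, and the choice to classify from the last time step are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FrameCNN(nn.Module):
    """Per-frame feature extractor, following the CNN posted earlier in the thread."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        return x.flatten(1)  # (N, 320) for 28x28 inputs


class CNNLSTM(nn.Module):
    def __init__(self, num_classes=10, hidden_size=64):
        super().__init__()
        self.cnn = FrameCNN()
        self.rnn = nn.LSTM(320, hidden_size, num_layers=2, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        batch, timesteps, c, h, w = x.shape
        # Fold time into the batch axis so the 2D CNN sees a 4D tensor
        # (use reshape instead of view if x may be non-contiguous).
        frames = x.view(batch * timesteps, c, h, w)
        feats = self.cnn(frames)                  # (batch * timesteps, 320)
        feats = feats.view(batch, timesteps, -1)  # (batch, timesteps, 320)
        out, _ = self.rnn(feats)                  # (batch, timesteps, hidden_size)
        # Classify from the output at the last time step.
        return F.log_softmax(self.fc(out[:, -1]), dim=1)


model = CNNLSTM()
clips = torch.randn(4, 6, 1, 28, 28)  # (batch, timesteps, channels, height, width)
log_probs = model(clips)
print(log_probs.shape)  # torch.Size([4, 10])
```

Note that nn.LSTM returns a tuple of the output sequence and the final hidden and cell states, so the output has to be unpacked before it is passed to further layers.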