How to get the batch dimension right in the forward pass of a custom layer

KIC · May 7, 2020, 8:17am

I would like to implement a custom layer, but I can’t get the shapes right because of the batch dimension in the forward pass. I tried to follow the way nn.Linear is implemented as closely as possible. What am I missing here?

import numpy as np  # needed for the np.array call in the example below
import torch as t
import torch.nn as nn
from torch.nn import init


class Time2Vec(nn.Module):
    """
    source: https://towardsdatascience.com/time2vec-for-time-series-features-encoding-a03a4f3f937e
    and:    https://arxiv.org/pdf/1907.05321.pdf
    """

    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.output_dim = output_dim

        self.W = nn.Parameter(t.Tensor(output_dim, output_dim))
        self.B = nn.Parameter(t.Tensor(input_dim, output_dim))
        self.w = nn.Parameter(t.Tensor(1, 1))
        self.b = nn.Parameter(t.Tensor(input_dim, 1))
        self.reset_parameters()

    def reset_parameters(self):
        init.uniform_(self.W, 0, 1)
        init.uniform_(self.B, 0, 1)
        init.uniform_(self.w, 0, 1)
        init.uniform_(self.b, 0, 1)

    def forward(self, x):
        original = self.w * x + self.b
        x = t.repeat_interleave(x, self.output_dim, dim=-1)
        sin_trans = t.sin(t.dot(x, self.W) + self.B)
        return t.cat([sin_trans, original], -1)

And create a module:

class MyModule(nn.Module):

    def __init__(self, input, output):
        super().__init__()
        self.net = Time2Vec(input, output)

    def forward(self, x):
        return self.net(x)

t2v = MyModule(3, 3)
print(t2v(t.from_numpy(np.array([[[0.1], [0.2], [0.3]], [[0.1], [0.2], [0.3]]])).float()))

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 0

ptrblck · May 8, 2020, 3:40am

The first operation will use a parameter of shape [1, 1] and an input of [2, 3]:

self.w * x

So it seems that the feature dimension isn’t even matching.
Could you explain your use case a bit?
self.W has the shape [output_dim, output_dim], which also doesn’t consider the input feature dimension (but isn’t used at all).
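
A minimal sketch of how these shapes broadcast, assuming input_dim=3 and random stand-ins for the parameters (the variable names here are illustrative, not from the original post). As far as I can tell, the multiplication by the [1, 1] parameter broadcasts on its own, and it is adding the [3, 1] bias to a [2, 3] input that raises the error, while a [2, 3, 1] input goes through:

```python
import torch

w = torch.rand(1, 1)   # stand-in for self.w
b = torch.rand(3, 1)   # stand-in for self.b with input_dim=3

x2d = torch.rand(2, 3)      # [batch, steps]
x3d = torch.rand(2, 3, 1)   # [batch, steps, 1]

scaled = w * x2d            # [1, 1] * [2, 3] broadcasts to [2, 3]

try:
    scaled + b              # [2, 3] + [3, 1]: dimension 0 is 2 vs. 3
    error_2d = None
except RuntimeError as exc:
    error_2d = str(exc)

# With a trailing feature axis the bias lines up with the steps axis.
out3d = w * x3d + b         # [2, 3, 1] + [3, 1] -> [2, 3, 1]

print(error_2d)
print(out3d.shape)
```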

KIC · May 8, 2020, 5:12am

Thanks for your help!

Sorry, I had to fix the example; the input is actually a 3D tensor: print(t2v(t.from_numpy(np.array([[[0.1], [0.2], [0.3]], [[0.1], [0.2], [0.3]]])).float())). With that input, it should fail at the sine part: sin_trans = t.sin(t.dot(x, self.W) + self.B)

Sure, the use case is a time embedding. Let me quote the paper mentioned in the comment:

In designing a representation for time, we identify three important properties: 1- capturing both
periodic and non-periodic patterns, 2- being invariant to time rescaling, and 3- being simple enough
so it can be combined with many models. In what follows, we provide more detail on these properties.

We propose Time2Vec, a representation for time which has the three identified properties.
For a given scalar notion of time τ , Time2Vec of τ , denoted as t2v(τ ), is a vector of size k + 1
defined as follows:
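
The equation itself did not carry over into the post. As best it can be reconstructed from the paper linked in the docstring, the definition reads roughly as follows, where F is a periodic activation function (sine in the paper) and the ω_i and φ_i are learnable parameters:

```latex
\mathrm{t2v}(\tau)[i] =
\begin{cases}
  \omega_i \tau + \varphi_i, & \text{if } i = 0, \\
  \mathcal{F}(\omega_i \tau + \varphi_i), & \text{if } 1 \le i \le k.
\end{cases}
```

In the layer code, the i = 0 term corresponds to `original` and the remaining k terms to `sin_trans`.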

To match the shape [output_dim, output_dim] of W, we repeat the vector x output_dim times along the last axis. So we feed in a single vector and get back a 2D matrix. I have another implementation in Keras/TensorFlow, and there it works:

import tensorflow as tf


class tf_Time2Vec(tf.keras.layers.Layer):
    """
    source: https://towardsdatascience.com/time2vec-for-time-series-features-encoding-a03a4f3f937e
    and:    https://arxiv.org/pdf/1907.05321.pdf
    """

    def __init__(self, output_dim=None, **kwargs):
        self.output_dim = output_dim
        super().__init__(**kwargs)

    def build(self, input_shape):

        self.W = self.add_weight(name='W',
                                 shape=(self.output_dim,
                                        self.output_dim),
                                 initializer='uniform',
                                 trainable=True)

        self.B = self.add_weight(name='B',
                                 shape=(input_shape[1].value,
                                        self.output_dim),
                                 initializer='uniform',
                                 trainable=True)

        self.w = self.add_weight(name='w',
                                 shape=(1, 1),
                                 initializer='uniform',
                                 trainable=True)

        self.b = self.add_weight(name='b',
                                 shape=(input_shape[1].value, 1),
                                 initializer='uniform',
                                 trainable=True)

        super().build(input_shape)

    def call(self, x, **kwargs):
        K = tf.keras.backend

        original = self.w * x + self.b
        x = K.repeat_elements(x, self.output_dim, -1)
        sin_trans = K.sin(K.dot(x, self.W) + self.B)
        return K.concatenate([sin_trans, original], -1)

    def compute_output_shape(self, input_shape):
        return input_shape[0], input_shape[1], self.output_dim + 1

KIC · May 16, 2020, 1:03pm

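PyTorch's .dot function is different from the one in TensorFlow/Keras or NumPy: t.dot only accepts 1D tensors, whereas K.dot and np.dot also handle higher-dimensional inputs.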
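
For anyone landing here later, a minimal sketch of one way to make the layer run, assuming the input has shape [batch, input_dim, 1] as in the corrected example. The only functional change from the code in the first post is torch.matmul in place of t.dot; replacing t.Tensor with torch.empty is a stylistic choice of this sketch, not something confirmed in the thread:

```python
import torch
import torch.nn as nn


class Time2Vec(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.output_dim = output_dim
        self.W = nn.Parameter(torch.empty(output_dim, output_dim))
        self.B = nn.Parameter(torch.empty(input_dim, output_dim))
        self.w = nn.Parameter(torch.empty(1, 1))
        self.b = nn.Parameter(torch.empty(input_dim, 1))
        self.reset_parameters()

    def reset_parameters(self):
        for p in (self.W, self.B, self.w, self.b):
            nn.init.uniform_(p, 0, 1)

    def forward(self, x):
        # x: [batch, input_dim, 1]
        original = self.w * x + self.b                     # [batch, input_dim, 1]
        x = torch.repeat_interleave(x, self.output_dim, dim=-1)
        # matmul broadcasts over the batch axis:
        # [batch, input_dim, output_dim] @ [output_dim, output_dim]
        sin_trans = torch.sin(torch.matmul(x, self.W) + self.B)
        # Result: [batch, input_dim, output_dim + 1]
        return torch.cat([sin_trans, original], dim=-1)


x = torch.tensor([[[0.1], [0.2], [0.3]], [[0.1], [0.2], [0.3]]])
layer = Time2Vec(3, 3)
out = layer(x)
print(out.shape)
```

The last channel of the output is the linear term and the first output_dim channels are the sine terms, which matches the output shape the Keras version declares in compute_output_shape.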

MInh_Nguyen1 · June 27, 2021, 3:56pm

Hi,
I am trying to implement Time2Vec in PyTorch too, and I am running into the same problem. How exactly did you solve it? Thanks.