How to define a parametrized tanh activation function with 2 trainable parameters?

Hi, I want to define an activation function with 2 trainable parameters, k and c, which define the function.
I want my neural net to calibrate those parameters as well during the training procedure. Do you have an idea of how I can manage that in a few lines? I am really new to PyTorch.

Here is my code so far, with fixed values of k and c as you can see…

import torch
import torch.nn as nn

def transpose_conv(in_channels,out_channels,padding):
    return nn.Sequential(
           nn.ConvTranspose3d(in_channels, out_channels, 4, stride=2, padding=padding, bias=False), 
           nn.BatchNorm3d(out_channels,affine=True),
           nn.ReLU(inplace=True)
           )  
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.transpose_conv_1 = transpose_conv(hnnc,512,2)
        self.transpose_conv_2 = transpose_conv(512,256,2)
        self.transpose_conv_3 = transpose_conv(256,128,2)
        self.transpose_conv_4 = transpose_conv(128,64,2)
        self.transpose_conv_5 = transpose_conv(64,nc,3)
    def forward(self, x):
        c, k = 0.5, 0.25
        t_c_1  = self.transpose_conv_1(x)
        t_c_2  = self.transpose_conv_2(t_c_1)
        t_c_3  = self.transpose_conv_3(t_c_2)
        t_c_4  = self.transpose_conv_4(t_c_3)
        t_c_5  = self.transpose_conv_5(t_c_4)
        result = 0.5*(torch.tanh((1/k)*(t_c_5-c))+1)
        return result

Thank you in advance

You can create a custom nn.Module and define both parameters as trainable. Something like this could work:

class MyActivation(nn.Module):
    def __init__(self):
        super().__init__()
        self.c = nn.Parameter(torch.tensor(0.5))
        self.k = nn.Parameter(torch.tensor(0.25))
        
    def forward(self, x):
        return 0.5*(torch.tanh((1/self.k)*(x-self.c))+1)

Of course, you could also randomly initialize both parameters or pass the initial values as arguments to MyActivation, etc.
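For instance, here is a minimal sketch of the latter variant (the argument names are just illustrative):

class MyActivation(nn.Module):
    def __init__(self, c_init=0.5, k_init=0.25):
        super().__init__()
        # both parameters start from the passed values and are trainable
        self.c = nn.Parameter(torch.tensor(c_init))
        self.k = nn.Parameter(torch.tensor(k_init))

    def forward(self, x):
        return 0.5*(torch.tanh((1/self.k)*(x-self.c))+1)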

Hi @ptrblck ,

Thank you for your comment and your code! I tried this:

# defining a trainable tanh function
class Trainable_Tanh(nn.Module):
    def __init__(self):
        super().__init__()
        self.c = nn.Parameter(torch.tensor(0.5))
        self.k = nn.Parameter(torch.tensor(0.25))   
    def forward(self, x):
        return 0.5*(torch.tanh((1/self.k)*(x-self.c))+1)

# defining the Generator
def transpose_conv_G(in_channels,out_channels,padding):
    return nn.Sequential(
           nn.ConvTranspose3d(in_channels, out_channels, 4, stride=2, padding=padding, bias=False), 
           nn.BatchNorm3d(out_channels,affine=True),
           nn.ReLU(inplace=True)
           )  
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.transpose_conv_1 = transpose_conv_G(hyper_nb_noise_channels,512,2)
        self.transpose_conv_2 = transpose_conv_G(512,256,2)
        self.transpose_conv_3 = transpose_conv_G(256,128,2)
        self.transpose_conv_4 = transpose_conv_G(128,64,2)
        self.transpose_conv_5 = transpose_conv_G(64,nc,3)
    def forward(self, x):
        c,k=0.5,0.25  # note: leftover fixed values, no longer used below
        t_c_1  = self.transpose_conv_1(x)
        t_c_2  = self.transpose_conv_2(t_c_1)
        t_c_3  = self.transpose_conv_3(t_c_2)
        t_c_4  = self.transpose_conv_4(t_c_3)
        t_c_5  = self.transpose_conv_5(t_c_4)
        # calling the trainable tanh
        myTanh = Trainable_Tanh()
        result = myTanh(t_c_5)
        return result

I have 3 last questions, please:

  • is this the right way to call the newly created activation?
  • am I supposed to call “myTanh.zero_grad()” or “myTanh.eval()” somewhere, or is that already taken into account when I do it on the entire net: netG = Generator().to(device); netG.zero_grad()?
  • when I print a torch summary of my final network, I don’t see 2 more trainable parameters; I have exactly the same number of trainable parameters as before. That is strange, because I should have introduced 2 additional parameters. Could you please tell me where I am wrong?

No, since you are re-initializing the trainable activation module in each forward pass (so it won’t actually be trained at all).
Treat this module as any other layer, initialize it in your Generator.__init__ method, and use it in the forward:

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.myTanh = Trainable_Tanh()
        ...

    def forward(self, x):
        ...
        result = self.myTanh(t_c_5)
        ...

If you register Trainable_Tanh properly in the Generator.__init__ method, its gradients will be zeroed out via the parent module, and its training flag will also be switched by the parent. As mentioned before: just treat it as any other layer, such as nn.Linear.
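For illustration, a minimal sketch (using nn.Sequential as a stand-in parent) showing that both calls propagate to the registered submodule:

model = nn.Sequential(nn.Linear(4, 4), Trainable_Tanh())
model.eval()               # also sets model[1].training to False
print(model[1].training)   # False
model.train()
model.zero_grad()          # also clears the gradients of model[1].c and model[1].k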

I guess it’s because you are never registering it, as mentioned in the previous points.
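A quick way to check this directly, without a summary tool (a minimal sketch, assuming the Generator from your post):

netG = Generator()
# the two extra parameters only show up here if the module is registered
print([name for name, _ in netG.named_parameters() if 'myTanh' in name])
print(sum(p.numel() for p in netG.parameters() if p.requires_grad))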

Thanks @ptrblck

I added the trainable tanh like any other layer, but I still don’t see its parameters in my torch summary.

Did I miss something?

# defining a trainable tanh function
class Trainable_Tanh(nn.Module):
    def __init__(self):
        super().__init__()
        self.c = nn.Parameter(torch.tensor(0.5))
        self.k = nn.Parameter(torch.tensor(0.25))
    def forward(self, x):
        return 0.5*(torch.tanh((1/self.k)*(x-self.c))+1)

# defining the Generator
def transpose_conv_G(in_channels,out_channels,padding):
    return nn.Sequential(
           nn.ConvTranspose3d(in_channels, out_channels, 4, stride=2, padding=padding, bias=False), 
           nn.BatchNorm3d(out_channels,affine=True),
           nn.ReLU(inplace=True)
           )  
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.transpose_conv_1 = transpose_conv_G(hyper_nb_noise_channels,512,2)
        self.transpose_conv_2 = transpose_conv_G(512,256,2)
        self.transpose_conv_3 = transpose_conv_G(256,128,2)
        self.transpose_conv_4 = transpose_conv_G(128,64,2)
        self.transpose_conv_5 = transpose_conv_G(64,nc,3)
        self.myTanh           = Trainable_Tanh()
    def forward(self, x):
        t_c_1  = self.transpose_conv_1(x)
        t_c_2  = self.transpose_conv_2(t_c_1)
        t_c_3  = self.transpose_conv_3(t_c_2)
        t_c_4  = self.transpose_conv_4(t_c_3)
        t_c_5  = self.transpose_conv_5(t_c_4)
        result = self.myTanh(t_c_5)
        return result

Here is the torch summary, and I see that the tanh layer doesn’t have the 2 additional trainable parameters… Do you have an idea of how to fix my code, please?

Your implementation looks correct and I also see the parameters:


hyper_nb_noise_channels = 1
nc = 1
model = Generator()
print(dict(model.named_parameters()))
...
'myTanh.c': Parameter containing:
tensor(0.5000, requires_grad=True), 'myTanh.k': Parameter containing:
tensor(0.2500, requires_grad=True)}

from torchinfo import summary
summary(model)
=================================================================
Layer (type:depth-idx)                   Param #
=================================================================
Generator                                --
├─Sequential: 1-1                        --
│    └─ConvTranspose3d: 2-1              32,768
│    └─BatchNorm3d: 2-2                  1,024
│    └─ReLU: 2-3                         --
├─Sequential: 1-2                        --
│    └─ConvTranspose3d: 2-4              8,388,608
│    └─BatchNorm3d: 2-5                  512
│    └─ReLU: 2-6                         --
├─Sequential: 1-3                        --
│    └─ConvTranspose3d: 2-7              2,097,152
│    └─BatchNorm3d: 2-8                  256
│    └─ReLU: 2-9                         --
├─Sequential: 1-4                        --
│    └─ConvTranspose3d: 2-10             524,288
│    └─BatchNorm3d: 2-11                 128
│    └─ReLU: 2-12                        --
├─Sequential: 1-5                        --
│    └─ConvTranspose3d: 2-13             4,096
│    └─BatchNorm3d: 2-14                 2
│    └─ReLU: 2-15                        --
├─Trainable_Tanh: 1-6                    2
=================================================================
Total params: 11,048,836
Trainable params: 11,048,836
Non-trainable params: 0
=================================================================
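If you also want to confirm that c and k actually move during training, a single optimizer step is enough (a minimal sanity check, assuming the definitions above; the input shape is just an example):

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(1, hyper_nb_noise_channels, 4, 4, 4)  # example noise input
loss = model(x).mean()  # dummy loss
loss.backward()
optimizer.step()
print(model.myTanh.c, model.myTanh.k)  # should now differ from 0.5 and 0.25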