Customize an activation function

If I want to write a custom activation function that can be called as easily as the ones in torch.nn.functional, what should I do? Thanks.


What do you mean by “customize an activation function”?
And why do you need it to be called in torch.nn.functional?

I guess, “customize an activation function” means “how to implement some custom activation functions of his own”.

If you can write your activation function using Torch math operations, you don’t need to do anything else to “implement” it.

Let’s implement a truncated gaussian for example.

import torch

def truncated_gaussian(x, mean=0, std=1, min=0.1, max=0.9):
    # Gaussian bump centred at `mean`, built only from differentiable torch ops
    gauss = torch.exp((-(x - mean) ** 2) / (2 * std ** 2))
    return torch.clamp(gauss, min=min, max=max)  # truncate

I’m sorry, I did not express myself clearly. What I meant is: how can I use my own function in a net instead of the activation functions provided by the PyTorch framework? Which documentation can I refer to? Thanks.

One way to do it is to derive from nn.Module and implement the forward method. You don’t need to worry about backpropagation as long as you use autograd-compatible operations. For example, @fmassa’s truncated gaussian from above:

import torch
import torch.nn as nn
from torch.autograd import Variable

class MyActivationFunction(nn.Module):

    def __init__(self, mean=0, std=1, min=0.1, max=0.9):
        super(MyActivationFunction, self).__init__()
        self.mean = mean
        self.std = std
        self.min = min
        self.max = max

    def forward(self, x):
        gauss = torch.exp((-(x - self.mean) ** 2)/(2* self.std ** 2))
        return torch.clamp(gauss, min=self.min, max=self.max)

my_net = nn.Sequential(
    nn.Linear(7, 5),
    MyActivationFunction()
)

y = my_net(Variable(torch.rand(10, 7)))
y.backward(torch.rand(10, 5))

Note that it’s not necessary to derive from nn.Module to write your functions, and everything can be handled with normal python functions, and backpropagation will just work fine (if you pass in a Variable).

@fmassa Can you please give an example on that?
I tried to implement a simple exponential activation for my rnn

class my_rnn(nn.Module):
    def __init__(self, input_size=2, hidden_size=20, num_layers=3, output_size=1,
                 batch_size=1):  # default assumed; the original signature was cut off here
        super(my_rnn, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.output_size = output_size
        self.batch_size = batch_size
        self.rnn = nn.RNN(input_size=self.input_size, hidden_size=self.hidden_size, 
                          num_layers=self.num_layers, batch_first=True, nonlinearity=self.exp_activation)
        # The last layer is applied on the last output only of the RNN (not like
        # TimeDistributedDense in Keras)
        self.linear_layer = nn.Linear(self.hidden_size, self.output_size)
        self.hidden = Variable(torch.zeros((self.num_layers, self.batch_size, self.hidden_size))).cuda()
    def forward(self, input_sequence):
        out_rnn, self.hidden = self.rnn(input_sequence, self.hidden)
        in_linear = out_rnn[:, -1, :]
        final_output = self.linear_layer(in_linear)
        return final_output
    def init_hidden(self):
        self.hidden = Variable(torch.zeros((self.num_layers, self.batch_size, self.hidden_size))).cuda()

    def exp_activation(self, data):
        return torch.exp(data)

But it gives this error: ValueError: Unknown nonlinearity '<bound method my_rnn.exp_activation of my_rnn ()>'

nn.RNN supports only tanh or relu for nonlinearity [code]. One easy way to solve your problem is to write your own loop over the sequence just like in this example.
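As a rough sketch of what "your own loop over the sequence" could look like: the class below is a hypothetical single-layer, Elman-style RNN built from nn.Linear layers, with the nonlinearity passed in as an ordinary callable. The class name ExpRNN and its attribute names are made up for illustration, and it does not reproduce the multi-layer setup of the code above.

```python
import torch
import torch.nn as nn


class ExpRNN(nn.Module):
    # Hypothetical hand-written RNN that accepts any callable as its nonlinearity.
    def __init__(self, input_size=2, hidden_size=20, output_size=1, activation=torch.exp):
        super().__init__()
        self.hidden_size = hidden_size
        self.activation = activation
        self.input_to_hidden = nn.Linear(input_size, hidden_size)
        self.hidden_to_hidden = nn.Linear(hidden_size, hidden_size)
        self.linear_layer = nn.Linear(hidden_size, output_size)

    def forward(self, input_sequence):
        # input_sequence has shape (batch, seq_len, input_size), i.e. batch_first layout.
        batch_size, seq_len, _ = input_sequence.shape
        hidden = input_sequence.new_zeros(batch_size, self.hidden_size)
        for t in range(seq_len):
            pre_activation = (self.input_to_hidden(input_sequence[:, t, :])
                              + self.hidden_to_hidden(hidden))
            hidden = self.activation(pre_activation)
        # Only the last hidden state feeds the output layer.
        return self.linear_layer(hidden)


torch.manual_seed(0)
model = ExpRNN()
prediction = model(torch.rand(4, 3, 2))  # batch of 4, sequence length 3
prediction.sum().backward()
```

Because torch.exp is unbounded, the hidden state can grow very quickly from step to step, so longer sequences may overflow unless the weights or inputs are kept small.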


Maybe you are trying to call a method that is not a @staticmethod?
Anyway, if your exp_activation is defined outside of the class, you can use it normally.

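import torch
from torch.autograd import Variable

def exp_activation_square(x):
    return torch.exp(x) ** 2

x = Variable(torch.rand(3), requires_grad=True)
y = Variable(torch.rand(3), requires_grad=True)

z = exp_activation_square(x * y).sum()
z.backward()  # gradients are now available in x.grad and y.grad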

Ah, @Tudor_Berariu is right, I didn’t see all your code so I didn’t get where the problem was.
Yes, if you want to try another non-linearity which is not currently supported in nn.RNN, you need to write your own for loop.
It wouldn’t be hard to add support for it in nn, but I’m not sure it’s a very common use case.


Ah okay, I see your point now. Thank you @fmassa and @Tudor_Berariu

Thank you, everyone! I will try it.


Hello, I have a question about implementing an activation function. How can we implement our own activation function that needs a parameter? I want to build something like a thresholding function where the threshold is learned during training. This is similar to PReLU, but here I have a custom additional operation.


Similar to PReLU you could implement your own activation function writing a custom nn.Module (just like writing your model).
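A small sketch of that idea, assuming the "threshold" can be expressed as a shifted ReLU so that the threshold itself receives a gradient. The class name LearnableThreshold and the initial value 0.5 are hypothetical; any custom additional operation would go inside forward.

```python
import torch
import torch.nn as nn


class LearnableThreshold(nn.Module):
    # Hypothetical activation: outputs x - threshold where x exceeds the threshold, else 0.
    def __init__(self, init_threshold=0.5):
        super().__init__()
        # nn.Parameter registers the tensor, so optimizers pick it up via .parameters().
        self.threshold = nn.Parameter(torch.tensor(float(init_threshold)))

    def forward(self, x):
        return torch.relu(x - self.threshold)


act = LearnableThreshold()
x = torch.tensor([[0.2, 0.7, 1.5]])
y = act(x)
y.sum().backward()  # fills act.threshold.grad
```

A hard step such as (x > threshold).float() would give the threshold no gradient at all, which is why this sketch uses a form that is differentiable with respect to the threshold.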


You can write a customized activation function like the one below (e.g. a weighted Tanh):

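class weightedTanh(nn.Module):
    def __init__(self, weights=1):
        super(weightedTanh, self).__init__()  # must run before the module is used
        self.weights = weights
    def forward(self, input):
        # tanh(w*x) equals (exp(2wx) - 1) / (exp(2wx) + 1) but does not overflow for large inputs
        return torch.tanh(self.weights * input)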

Hi @ptrblck, thank you for your reply. Could you give a simple example? I have tried several things, like in this discussion, but they seem to have failed.

Hi Ptrblck,

I wrote a p-Sigmoid with 3 learnable parameters (Alpha, Beta, and Sigma). Could you please tell me whether this implementation is correct?

class SigmoiidLearn(nn.Module):
    def __init__(self):
        super(SigmoiidLearn, self).__init__()
        self.Alpha = nn.Parameter(torch.ones(1), requires_grad=True)
        self.Beta = nn.Parameter(torch.zeros(1),requires_grad=True)   
        self.Sigma = nn.Parameter(torch.ones(1),requires_grad=True)

    def forward(self,input):
        for ii in range(input.shape[0]):
            for ii1 in range(input.shape[1]):
                for ii2 in range(input.shape[2]):
                    for ii3 in range(input.shape[3]):


        return Out
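The element-wise formula inside the loops above did not survive, so the following is only a sketch under an assumed form, alpha * sigmoid(sigma * (x - beta)), chosen because it reduces to a plain sigmoid at the initial values 1, 0 and 1. The class name ParametricSigmoid is hypothetical. The point it illustrates is that the four nested loops are unnecessary: parameters of shape (1,) broadcast over a 4-D input.

```python
import torch
import torch.nn as nn


class ParametricSigmoid(nn.Module):
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))   # output scale
        self.beta = nn.Parameter(torch.zeros(1))   # input shift
        self.sigma = nn.Parameter(torch.ones(1))   # input slope

    def forward(self, x):
        # Assumed formula; broadcasting applies it to every element at once.
        return self.alpha * torch.sigmoid(self.sigma * (x - self.beta))


torch.manual_seed(0)
act = ParametricSigmoid()
x = torch.randn(2, 3, 4, 4)
out = act(x)
out.mean().backward()  # all three parameters receive gradients
```

Besides being much faster than Python loops, the vectorized form keeps every operation inside autograd, so nothing has to be written for the backward pass.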

and the network that calls it:

class Net(nn.Module):
    def __init__(self,ngpu,nz,ngf):
        super(Net, self).__init__()
        self.l1= nn.Sequential( nn.ConvTranspose2d(, self.ngf * 8, 3, 1, 0, bias=False),
            nn.BatchNorm2d(self.ngf * 8),

    def forward(self, input1):
        x = self.l1(input1)
        x =  self.l2(x)

        return x