Binary/Piecewise activation function

Hello, how can I create a custom activation function like the binary step, for example:
binary = lambda x: np.where(x>=0, 1, 0) ?

I tried activation = lambda x: torch.where(x < 0.5, 1., 0.) and I ran into this error:

RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_615/1226976390.py in <module>
     69 x_colloc_tens = torch.tensor(x_colloc,requires_grad=True).float().to(device)
     70 t_colloc_tens = torch.tensor(t_colloc,requires_grad=True).float().to(device)
---> 71 f_out = f(x_colloc_tens,t_colloc_tens)
     72 zeros_tens = torch.tensor(np.zeros((taille_dataset,1)),requires_grad=True).float().to(device)

/tmp/ipykernel_615/828930035.py in f(x, t)
      8 u = modele(entry_data)
----> 9 u_x = torch.autograd.grad(u, x, torch.ones_like(u),retain_graph=True,create_graph=True)[0]
     10 u_t = torch.autograd.grad(u, t, torch.ones_like(u),retain_graph=True,create_graph=True)[0]

~/.conda/envs/default/lib/python3.9/site-packages/torch/autograd/__init__.py in grad(outputs, inputs, grad_outputs, retain_graph, create_graph, only_inputs, allow_unused)
    232         retain_graph = create_graph
    233
--> 234     return Variable._execution_engine.run_backward(
    235         outputs, grad_outputs_, retain_graph, create_graph,
    236         inputs, allow_unused, accumulate_grad=False)

RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

@ily83
We can write it as an nn.Module and use it like this:

import torch
import torch.nn as nn

class custom_activation(nn.Module):
    def __init__(self):
        super(custom_activation, self).__init__()

    def forward(self, x):
        # Binary step: 1 where x >= 0, 0 elsewhere.
        # Note: this modifies x in place.
        x[x >= 0] = 1
        x[x < 0] = 0
        return x

class random_model(nn.Module):
    def __init__(self, num_layers):
        super(random_model, self).__init__()
        self.layer1 = nn.Linear(100, 20)
        self.layer2 = nn.Linear(20, 10)
        self.layer3 = nn.Linear(10, 1)
        self.custom = custom_activation()
        self.sigm = nn.Sigmoid()

    def forward(self, x):
        x = self.custom(self.layer1(x))
        x = self.layer2(x)
        x = self.custom(x)
        x = self.layer3(x)
        x = self.sigm(x)
        return x
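
For reference, a quick usage sketch (the batch size and the num_layers value are arbitrary choices here; num_layers is not actually used by the model as written):

model = random_model(num_layers=3)
inp = torch.randn(8, 100)   # batch of 8 samples, 100 features to match layer1
out = model(inp)
print(out.shape)            # torch.Size([8, 1])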

Hi Bou!

Please note that such an activation is almost certainly not what you want
because it is not usefully differentiable. That is, its derivative is zero
everywhere (except at x = 0.5 where, technically, its derivative is not
defined).

So even if you write a version that supports PyTorch's autograd automatic
differentiation, such as the approach that Anant suggested, any gradients
you try to backpropagate through your custom activation function will
become zero.
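
To illustrate, here is a minimal sketch (using the in-place step activation from the post above): the gradient that reaches the preceding linear layer comes out identically zero.

import torch
import torch.nn as nn

lin = nn.Linear(4, 4)
z = lin(torch.randn(2, 4))   # non-leaf output, connected to the layer's parameters
z[z >= 0] = 1.0              # hard binary step, applied in place
z[z < 0] = 0.0
z.sum().backward()
print(lin.weight.grad)       # all zeros: the step blocks any useful gradient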

(If you want to backpropagate through a step-like function, you would
typically use a “soft” step function such as sigmoid().)
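
For example, here is a sketch of such a soft step (the steepness k and the 0.5 threshold are just illustrative choices):

import torch

def soft_step(x, k=20.0, threshold=0.5):
    # Smooth stand-in for the hard step: saturates towards 0/1 away from
    # the threshold, but has a nonzero gradient everywhere.
    return torch.sigmoid(k * (x - threshold))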

Best.

K. Frank
