I do not want a fixed lambda in the Softshrink function; I want the network to learn lambda as a parameter. Do I need to write a new function? Can anybody help me with it? Thanks so much!
You have to write a new Softshrink function that takes lambda as a variable. Softshrink zeroes out the interval [-lambda, lambda] and shrinks everything outside it toward zero by lambda, so two masks are enough. Example:
import torch
from torch.autograd import Variable
import torch.nn as nn

def softshrink(x, lambd):
    # Masks for the two non-zero regions of softshrink
    mask1 = x > lambd
    mask2 = x < -lambd
    out = torch.zeros_like(x)
    out += mask1.float() * (x - lambd)  # x - lambd where x > lambd
    out += mask2.float() * (x + lambd)  # x + lambd where x < -lambd
    return out
x = Variable(torch.randn(2, 2, 2), requires_grad=True)
l = Variable(torch.Tensor([0.5]), requires_grad=True)
out = softshrink(x, l)
# do things to out
y = out.sum()
y.backward()
x.grad  # exists
l.grad  # also exists
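To actually learn lambda, hand l to an optimizer like any other leaf Variable. A minimal sketch reusing x, l, and softshrink from above (the SGD choice and learning rate are arbitrary):

optimizer = torch.optim.SGD([l], lr=0.1)
optimizer.zero_grad()
out = softshrink(x, l)
out.sum().backward()
optimizer.step()
print(l)  # no longer 0.5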
Thanks so much! I really appreciate it.
Am I doing what you taught me correctly? When I try model.state_dict(), there is still no l in it. I think I am doing something wrong. How can I fix it?
# Learnable softshrink func
def softshrink(x, lambd):
    mask1 = x > lambd
    mask2 = x < -lambd
    out = torch.zeros_like(x)
    out += mask1.float() * (x - lambd)
    out += mask2.float() * (x + lambd)
    return out
# A Neural Network
class LISTA(torch.nn.Module):
    def __init__(self, D_in, D_out):
        super(LISTA, self).__init__()
        self.We = torch.nn.Linear(D_in, D_out)
        self.S = torch.nn.Linear(D_out, D_out)
        self.l = Variable(torch.Tensor([0.5]), requires_grad=True)

    def forward(self, x):
        out = softshrink(self.We(x), self.l)
        for i in range(20):
            out = softshrink(self.We(x) + self.S(out), self.l)
        out = nn.functional.sigmoid(out)
        return out
I think the key is to use Parameter instead of Variable. Then l shows up in model.parameters() (though it doesn't seem to show up in model.state_dict(); I'm not sure whether it's supposed to).
>>> import torch
>>> import torch.nn as nn
>>> from torch.autograd import Variable
>>> from torch.nn import Parameter
>>>
>>> # Learnable softshrink func
... def softshrink(x, lambd):
...     mask1 = x > lambd
...     mask2 = x < -lambd
...     out = torch.zeros_like(x)
...     out += mask1.float() * (x - lambd)
...     out += mask2.float() * (x + lambd)
...     return out
...
>>> # A Neural Network
... class LISTA(torch.nn.Module):
...     def __init__(self, D_in, D_out):
...         super(LISTA, self).__init__()
...         self.We = torch.nn.Linear(D_in, D_out)
...         self.S = torch.nn.Linear(D_out, D_out)
...         self.l = Parameter(torch.Tensor([0.5]))
...     def forward(self, x):
...         out = softshrink(self.We(x), self.l)
...         for i in range(20):
...             out = softshrink(self.We(x) + self.S(out), self.l)
...         out = nn.functional.sigmoid(out)
...         return out
...
>>> model = LISTA(1, 1)
>>> list(model.parameters())
[Parameter containing:
0.5000
[torch.FloatTensor of size 1]
, Parameter containing:
0.6897
[torch.FloatTensor of size 1x1]
, Parameter containing:
-0.3733
[torch.FloatTensor of size 1]
, Parameter containing:
0.4282
[torch.FloatTensor of size 1x1]
, Parameter containing:
-0.5218
[torch.FloatTensor of size 1]
]
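Since l registers with the module, a standard training step updates it along with the weights. A sketch (Adam, the learning rate, and the batch shape are arbitrary choices):

>>> optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
>>> x = Variable(torch.randn(8, 1))
>>> loss = model(x).sum()
>>> loss.backward()
>>> optimizer.step()  # model.l is updated too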
That helps. I really appreciate it.
Just some follow-up: l does appear in model.state_dict() after all, and I can save the model with torch.save(model.state_dict(), PATH).
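For reference, a minimal save/load round trip, assuming the LISTA class above (the file name is just a placeholder):

model = LISTA(1, 1)
torch.save(model.state_dict(), 'lista.pth')

model2 = LISTA(1, 1)
model2.load_state_dict(torch.load('lista.pth'))
print('l' in model2.state_dict())  # True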
Why do you add a sigmoid output to your LISTA model?
Just a different task. In LISTA, we do not need this sigmoid layer.
I am currently working on a variant of LISTA. Can I ask what task you are working on with LISTA?
The masked version is not very memory efficient. I trained a LISTA using the following softshrink instead:

def softshrink(x, lambd):
    return nn.functional.relu(x - lambd) - nn.functional.relu(-x - lambd)

It saved me 40% memory.
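A quick numerical check that the two versions agree (the names softshrink_mask and softshrink_relu are mine, just to keep both in one script):

import torch
import torch.nn as nn

def softshrink_mask(x, lambd):
    mask1 = x > lambd
    mask2 = x < -lambd
    out = torch.zeros_like(x)
    out += mask1.float() * (x - lambd)
    out += mask2.float() * (x + lambd)
    return out

def softshrink_relu(x, lambd):
    return nn.functional.relu(x - lambd) - nn.functional.relu(-x - lambd)

x = torch.randn(4, 4)
l = torch.Tensor([0.5])
print((softshrink_mask(x, l) - softshrink_relu(x, l)).abs().max())  # ~0.0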