Learnable scalars

Hi, I am looking to map inputs by learnable scalars. Is this correct?

for batch_idx, (inputs, targets) in enumerate(trainloader):
    if use_cuda:
        inputs, targets = inputs.cuda(), targets.cuda()

    shape = torch.Size((batch_size, 3, 32, 32))
    scalar = torch.cuda.FloatTensor(shape)
    M1 = torch.rand(shape, out=scalar)

    inputs_new = inputs * M1
          

Could you explain your use case a bit more?
It seems there are no trainable parameters in your current code.
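
A quick way to see this with your snippet (the batch size of 4 is arbitrary):

import torch

shape = torch.Size((4, 3, 32, 32))
M1 = torch.rand(shape)
# M1 is recreated every iteration and never registered with an
# optimizer, so nothing about it can be learned:
print(M1.requires_grad)  # False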

Thank you for your reply, @ptrblck.
I want to do some mapping/transferring over the inputs. I saw this https://discuss.pytorch.org/t/multiply-feature-map-by-a-learnable-scalar/1520, which suggests using a Variable with requires_grad? I am not sure how to apply that.

You should make that scalar an nn.Parameter.
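
For example (a minimal sketch; the shape is just illustrative):

import torch
import torch.nn as nn

# Wrapping a tensor in nn.Parameter marks it as trainable and
# registers it with the module it is assigned to:
scalar = nn.Parameter(torch.rand(3, 32, 32))
print(scalar.requires_grad)  # True -- the optimizer can update it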

1 Like

Thank you for your reply, @G.M.
So, in the model I should add

class Model(nn.Module):
    def __init__(self, batch_size):
        super(Model, self).__init__()
        shape = torch.Size((batch_size, 3, 32, 32))
        scalar = torch.FloatTensor(shape)
        self.multp = nn.Parameter(torch.randn(shape, out=scalar))

and in the training function I should add

def train(epoch):
    for batch_idx, (inputs, targets) in enumerate(trainloader):
        if use_cuda:
            inputs, targets = inputs.cuda(), targets.cuda()

        M1 = net.parameters()
        inputs_new = inputs * M1
        outputs = net(inputs_new)

If this is correct, how can I give batch_size to the network in every epoch? I am defining net = models.__dict__[args.model](num_classes) before the training function, and I am taking batch_size as batch_size = inputs.size(0). Should I bring net inside the training function?

  1. No, you should use the parameter in your model code: scalar = nn.Parameter(torch.randn(shape)).
  2. PyTorch supports broadcasting, so a (B, C, H, W) input can be multiplied elementwise by a (C, H, W) parameter; no batch dimension is needed. See the sketch below.
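
Putting the two points together, something like this (the nn.Linear head is just a placeholder so the sketch runs end to end):

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, num_classes=10):
        super(Model, self).__init__()
        # One learnable scale per input element; the batch dimension
        # is handled by broadcasting at multiplication time.
        self.multp = nn.Parameter(torch.randn(3, 32, 32))
        self.fc = nn.Linear(3 * 32 * 32, num_classes)

    def forward(self, x):
        x = x * self.multp  # (B, 3, 32, 32) * (3, 32, 32) -> (B, 3, 32, 32)
        return self.fc(x.flatten(1))

net = Model()
out = net(torch.randn(8, 3, 32, 32))  # any batch size works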
1 Like

Thank you, @G.M. How should I apply this in the optimizer?
The current one is this:

optimizer = torch.optim.Adam(net.parameters(), lr=args.lr, betas=(0.9, 0.999), weight_decay=args.decay)

Include that scalar in your model, i.e. make it an attribute of your model in the constructor.

You mean just adding

self.scalar = nn.Parameter(torch.randn(3, 32, 32))

in the model?

Yeah something like this.
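
Once it is a module attribute, net.parameters() already includes it, so your existing Adam line trains it without changes. A quick check (a sketch; the model is stripped down to the parameter itself):

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.scalar = nn.Parameter(torch.randn(3, 32, 32))

net = Model()
print([name for name, _ in net.named_parameters()])  # ['scalar']

# The existing optimizer line therefore updates it as well:
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)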

If I want to draw random numbers from a uniform distribution over [0, 3), is using

torch.empty(3, 32, 32).uniform_(0, 3)

correct?

And from a Gaussian distribution with mean 0 and std 9:

torch.normal(0, 9, size=(3, 32, 32))

I mean, will these do the same as torch.rand and torch.randn, just with different ranges/parameters?

How should I define the mean and std for torch.normal?

 self.m = nn.Parameter(torch.normal(torch.tensor(0), torch.tensor(9), size=(3,32,32)))
TypeError: normal() received an invalid combination of arguments - got (Tensor, Tensor, size=tuple), but expected one of:
 * (Tensor mean, Tensor std, torch.Generator generator, Tensor out)
 * (Tensor mean, float std, torch.Generator generator, Tensor out)
 * (float mean, Tensor std, torch.Generator generator, Tensor out)

I guess you would have to use torch.empty(3, 32, 32).normal_(0, 9) (note the trailing underscore for the in-place version), or torch.normal(torch.zeros(3, 32, 32), 9.), which matches the (Tensor mean, float std) overload from your error message.
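
Either one can then be wrapped as a parameter, e.g. (a sketch; shapes as in the thread):

import torch
import torch.nn as nn

# Uniform over [0, 3):
m_uniform = nn.Parameter(torch.empty(3, 32, 32).uniform_(0, 3))

# Gaussian with mean 0 and std 9:
m_normal = nn.Parameter(torch.empty(3, 32, 32).normal_(0, 9))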

1 Like