Code conventions inside Function forward(), backward(), and the legacy implementation

What is the rationale behind PyFunction::legacy_apply, PyFunction::apply, and the surrounding framework? I cannot get my head around this.

It seems that legacy functions unwrap their input th.autograd.Variables, operate on the raw th.XXTensor objects, and then wrap their outputs back into th.autograd.Variables.

The _legacy flag seems to be set when forward() and backward() are not static methods.

But the non-legacy version seems to expect Tensors, which are then wrapped into Variables, and apply is called on those wrapped inputs. Yet it does NOT wrap the outputs of apply (PyFunction::apply)?

It seems logical to me that if I call custom C code inside a Function, I want to work on tensors rather than Variables, so the legacy code path makes more sense to me.

Why was this changed and what is good practice with custom Functions?
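
For reference, this is my current understanding of the two styles, boiled down to a toy example (my own sketch from reading the docs, not code from my project, so please correct me if it is wrong):

import torch as th
from torch.autograd import Function

class Scale(Function):
    # toy Function that multiplies its input by a constant factor

    # legacy style would be instance methods, roughly:
    #     def forward(self, input): ...    # called as Scale()(input)

    # new style, as I understand it: static methods that take a ctx object
    @staticmethod
    def forward(ctx, input, factor):
        ctx.factor = factor              # non-tensor state stored on ctx
        return input * factor

    @staticmethod
    def backward(ctx, grad_output):
        # one return value per forward() input; None for non-differentiable args
        return grad_output * ctx.factor, None

# presumably called through apply rather than by instantiating the Function:
# y = Scale.apply(x, 2.0)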

An example:
This is the function CropAtPositions.py:

import torch as th
from torch.nn.parameter import Parameter
import numpy as np
from torch.nn import Module
from torch.nn import functional as F
from torch.autograd import Function

class CropAtPositions(Function):

#    @staticmethod DO I NEED THIS HERE?
    def forward(ctx, input, positions, size):
        # am i working with tensors now or with Variables ?
        ctx.input_size = input.size()
        ctx.size = size
        
        print('CropAtPositions.forward')
        s = list((positions.size()[0],) + ctx.input_size)
        s[input.ndimension()] = ctx.size[1]
        s[input.ndimension()-1] = ctx.size[0]
        S = th.Size(s)

        out = input.data.new().resize_(*S)
        
        for i,pos in enumerate(positions.int()):
#            print pos[0], pos[0]+self.size[0], pos[1],pos[1]+self.size[1]
            out[i].copy_(input.data[...,pos[0]:pos[0]+size[0],pos[1]:pos[1]+size[1]])
        print(out.size())
        return out

#    @staticmethod
    def backward(ctx, grad_output, positions):

        grad_input = grad_output.new().resize_(*ctx.input_size)
        grad_input.zero_()

        for i, pos in enumerate(positions):
            grad_input[...,pos[0]:pos[0]+ctx.size[0],pos[1]:pos[1]+ctx.size[1]].add_(grad_output[i])

        return grad_input, None, None
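
As far as I can tell, a static backward() only receives one gradient per forward() output, so positions could not arrive as a backward() argument the way I wrote it above; they would have to be stashed in forward(), presumably via ctx.save_for_backward / ctx.saved_tensors (that mechanism is my assumption, and I know the exact attribute name has changed between versions). This is how I would rewrite the Function if the static style is the intended one — untested, just my guess:

import torch as th
from torch.autograd import Function

class CropAtPositions(Function):

    @staticmethod
    def forward(ctx, input, positions, size):
        # tensors needed in backward() go through save_for_backward,
        # plain Python values can simply be stored on ctx (if I read the docs right)
        ctx.save_for_backward(positions)
        ctx.input_size = input.size()
        ctx.size = size

        # output shape: (num_positions, ..., size[0], size[1])
        s = list((positions.size(0),) + ctx.input_size)
        s[-1] = size[1]
        s[-2] = size[0]
        out = input.new(*s)
        for i, pos in enumerate(positions.int()):
            out[i].copy_(input[..., pos[0]:pos[0]+size[0], pos[1]:pos[1]+size[1]])
        return out

    @staticmethod
    def backward(ctx, grad_output):
        positions, = ctx.saved_tensors
        grad_input = grad_output.new(*ctx.input_size).zero_()
        for i, pos in enumerate(positions.int()):
            grad_input[..., pos[0]:pos[0]+ctx.size[0], pos[1]:pos[1]+ctx.size[1]].add_(grad_output[i])
        return grad_input, None, None   # one entry per forward() input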

Then the module CropAtPositions.py:

import torch as th
from torch.nn.parameter import Parameter

from torch.nn import Module
from skpr.nn import functional as F

class CropAtPositions(Module):
    def __init__(self, positions, size):
        super(CropAtPositions, self).__init__()
        self.pos = positions
        self.size = size

    def forward(self, input):
        print('CropAtPositions.forward')
        return F.crop_at_positions(input,self.pos,self.size)

Then the wrapper functional.py:

from ._functions.Broadcast import Broadcast
from ._functions.CMul import CMul
from ._functions.CropAtPositions import CropAtPositions
from ._functions.FarFieldPoissonLikelihood import FarFieldPoissonLikelihood

def broadcast(x, ntimes):
    return Broadcast().forward(x, ntimes)

def cmul(x,y):
    return CMul().forward(x,y)

def crop_at_positions(input, positions, size):
    return CropAtPositions().forward(input, positions, size)
    
def farfield_poisson_likelihood(input, target, valid_mask):
    return FarFieldPoissonLikelihood().forward(input, target, valid_mask)
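
If Function.apply is the intended entry point, I guess these wrappers (with the same imports as above) would instead look like this — again my assumption, please correct me:

def broadcast(x, ntimes):
    return Broadcast.apply(x, ntimes)

def cmul(x, y):
    return CMul.apply(x, y)

def crop_at_positions(input, positions, size):
    return CropAtPositions.apply(input, positions, size)

def farfield_poisson_likelihood(input, target, valid_mask):
    return FarFieldPoissonLikelihood.apply(input, target, valid_mask)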

Please point out mistakes. A few questions:

  1. Should I assume tensors or Variables when I am writing code inside forward() and backward()?

  2. What is the recommended way of calling a Function? Is it Function().forward(x, y, z), Function.apply(x, y, z), or something else?

  3. Should I return tensors or Variables from the forward() and backward() calls?

  4. Am I allowed to have an __init__ method if I use the @staticmethod forward and backward? If yes, how do I make sure it is called?

Thanks for the help!