Half precision: ignoring a buffer

Hi,

I’m trying to switch a model to .half(). However, I defined a buffer that keeps track of the number of activations for a convolution layer that is used repeatedly in the network. Since the number of activations grows very large, this particular buffer overflows in half precision.
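
To put numbers on it: float16 tops out at 65504, so as soon as the count passes that it turns into inf:

import torch

count = torch.tensor([0.]).half()
count += 70000   # anything above 65504 overflows in float16
print(count)     # tensor([inf], dtype=torch.float16)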

The code looks like this:

import torch
import torch.nn as nn

class convmodule(nn.Module):
    def __init__(self):
        super().__init__()   # needed before register_buffer
        self.register_buffer('fan_in', torch.Tensor([0]))
        (...)

    def update_fan_in(self, qty):
        self.fan_in += qty

# The forward hook
def count_act(self, input, output):
    """Hook to update the size of feature maps going through convmodule."""
    with torch.no_grad():
        # only count on CPU or on the first CUDA device
        if str(output.device) == 'cuda:0' or str(output.device) == 'cpu':
            self.update_fan_in(output.numel())
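
For completeness, the hook is meant to be attached with register_forward_hook, roughly like this (the exact wiring isn’t important):

for m in net.modules():
    if isinstance(m, convmodule):
        m.register_forward_hook(count_act)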

Is there a way to let this particular buffer stay in full precision while putting the rest of the tensors in half?

(I was thinking of switching this buffer to a plain Python integer so it never gets cast to half, but that messes with DataParallel. Any idea is welcome! :slight_smile:)
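
The plain-integer variant I had in mind is just the same module with the buffer replaced by an ordinary attribute (sketch only):

class convmodule(nn.Module):
    def __init__(self):
        super().__init__()
        # plain Python int: .half() never touches it,
        # but it is no longer a buffer that DataParallel handles
        self.fan_in = 0
        (...)

    def update_fan_in(self, qty):
        self.fan_in += qty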

At initialisation I tried:

net = net.to(device).half()
for layer in net.modules():
    if isinstance(layer, (nn.BatchNorm2d, convmodule)):
        layer.float()   # casts this whole submodule (weights and buffers) back to float32
        if isinstance(layer, convmodule):
            layer.fan_in = layer.fan_in.float()   # .float() is out-of-place, so the result has to be reassigned

I’m not sure if this is legit, but it does use a bit less GPU RAM.
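
For comparison, what I would really like is to cast only the buffer back instead of the whole layer; a minimal sketch, assuming that assigning a Tensor to an existing buffer name updates the registered buffer:

net = net.to(device).half()
for m in net.modules():
    if isinstance(m, convmodule):
        m.fan_in = m.fan_in.float()   # keep just this buffer in float32, the conv weights stay in half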