Simple custom activation function causes CUDA out of memory.
Tags: Custom Activation Function, Memory Usage, CUDA out of memory
Problem
I am trying to implement a very simple activation function that clamps every value greater than 1 to 1. I did the following:
def zolu(input):
    input[input > 1] = 1
    return input

class ZOLU(nn.Module):
    def __init__(self):
        super().__init__()  # init the base class

    def forward(self, input):
        return zolu(input)
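For reference, the same clamping behavior can also be written without an in-place masked assignment. This is just a sketch of an out-of-place equivalent (not code from my project): the boolean index input > 1 above has to materialize a mask tensor of the same shape as the input, while torch.clamp does not:

import torch
import torch.nn as nn

def zolu_clamp(input):
    # caps every value above 1 at 1 and returns a new tensor,
    # so no boolean mask tensor is allocated for indexing
    return torch.clamp(input, max=1)

class ZOLUClamp(nn.Module):
    def forward(self, input):
        return zolu_clamp(input)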
I then used the function as the activation in my (also rather simple) model:
class SuperResNet(nn.Module):
    def __init__(self):
        self.size = MODEL_INPUT_SIZE
        super(SuperResNet, self).__init__()
        self.up = nn.Upsample(size=MODEL_OUTPUT_SIZE[1:], mode="bicubic", align_corners=False)
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=5, padding=2)
        self.conv2 = nn.Conv2d(in_channels=10, out_channels=50, kernel_size=5, padding=2)
        self.conv3 = nn.Conv2d(in_channels=50, out_channels=10, kernel_size=5, padding=2)
        self.conv4 = nn.Conv2d(in_channels=10, out_channels=3, kernel_size=5, padding=2)

    def forward(self, x):
        x = zolu(self.up(x))
        x = zolu(self.conv1(x))
        # x = TF.relu(x)
        x = zolu(self.conv2(x))
        # x = TF.relu(x)
        x = zolu(self.conv3(x))
        # x = TF.relu(x)
        x = zolu(self.conv4(x))
        # x = TF.relu(x)
        return x
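For completeness, a minimal sketch of how the model gets called. The constant values and the dummy input here are assumptions for reproduction, not my actual training script:

import torch

MODEL_INPUT_SIZE = (3, 560, 748)     # assumed; matches the input shape listed below
MODEL_OUTPUT_SIZE = (3, 1120, 1496)  # assumed 2x upscaling target

model = SuperResNet().cuda()
dummy = torch.rand(4, *MODEL_INPUT_SIZE, device="cuda")  # batch size 4, as in training
out = model(dummy)  # OOMs inside zolu at the conv2 activation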
When I try to run training, I get this output:
Traceback (most recent call last):
  File "C:/Users/sokad/PycharmProjects/SuperResoStrekal/scripts/train.py", line 59, in <module>
    outputs = super_res(crappy_train)
  File "C:\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\sokad\PycharmProjects\SuperResoStrekal\scripts\model.py", line 47, in forward
    x = zolu(self.conv2(x))
  File "C:\Users\sokad\PycharmProjects\SuperResoStrekal\scripts\model.py", line 19, in zolu
    input[input>1] = 1
RuntimeError: CUDA out of memory. Tried to allocate 5.62 GiB (GPU 0; 8.00 GiB total capacity; 1.16 GiB already allocated; 4.48 GiB free; 1.57 GiB reserved in total by PyTorch)
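For debugging, the allocator state can be logged right before the failing line with something like this sketch (not part of my script; both counters report the PyTorch caching allocator's view of the current GPU in bytes):

import torch

def log_cuda_memory(tag):
    # convert the byte counters to GiB for readability
    allocated = torch.cuda.memory_allocated() / 2 ** 30
    reserved = torch.cuda.memory_reserved() / 2 ** 30
    print(f"{tag}: allocated={allocated:.2f} GiB, reserved={reserved:.2f} GiB")

# e.g. inside zolu, before the masked assignment:
# log_cuda_memory("before zolu")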
Further Information
- Instead of input[input > 1] I also tried input[input.gt(1)] (see the measurement sketch after this list)
- When I switch the activations to ReLU, everything works fine
- Total params: 26,573
- Trainable params: 26,573
- Input size (MB): 4.79
- Forward/backward pass size (MB): 546.48
- Input shape: torch.Size([4, 3, 560, 748])
- Batch size: 4
- Using a Sampler class (CUDA was OOM without a sampler)
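To quantify the difference between the masked assignment and an out-of-place clamp, peak memory could be compared like this (a measurement sketch; the tensor shape is a guess at the conv2 output, not a measured value):

import torch

# assumed shape of the conv2 output; the real spatial size depends on MODEL_OUTPUT_SIZE
x = torch.rand(4, 50, 1120, 1496, device="cuda", requires_grad=True)

torch.cuda.reset_peak_memory_stats()
y = x.clone()
y[y > 1] = 1  # in-place masked assignment: allocates a boolean mask of the same shape
print("masked assignment peak:", torch.cuda.max_memory_allocated() / 2 ** 30, "GiB")

torch.cuda.reset_peak_memory_stats()
z = torch.clamp(x, max=1)  # out-of-place clamp: just one output tensor
print("clamp peak:", torch.cuda.max_memory_allocated() / 2 ** 30, "GiB")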