Hi, I am new to PyTorch and I have a problem. In a deep learning network, I split a convolution across two GPUs, but GPU memory consumption keeps increasing during training until it fails with the error "RuntimeError: CUDA error: out of memory".

I tried `del loss` and `torch.cuda.empty_cache()`, but neither helps. Could you please help me?

This is the code snippet related to the conv part:

```
class Conv2dBlock_Multi(nn.Module):
    def __init__(self, input_dim, output_dim, kernel_size, stride,
                 padding=0, norm='none', activation='relu', pad_type='zero'):
        super(Conv2dBlock_Multi, self).__init__()
        self.conv = Conv2d_Multi(input_dim, output_dim, kernel_size, stride, bias=True)

    def forward(self, x):
        x = self.conv(self.pad(x))
        return x
```

```
class Conv2d_Multi(nn.Module):
    def __init__(self, input_dim, output_dim, kernel_size, stride, bias=True):
        super(Conv2d_Multi, self).__init__()
        self.conv1 = nn.Conv2d(input_dim//2, output_dim//2, kernel_size, stride, bias=True)
        self.conv2 = nn.Conv2d(input_dim - input_dim//2, output_dim - output_dim//2, kernel_size, stride, bias=True)

    def forward(self, x):
        x1, x2 = x.split([x.size()[1]//2, x.size()[1] - x.size()[1]//2], dim=1)
        self.conv1 = nn.DataParallel(self.conv1, device_ids=[0]).to('cuda:0')
        self.conv2 = nn.DataParallel(self.conv2, device_ids=[1]).to('cuda:1')
        x1_out = self.conv1(x1)
        x2_out = self.conv2(x2)
        x2_out = x2_out.to('cuda:0')
        x = torch.cat([x1_out, x2_out], dim=1)
        return x
```
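
For comparison, here is a minimal sketch of the same channel split with each half placed on its device once, in `__init__`. In the snippet above, every `forward` call reassigns `self.conv1` and `self.conv2` to a new `nn.DataParallel` that wraps the previous one, so the wrappers nest one level deeper per iteration. Whether that alone explains the out-of-memory error has not been verified. The names `Conv2dSplit` and `pick_devices` are hypothetical, and the CPU fallback is an assumption added only so the sketch runs on a machine without two GPUs.

```python
import torch
import torch.nn as nn


def pick_devices():
    # Hypothetical helper: use cuda:0 and cuda:1 when both exist,
    # otherwise fall back to CPU for both halves (assumption for portability).
    if torch.cuda.is_available() and torch.cuda.device_count() >= 2:
        return torch.device('cuda:0'), torch.device('cuda:1')
    return torch.device('cpu'), torch.device('cpu')


class Conv2dSplit(nn.Module):
    """Hypothetical variant of Conv2d_Multi: layers are placed once, not per forward."""

    def __init__(self, input_dim, output_dim, kernel_size, stride, bias=True):
        super().__init__()
        self.dev1, self.dev2 = pick_devices()
        # Same channel split as the original: floor half and the remainder.
        self.in1 = input_dim // 2
        self.in2 = input_dim - self.in1
        out1 = output_dim // 2
        out2 = output_dim - out1
        # Move each convolution to its device a single time, at construction.
        self.conv1 = nn.Conv2d(self.in1, out1, kernel_size, stride, bias=bias).to(self.dev1)
        self.conv2 = nn.Conv2d(self.in2, out2, kernel_size, stride, bias=bias).to(self.dev2)

    def forward(self, x):
        x1, x2 = x.split([self.in1, self.in2], dim=1)
        # Only tensors move between devices here; no module is re-wrapped.
        x1_out = self.conv1(x1.to(self.dev1))
        x2_out = self.conv2(x2.to(self.dev2)).to(self.dev1)
        return torch.cat([x1_out, x2_out], dim=1)


block = Conv2dSplit(4, 6, kernel_size=3, stride=1)
y = block(torch.randn(2, 4, 8, 8))
print(y.shape)  # torch.Size([2, 6, 6, 6])
```

One caveat with this layout: calling `.to(...)` or `.cuda()` on the parent model afterwards moves both convolutions to the same device, so the placement in `__init__` only holds if the parent is not moved as a whole.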