Hi,

I am new to PyTorch and really like the Pythonic approach it offers. Currently I’m trying to compare the performance of a simple depthwise separable convolutional neural net against a standard convolutional neural net. The model architecture of both is simple:

The following block is repeated 3 times:

Convolution-> Batch Normalization-> ReLU -> Pool

This is followed by flattening and two linear layers that reduce the result to 3 outputs.

The code is as follows:

**Standard Conv. Net architecture:**

```
import torch.nn as nn

in_features = 304  # input size of the first linear layer, after flattening


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Stage 1: Conv -> BatchNorm -> ReLU -> Pool
        self.conv1 = nn.Conv2d(1, 4, kernel_size=2, padding=1)
        self.bn1 = nn.BatchNorm2d(num_features=4)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2)
        # Stage 2
        self.conv2 = nn.Conv2d(4, 8, kernel_size=2, padding=1)
        self.bn2 = nn.BatchNorm2d(num_features=8)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2)
        # Stage 3
        self.conv3 = nn.Conv2d(8, 16, kernel_size=2, padding=1)
        self.bn3 = nn.BatchNorm2d(num_features=16)
        self.relu3 = nn.ReLU()
        self.pool3 = nn.MaxPool2d(2)
        # Classifier head: flatten -> 36 -> 3
        self.fc1 = nn.Linear(in_features, 36)
        self.relu4 = nn.ReLU()
        self.fc2 = nn.Linear(36, 3)

    def forward(self, x):
        out = self.pool1(self.relu1(self.bn1(self.conv1(x))))
        out = self.pool2(self.relu2(self.bn2(self.conv2(out))))
        out = self.pool3(self.relu3(self.bn3(self.conv3(out))))
        out = out.view(out.size(0), -1)  # flatten to (batch, in_features)
        out = self.relu4(self.fc1(out))
        out = self.fc2(out)
        return out


standardCNN = Net()  # defining an instance of our network
```

**Depthwise Separable Conv. Net architecture:**

```
class depthwise_separable_conv(nn.Module):
    def __init__(self):
        super(depthwise_separable_conv, self).__init__()
        # Stage 1: depthwise (groups == in_channels) then 1x1 pointwise
        self.depthwise1 = nn.Conv2d(1, 1, kernel_size=2, padding=1, groups=1)
        self.pointwise1 = nn.Conv2d(1, 4, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(num_features=4)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2)
        # Stage 2
        self.depthwise2 = nn.Conv2d(4, 4, kernel_size=2, padding=1, groups=4)
        self.pointwise2 = nn.Conv2d(4, 8, kernel_size=1)
        self.bn2 = nn.BatchNorm2d(num_features=8)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2)
        # Stage 3
        self.depthwise3 = nn.Conv2d(8, 8, kernel_size=2, padding=1, groups=8)
        self.pointwise3 = nn.Conv2d(8, 16, kernel_size=1)
        self.bn3 = nn.BatchNorm2d(num_features=16)
        self.relu3 = nn.ReLU()
        self.pool3 = nn.MaxPool2d(2)
        # Classifier head: flatten -> 36 -> 3
        self.dense1 = nn.Linear(304, 36)
        self.relu4 = nn.ReLU()
        self.dense2 = nn.Linear(36, 3)

    def forward(self, x):
        out = self.pool1(self.relu1(self.bn1(self.pointwise1(self.depthwise1(x)))))
        out = self.pool2(self.relu2(self.bn2(self.pointwise2(self.depthwise2(out)))))
        out = self.pool3(self.relu3(self.bn3(self.pointwise3(self.depthwise3(out)))))
        out = out.view(out.size(0), -1)  # flatten to (batch, 304)
        out = self.relu4(self.dense1(out))
        out = self.dense2(out)
        return out


dsCNN_model = depthwise_separable_conv()
```

Now, here’s the important part. Since standard convolutions require far more parameters than depthwise separable ones, I was hoping to get a reduced number of parameters for my `dsCNN_model`. But to my surprise, I got these results.

**Parameters:**

```
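import numpy as np

# Count trainable parameters of the standard CNN
total_params = 0
for parameter in standardCNN.parameters():
    if parameter.requires_grad:
        total_params += np.prod(parameter.size())
print(total_params)
# >> 11831

# Same count for the depthwise separable model
total_params = 0
for parameter in dsCNN_model.parameters():
    if parameter.requires_grad:
        total_params += np.prod(parameter.size())
print(total_params)
# >> 11404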
```

Both have almost the same count, with a reduction of only about **400** parameters (from ~11800 to ~11400).
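As a sanity check, the two totals can be reproduced by hand from the layer definitions above. This is only a rough sketch in plain Python: the helpers `conv_params`, `bn_params` and `linear_params` are names made up here for illustration (they are not PyTorch functions), and it assumes every convolution and linear layer keeps its default bias.

```python
# Hand count of trainable parameters for both models.
# Helper names are illustrative only, not part of PyTorch.

def conv_params(c_in, c_out, k, groups=1):
    """Weights plus biases of a 2D convolution with a k x k kernel."""
    return (c_in // groups) * c_out * k * k + c_out


def bn_params(channels):
    """BatchNorm2d learns one scale and one shift per channel."""
    return 2 * channels


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


stages = [(1, 4), (4, 8), (8, 16)]  # (in_channels, out_channels) per stage

# Standard model: one 2x2 convolution per stage
standard_conv = sum(conv_params(i, o, k=2) for i, o in stages)

# Separable model: 2x2 depthwise followed by 1x1 pointwise per stage
separable_conv = sum(
    conv_params(i, i, k=2, groups=i) + conv_params(i, o, k=1)
    for i, o in stages
)

# Layers that are identical in both models: batch norms and the two linear layers
shared = (
    sum(bn_params(o) for _, o in stages)
    + linear_params(304, 36)
    + linear_params(36, 3)
)

print(standard_conv, separable_conv)                    # 684 257
print(standard_conv + shared, separable_conv + shared)  # 11831 11404
print(linear_params(304, 36))                           # 10980
```

Under these assumptions the convolutional layers go from 684 to 257 parameters, which accounts for the whole difference between the two totals, while the first linear layer (`fc1` / `dense1`) contributes 10980 parameters to both models.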

I wanted to ask: is this normal behavior for a depthwise separable model, with little difference showing up in smaller models? Or is something wrong with my approach or model architecture? I’m trying to get a model with as few parameters as possible so that inference is feasible on hardware with limited computational resources. Also, what other approaches does PyTorch offer for this purpose?

Any help would be really appreciated… Thanks!