Comparing the time complexity of convolutional and fully connected models

Hi,

I designed the two models below. The convolutional model (model 2) has far fewer parameters than the fully connected model (model 1), but its inference latency is not lower. In my measurements it is the same or even higher.

What is the reason for this? Shouldn't a model with fewer FLOPs also have lower latency?

How can I show that, because the convolutional model has far fewer FLOPs, its latency or time complexity is lower?

Both models have the same inference latency. Is that reasonable?

Model 1 (fully connected model):

```
import torch.nn as nn


class modelss(nn.Module):
    def __init__(self):
        super(modelss, self).__init__()
        self.channel_num_in = 256
        self.encoder = nn.Sequential(
            nn.Linear(self.channel_num_in, 512),
            nn.Linear(512, 512),
            nn.Linear(512, 256),
            nn.Linear(256, 128),
            nn.Linear(128, 64),
            nn.Linear(64, 32),
            nn.Linear(32, 16),
        )
        self.fc = nn.Linear(16, 7)

    def forward(self, x):
        # The Sequential defined in __init__ is named `encoder`, not `layer`.
        x = self.encoder(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x


model = modelss()
```

```
    Layer (type)    Output Shape     Params      FLOPs        Madds
===================================================================
        Linear-1        [2, 512]    131,584    131,072      261,632
        Linear-2        [2, 512]    262,656    262,144      523,776
        Linear-3        [2, 256]    131,328    131,072      261,888
        Linear-4        [2, 128]     32,896     32,768       65,408
        Linear-5         [2, 64]      8,256      8,192       16,320
        Linear-6         [2, 32]      2,080      2,048        4,064
        Linear-7         [2, 16]        528        512        1,008
        Linear-8          [2, 7]        119        112          217
===================================================================
Total params: 569,447
Trainable params: 569,447
Non-trainable params: 0
Total FLOPs: 567,920
Total Madds: 1,134,313

Input size (MB): 0.03
Forward/backward pass size (MB): 0.01
Params size (MB): 0.54
Estimated Total Size (MB): 0.58
FLOPs size (GB): 0.00
===================================================================
```
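The totals reported for model 1 can be cross-checked by hand. This is only a minimal sketch: it assumes the FLOPs column counts one multiply per weight of each `nn.Linear` layer and leaves out the bias add (a convention inferred from the numbers in the summary, not taken from the tool's documentation). `linear_counts` is a hypothetical helper name.

```python
# Layer widths of model 1: 256 -> 512 -> ... -> 16 -> 7 (the last pair is self.fc).
DIMS = [256, 512, 512, 256, 128, 64, 32, 16, 7]


def linear_counts(dims):
    """Return (parameters, multiplies per sample) for a stack of Linear layers."""
    params = 0
    mults = 0
    for n_in, n_out in zip(dims, dims[1:]):
        params += n_in * n_out + n_out  # weights + biases
        mults += n_in * n_out           # assumed convention: bias add not counted
    return params, mults


params, mults = linear_counts(DIMS)
print(f"params: {params:,}  multiplies per sample: {mults:,}")
```

Under that assumption the sketch gives 569,447 parameters and 567,920 multiplies per sample, the same as the summary totals.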

Model 2 (convolutional model):

```
import torch.nn as nn


class ConvModel(nn.Module):
    def __init__(self):
        super(ConvModel, self).__init__()
        self.channel_num_in = 1
        self.layer = nn.Sequential(
            # kernel_size=2, stride=2: each layer halves the spatial size
            nn.Conv2d(self.channel_num_in, 16, 2, 2),  # 16x16 -> 8x8
            nn.Conv2d(16, 8, 2, 2),                    # 8x8   -> 4x4
            nn.Conv2d(8, 4, 2, 2),                     # 4x4   -> 2x2
        )
        self.fc = nn.Linear(16, 7)  # 4 channels * 2 * 2 = 16 features

    def forward(self, x):
        x = self.layer(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x


model = ConvModel()
```

```
    Layer (type)      Output Shape     Params      FLOPs      Madds
===================================================================
        Conv2d-1     [2, 16, 8, 8]         80      5,120      8,192
        Conv2d-2      [2, 8, 4, 4]        520      8,320     16,384
        Conv2d-3      [2, 4, 2, 2]        132        528      1,024
        Linear-4            [2, 7]        119        112        217
===================================================================
Total params: 851
Trainable params: 851
Non-trainable params: 0
Total FLOPs: 14,080
Total Madds: 25,817

Input size (MB): 0.03
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.04
FLOPs size (GB): 0.00
===================================================================
```
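The convolutional totals can be checked the same way. This is a rough sketch and rests on assumptions: the per-layer FLOPs for `nn.Conv2d` appear to be the multiplies plus one bias add per output element, while the final `nn.Linear` counts multiplies only (both conventions are inferred from the summary numbers, not from the tool's documentation). The input is taken to be 1x16x16, and `conv2d_counts` is a hypothetical helper name.

```python
def conv2d_counts(c_in, c_out, k, stride, h, w):
    """Parameters and per-sample FLOPs of a Conv2d with no padding and dilation 1."""
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    params = c_out * (c_in * k * k) + c_out          # weights + biases
    out_elems = c_out * h_out * w_out
    flops = out_elems * (c_in * k * k) + out_elems   # multiplies + assumed bias adds
    return params, flops, h_out, w_out


total_params = 0
total_flops = 0
h = w = 16  # assumed input resolution (1 x 16 x 16)
for c_in, c_out in [(1, 16), (16, 8), (8, 4)]:
    p, f, h, w = conv2d_counts(c_in, c_out, k=2, stride=2, h=h, w=w)
    total_params += p
    total_flops += f

# Final Linear(16, 7): 4 channels * 2 * 2 flattened features.
flat = 4 * h * w
total_params += flat * 7 + 7
total_flops += flat * 7  # assumed convention: bias add not counted for Linear

print(f"params: {total_params:,}  FLOPs per sample: {total_flops:,}")
```

Under these assumptions the sketch gives 851 parameters and 14,080 FLOPs per sample, the same as the summary totals. Compared with the 567,920 FLOPs reported for model 1, that is roughly 40 times fewer.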

This is the code I used to measure the inference time of the models:

```
import numpy as np
import torch

# model = EfficientNet.from_pretrained('efficientnet-b0')
device = torch.device("cuda")
model.to(device)

# Shape matches ConvModel; the fully connected model expects input of shape (batch, 256).
dummy_input = torch.randn(128, 1, 16, 16, dtype=torch.float).to(device)

starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)
repetitions = 300
timings = np.zeros((repetitions, 1))

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    # GPU warm-up
    for _ in range(10):
        _ = model(dummy_input)

    # Measure performance
    for rep in range(repetitions):
        starter.record()
        _ = model(dummy_input)
        ender.record()
        # Wait for GPU sync
        torch.cuda.synchronize()
        curr_time = starter.elapsed_time(ender)  # milliseconds
        timings[rep] = curr_time

mean_syn = np.sum(timings) / repetitions
std_syn = np.std(timings)
print(mean_syn)
```