Hello altruists,
I am new to this domain and am trying to understand one block of a model. I am looking at the code for a paper named “CenterNet: Objects as Points”. Let’s say we have a feature map from the backbone with shape [12, 256, 100, 100]. This feature map is fed into the block below:
bbox_tower = Sequential(
  Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
  GroupNorm(32, 256, eps=1e-05, affine=True),
  ReLU(),
  Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
  GroupNorm(32, 256, eps=1e-05, affine=True),
  ReLU(),
  Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
  GroupNorm(32, 256, eps=1e-05, affine=True),
  ReLU(),
  Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
  GroupNorm(32, 256, eps=1e-05, affine=True),
  ReLU(),
)
So, in this block we apply the same kind of 3x3 convolution four times, each followed by Group Normalization and ReLU. I am trying to understand what we expect to gain from repeating the same convolution on the feature map.
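As a rough sanity check on what the four layers do to the shape, here is a minimal sketch in plain Python. It assumes the standard Conv2d output-size formula and the usual rule that a stride-1 convolution widens the receptive field by kernel_size - 1; the helper names conv2d_out_size and tower_trace are made up for illustration and are not from the repository.

```python
def conv2d_out_size(size, kernel=3, stride=1, padding=1, dilation=1):
    """Spatial output size of a Conv2d layer along one dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def tower_trace(size=100, num_layers=4, kernel=3):
    """Track (spatial size, receptive field) after each conv in the tower.

    Assumes every conv uses stride 1 and padding 1, as in the printout above.
    GroupNorm and ReLU work per position, so they change neither value.
    """
    receptive_field = 1
    trace = []
    for _ in range(num_layers):
        size = conv2d_out_size(size, kernel=kernel)
        receptive_field += kernel - 1  # stride 1: each conv adds kernel - 1
        trace.append((size, receptive_field))
    return trace


trace = tower_trace()
for layer, (size, rf) in enumerate(trace, start=1):
    print(f"conv {layer}: spatial size {size}x{size}, receptive field {rf}x{rf}")
```

Under these assumptions the spatial size stays at 100x100 while each output position sees a progressively larger neighbourhood of the backbone feature (3x3, 5x5, 7x7, 9x9). The layers look identical in the printout, but each Conv2d has its own learned weights.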
Then the output from the bbox_tower is passed to the below block:
agn_hm = Conv2d(256, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
I am assuming that this layer generates a class-agnostic heatmap from the bbox_tower output. Am I right?
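To make the shapes concrete, here is a small sketch in plain Python of what agn_hm would output for the tower feature. It assumes the standard Conv2d shape formula and PyTorch's default bias=True (the printout does not show bias=False); the helper names conv2d_output_shape and conv2d_param_count are hypothetical.

```python
def conv2d_output_shape(input_shape, out_channels, kernel=3, stride=1, padding=1):
    """Output shape (N, C_out, H_out, W_out) of a Conv2d applied to (N, C, H, W)."""
    n, _in_channels, h, w = input_shape

    def out(size):
        return (size + 2 * padding - kernel) // stride + 1

    return (n, out_channels, out(h), out(w))


def conv2d_param_count(in_channels, out_channels, kernel=3, bias=True):
    """Number of learnable parameters: weights plus optional per-channel bias."""
    weights = in_channels * out_channels * kernel * kernel
    return weights + (out_channels if bias else 0)


tower_out = (12, 256, 100, 100)  # bbox_tower keeps the input shape
hm_shape = conv2d_output_shape(tower_out, out_channels=1)
hm_params = conv2d_param_count(256, 1)

print(hm_shape)   # (12, 1, 100, 100)
print(hm_params)  # 2305
```

The single output channel means there is one score per spatial location instead of one per class, which is consistent with a class-agnostic reading. Whether a sigmoid or another activation is applied to this output afterwards is not visible in the snippet.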
Could anyone please give me a brief idea of how the above code works, and explain how exactly these two modules produce a class-agnostic heatmap?
Thanks in advance!!!