What is the exact meaning of the 18 (i.e., 2*3*3) offset channels in a deformable convolution?

I want to visualize the offsets of a deformable convolution with a 3*3 kernel,
so it's essential for me to know exactly what each of these channels means.

Here is my guess at a possible layout, with abbreviations for the nine kernel positions:

upper-left: ul
upper-right: ur
bottom-left: bl
bottom-right: br
up: u
bottom: b
right: r
left: l
center: c

delta_ul_x, delta_ul_y, delta_u_x, delta_u_y, delta_ur_x, delta_ur_y;
delta_l_x, delta_l_y, delta_c_x, delta_c_y, delta_r_x, delta_r_y;
delta_bl_x, delta_bl_y, delta_b_x, delta_b_y, delta_br_x, delta_br_y;
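
To make the guess above concrete, here is a minimal Python sketch that maps each of the 18 channel indices to a kernel position and an axis. It only encodes my assumption (kernel taps in row-major order, one interleaved pair of channels per tap); it does not confirm what torchvision actually does. The names `TAP_NAMES`, `offset_channel_layout` and `channel_names` are made up for illustration. Whether x or y comes first within each pair is not settled in this thread, so it is left as the `axis_order` parameter.

```python
from typing import List, Tuple

# Kernel taps of a 3x3 kernel in row-major order, using the abbreviations above.
TAP_NAMES = ["ul", "u", "ur", "l", "c", "r", "bl", "b", "br"]


def offset_channel_layout(
    kernel_h: int = 3,
    kernel_w: int = 3,
    axis_order: Tuple[str, str] = ("x", "y"),
) -> List[Tuple[int, int, int, str]]:
    """Return (channel, kernel_row, kernel_col, axis) for every offset channel.

    ASSUMPTION (not confirmed by the docs): taps are stored in row-major
    order and each tap owns two consecutive channels, ordered as axis_order.
    """
    layout = []
    for row in range(kernel_h):
        for col in range(kernel_w):
            tap = row * kernel_w + col  # flat index of this kernel position
            for k, axis in enumerate(axis_order):
                layout.append((2 * tap + k, row, col, axis))
    return layout


def channel_names(axis_order: Tuple[str, str] = ("x", "y")) -> List[str]:
    """Human-readable names for the 18 channels of a 3x3 kernel."""
    return [
        f"delta_{TAP_NAMES[row * 3 + col]}_{axis}"
        for _, row, col, axis in offset_channel_layout(3, 3, axis_order)
    ]


if __name__ == "__main__":
    for channel, name in enumerate(channel_names()):
        print(channel, name)
```

If the implementation turns out to store the vertical component first, calling `channel_names(("y", "x"))` gives the alternative labeling, so the same helper can be used to label plots either way.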

Could you add a bit more background to the question, please?
Is “deformable convolution” a custom module that uses these parameters?

Hi ptrblck,
It is a newly added module in torchvision; however, the documentation for the offset is not clear.

Thanks for the reference! Could you point me to the 2*3*3 usage from your original post?