If you phrase it as “every dilation
th element is used”, it may be easier to remember. When defining dilation (or any op in general) to me it seems natural to talk about what is used/done rather than what is skipped/not done.
So once you try to write down a formula, it probably is with this convention. Using it in code saves you from index juggling when putting formulas into code.
Of course, everyone has a different intuition about these things, but personally, I think the one pytorch implicitly suggests here can be useful.
Best regards
Thomas