Causal CNN explanation

Hi, I want to understand how causality works in a CNN. I know that padding the sequence introduces causality into the network, but I am unclear on how exactly it works: padding would mean padding the whole sequence, and when the kernel strides over it, it would still see the same set of padding.

Assume the input has shape (1, 1, 8, 16, 16), where 8 is the number of video frames and 16 is both H and W.
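To make the question concrete, here is a minimal sketch of how I currently understand causal padding, reduced to a single time axis in plain Python. The helper name `causal_conv1d` and the all-ones kernel are made up for illustration and are not from any library. It assumes that all `(k - 1) * dilation` padding steps go in front of the sequence and none at the end, so that each output only sees the current and earlier time steps. The 8 time steps stand in for the frame dimension of the shape above.

```python
def causal_conv1d(x, kernel, stride=1, dilation=1):
    """Hypothetical helper: 1-D convolution where each output sees only past inputs."""
    k = len(kernel)
    pad = (k - 1) * dilation            # assumption: pad the FRONT only, nothing at the end
    padded = [0.0] * pad + list(x)
    out = []
    for start in range(0, len(x), stride):
        # The window covers original indices start - pad ... start,
        # so its last tap sits on the current time step, never a future one.
        acc = 0.0
        for j in range(k):
            acc += kernel[j] * padded[start + j * dilation]
        out.append(acc)
    return out


frames = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # 8 time steps
kernel = [1.0, 1.0, 1.0]

out = causal_conv1d(frames, kernel)
print(out)  # [1.0, 3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0]

# Changing the last three time steps must not change the first five outputs.
changed = frames[:5] + [100.0, 100.0, 100.0]
print(causal_conv1d(changed, kernel)[:5] == out[:5])  # True

# With stride 2 the front padding is unchanged; only the window starts move.
print(causal_conv1d(frames, kernel, stride=2))  # [1.0, 6.0, 12.0, 18.0]
```

If this picture is right, the stride only selects which window positions are evaluated, and the padding stays in front of the first time step regardless.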

Could anybody explain what should be done for causality in both conv2d and conv3d?

Thanks