What is the optimal shape of a tensor for storage and computational efficiency?

Is it sufficient for the total number of elements in the tensor to be a multiple of 8, or does each dimension need to be a multiple of 8?
Or do other considerations matter more?
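To make the two conditions concrete, here is a minimal sketch in plain Python. The helper names `total_is_multiple` and `every_dim_is_multiple` are hypothetical and only illustrate the distinction. They do not say which condition a given framework or piece of hardware actually requires.

```python
from math import prod

def total_is_multiple(shape, k=8):
    """True if the total number of elements is a multiple of k."""
    return prod(shape) % k == 0

def every_dim_is_multiple(shape, k=8):
    """True if every individual dimension is a multiple of k."""
    return all(d % k == 0 for d in shape)

# Compare the two conditions on a few example shapes.
for shape in [(2, 4), (8, 16), (3, 8)]:
    print(shape, total_is_multiple(shape), every_dim_is_multiple(shape))
```

A shape such as `(2, 4)` satisfies the first condition (8 elements in total) but not the second, which is exactly the case I am unsure about.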