Understanding CNN diagram

Hi! I am trying to understand this diagram and I wonder if someone can help me. In the Conv layer, why does it show five layers?

Hard to say by just looking at the small image you copied.

I am guessing that the 4 in conv(4,4) refers to the kernel_size being 4x4. That is a weird size, since the kernel size is usually an odd number (check here if you want to know why).
The 64 ch is probably the number of out_channels. This would mean, though, that the drawn diagram should actually contain 64 of those grey boxes and not just 5. But I guess they just couldn’t be bothered to draw 64 of them.
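
To make that guess concrete, here is a small sketch in plain Python of the standard output-shape arithmetic for a 2D convolution. The helper name conv2d_output_shape and the 3x64x64 input are my own assumptions for illustration, not something taken from the diagram. It shows that 64 out_channels gives 64 feature maps (one grey box each), and one common reason odd kernels are preferred: with the same padding on both sides, an even kernel cannot keep the spatial size unchanged.

```python
def conv2d_output_shape(in_shape, out_channels, kernel_size, stride=1, padding=0):
    """Return (channels, height, width) produced by a 2D convolution."""
    # The number of input channels does not affect the output shape.
    _, height, width = in_shape
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    return (out_channels, out_h, out_w)


# Hypothetical input: 3 channels, 64x64 pixels (not taken from the diagram).
image_shape = (3, 64, 64)

# conv(4,4) with 64 channels: one output feature map per output channel.
print(conv2d_output_shape(image_shape, out_channels=64, kernel_size=4))
# (64, 61, 61)

# Odd 3x3 kernel with padding 1 on each side keeps the 64x64 size.
print(conv2d_output_shape(image_shape, out_channels=64, kernel_size=3, padding=1))
# (64, 64, 64)

# Even 4x4 kernel: symmetric padding of 1 or 2 gives 63 or 65, never 64.
print(conv2d_output_shape(image_shape, out_channels=64, kernel_size=4, padding=1))
# (64, 63, 63)
print(conv2d_output_shape(image_shape, out_channels=64, kernel_size=4, padding=2))
# (64, 65, 65)
```

The first number in each result is the channel count, so under this reading a faithful drawing would need 64 boxes for that layer.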

Those are all just guesses/assumptions on what I can see here. If you could link the source, I could maybe give clearer answers.

Thank you for your help. Here is the source for the image:

Hmmm. The source is indeed not very specific about this.
I also couldn’t find this particular model on their GitHub page.

But I believe everything is as I said earlier: