Are there conventions for input format?

This is going to be a little hard to articulate, but I’m going to try my best.
Do input vectors to a network have to contain information that is consistent?
What I mean by this is, does every element of the vector have to represent the same type of data?
For example, take word embedding. It’s clever because it describes a word along a single type of data: its semantic attributes.
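To make that concrete, here is a minimal sketch of what I mean by "consistent": every row of an embedding matrix is one word's vector, so each dimension means the same kind of thing for every word. The vocabulary and values below are made up for illustration.

```python
import numpy as np

# Toy embedding table: each row is one word's semantic attributes,
# so dimension k means the same thing for every word.
vocab = {"king": 0, "queen": 1, "dice": 2}
embeddings = np.random.rand(len(vocab), 4)  # 4 semantic dimensions

def embed(word):
    return embeddings[vocab[word]]

print(embed("queen"))  # one consistent 4-dimensional representation
```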
My question may not be clear yet, so here’s my current situation:
We’re supposed to build a neural net that can play a board game. I won’t get into the specifics of the game unless it’s necessary, but basically you roll 4 dice, pair them up however you choose, and try to move along 11 columns on the game board. You can make multiple moves in one turn.

There is a previously made AI (not a neural net) that can play fairly well, and we’ve managed to get it to output the probability of winning for all possible dice pairings on a given roll. So currently, our input to the network is a vector containing three board states (your position at the beginning of the turn, your current position, and the opponent’s position, each represented by 11 numbers denoting position in the 11 columns) and one possible set of dice pairings (2 numbers representing the sums of the paired dice). The target is the probability of winning for that set of dice pairings.
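Here is roughly how we assemble that input, as a sketch. The function and variable names (`make_input`, `start_pos`, `current_pos`, `opponent_pos`, `pairing`) are mine, not from any library:

```python
import numpy as np

# Concatenate three 11-number board states and the 2 dice-pair sums
# into a single 35-dimensional input vector.
def make_input(start_pos, current_pos, opponent_pos, pairing):
    assert len(start_pos) == len(current_pos) == len(opponent_pos) == 11
    assert len(pairing) == 2
    return np.concatenate(
        [start_pos, current_pos, opponent_pos, pairing]
    ).astype(np.float32)

x = make_input(np.zeros(11), np.ones(11), np.zeros(11), np.array([7, 5]))
y = 0.62  # target: win probability for this pairing (example value)
print(x.shape)  # (35,)
```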
So the information in my input vector is inconsistent: it’s 3 sets of 11 numbers representing positions, followed by 2 numbers representing dice pairings. Does this matter to the neural network, or should it be able to learn anyway, as long as there’s enough data and the input vectors keep this format?

In general, people use different planes of the same size to represent the various kinds of input, with each plane encoding one kind of information.
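A minimal sketch of that idea, assuming the 11 columns are treated as the spatial axis: each kind of information gets its own length-11 plane, and scalar features like the dice sums are broadcast across a full plane so every plane has the same shape. The layout and names here are illustrative, not a standard API.

```python
import numpy as np

# Stack same-sized planes: one per kind of information.
def make_planes(start_pos, current_pos, opponent_pos, pairing):
    planes = np.stack([
        start_pos,                # plane 0: position at start of turn
        current_pos,              # plane 1: current position
        opponent_pos,             # plane 2: opponent's position
        np.full(11, pairing[0]),  # plane 3: first pair sum, broadcast
        np.full(11, pairing[1]),  # plane 4: second pair sum, broadcast
    ])
    return planes.astype(np.float32)  # shape (5, 11)

print(make_planes(np.zeros(11), np.ones(11), np.zeros(11), (7, 5)).shape)
```

This keeps every plane the same size while still letting each one carry a different type of data, which is one common way to handle exactly the kind of mixed input you describe.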