Shapes and predictions questions

Hello Community,

I am brand new to the AI world and I am trying to train an agent to play a card game for 4 players. I thought I could share how I do things, to find out whether I am on a good path :)

Since all suits have the same power, I use one-hot encoding for them:

diamonds = (1,0,0,0)
spades = (0,1,0,0)
hearts = (0,0,1,0)
clubs = (0,0,0,1)

Card encoding:
The game has 52 cards in 4 suits (diamonds, spades, hearts, clubs). Since a card's power comes from its number, I thought I shouldn't one-hot encode the number. Each card is a tuple of 7 values:
1st value: whether the card is in the agent's hand
2nd value: whether the card has been played
3rd to 6th values: the card's suit (one-hot)
7th and last value: the card's number
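To make the layout concrete, here is a minimal sketch of a helper that builds one such 7-value tuple. The name `encode_card` and its arguments are hypothetical (they are not part of the game code), and the number 12 in the example is only an illustrative value.

```python
# Suit order follows the one-hot table above.
SUITS = ("diamonds", "spades", "hearts", "clubs")

def encode_card(in_hand, played, suit, number):
    """Return (in_hand, played, suit one-hot x4, number) as a 7-tuple."""
    suit_bits = tuple(int(s == suit) for s in SUITS)
    return (int(in_hand), int(played), *suit_bits, number)

print(encode_card(True, False, "hearts", 12))  # (1, 0, 0, 0, 1, 0, 12)
```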

cards = [

About the states:
I have a NN with the shape Linear_QNet(7, 1024, 1). The state consists of, for instance, all the agent's cards and all the other players' cards, plus some information about the length of each suit in the agent's hand and all the played cards.

Question 1: If I do not add 6 zeros to game.get_player_clubs_count() to match the card tuples, which have length 7, I get an inhomogeneous array. Is it good practice to pad with zeros like that, or is there a better strategy?
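One alternative to zero-padding each scalar into a 7-wide row is to flatten the card rows into a single 1-D vector and append the scalar features at the end. The sketch below only illustrates that idea; `build_flat_state`, `card_rows`, and `suit_counts` are hypothetical names, and the values are made up.

```python
import numpy as np

def build_flat_state(card_rows, suit_counts):
    """Concatenate flattened card rows and scalar features into one 1-D vector."""
    cards = np.asarray(card_rows, dtype=np.float32).reshape(-1)
    extras = np.asarray(suit_counts, dtype=np.float32)
    return np.concatenate([cards, extras])

card_rows = [(1, 0, 1, 0, 0, 0, 5), (0, 1, 0, 1, 0, 0, 12)]  # two 7-value cards
suit_counts = [1, 0, 0, 0]  # diamonds, spades, hearts, clubs held by the agent
state = build_flat_state(card_rows, suit_counts)
print(state.shape)  # (18,)
```

With this layout the first layer of the network would need an input size equal to the length of the flat vector rather than 7.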

Question 2: This is about the padding loop that follows the state list below. I do the same as in Question 1, but for the rows this time: the tensor throws an error if each row does not have the same shape as the first one. In case you wonder why the shape changes, it is because when a card is played I remove it from the player's cards. Also, (game.get_player_clubs_count(self.current_player),0,0,0,0,0,0) occupies a single row.

state = [
    *player_1_cards,  # 13 tuples as initial state
    *player_2_cards,  # 13 tuples as initial state
    *player_3_cards,  # 13 tuples as initial state
    *player_4_cards,  # 13 tuples as initial state
]
# Pad with dummy rows up to 54 so that every state has the same shape for later use with tensors
while len(state) < 54:
    state.append((-1, -1, -1, -1, -1, -1, -1))
return np.array(state, dtype=int)
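Another way to keep the shape constant, instead of removing played cards and padding with -1 rows, is to give every card a fixed row and only flip its "in hand" and "played" flags. This is a sketch under assumptions: card numbers are taken to run from 1 to 13, and `build_fixed_state`, `hand`, and `played` are hypothetical names.

```python
import numpy as np

SUITS = ("diamonds", "spades", "hearts", "clubs")
NUMBERS = range(1, 14)  # assumption: card numbers 1..13

def build_fixed_state(hand, played):
    """hand, played: sets of (suit, number). Returns an array of shape (52, 7)."""
    rows = []
    for suit_index, suit in enumerate(SUITS):
        for number in NUMBERS:
            suit_bits = [0, 0, 0, 0]
            suit_bits[suit_index] = 1
            card = (suit, number)
            rows.append([int(card in hand), int(card in played), *suit_bits, number])
    return np.array(rows, dtype=int)

state = build_fixed_state(hand={("hearts", 12)}, played={("clubs", 3)})
print(state.shape)  # (52, 7)
```

Because each card always sits in the same row, the array never changes size as the game progresses, so no padding rows are needed.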

Lastly, predictions:
The final move should be a legal move. My question is about the argmax in the code at the end of this post: I use argmax, but I first cut the predictions down to the length of the legal moves.

Question 1: Does the prediction output have the same shape as the state? Here I have 9 * 54 = 486 values, and the first 13 items are the agent's cards. Will the prediction output map onto the state, so that its first 13 items correspond to the agent's cards?
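A quick way to check the output shape is to push a dummy state through a stand-in network. The sketch below assumes Linear_QNet(7, 1024, 1) is two linear layers with a ReLU in between (its real definition is not shown here) and that the state has 54 rows of 7 values. `nn.Linear` acts on the last dimension only, so each row is scored independently.

```python
import torch
import torch.nn as nn

# Stand-in for Linear_QNet(7, 1024, 1); the real class may differ.
model = nn.Sequential(nn.Linear(7, 1024), nn.ReLU(), nn.Linear(1024, 1))

state = torch.zeros(54, 7)  # 54 rows, 7 values per row
with torch.no_grad():
    prediction = model(state)

print(prediction.shape)  # torch.Size([54, 1])
```

Under these assumptions the output has one value per input row, in the same row order as the state, and not one value per possible move.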

Question 2: I think I must update the predictions according to the legal moves. Should I put float("-inf") for all the predictions that are not a legal move?

legal_moves, is_legal_move = game._get_legal_move_bin(game.current_collection_type, self.current_player)
_state = np.array(state)
state0 = torch.tensor(_state, dtype=torch.float)
prediction = self.model(state0)
move = torch.argmax(prediction[:len(legal_moves)]).item()
final_move = legal_moves[move]
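For comparison, the -inf idea from Question 2 can be sketched as follows. It assumes, unlike the Linear_QNet(7, 1024, 1) setup above, that the model returns one score per possible action and that legality is available as a boolean mask of the same length. `pick_legal_action`, `q_values`, and `legal_mask` are hypothetical names, not part of the game code.

```python
import torch

def pick_legal_action(q_values, legal_mask):
    """Return the index of the highest-scoring action among the legal ones."""
    # Illegal actions get -inf, so argmax can never select them.
    masked = q_values.masked_fill(~legal_mask, float("-inf"))
    return int(torch.argmax(masked).item())

q_values = torch.tensor([0.9, 0.1, 0.5, 0.7])            # one score per action
legal_mask = torch.tensor([False, True, False, True])    # actions 1 and 3 are legal
print(pick_legal_action(q_values, legal_mask))  # 3
```

Note the difference from slicing: prediction[:len(legal_moves)] keeps the first scores regardless of which actions they belong to, whereas the mask keeps every score aligned with its own action.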