Methods that support generalization


I have a problem implementing a deep reinforcement learning agent, specifically a double deep Q-learning (double DQN) agent, which I want to use to solve a power control problem in telecommunications. A node A moves in a linear direction and sends a signal to a node B at a fixed position. The goal is to control the received signal quality (SNR) at node B.
My input state is as follows:
The input tensor consists of two categories:
a) the current SNR and several previous SNRs (integer values between -20 dB and +20 dB)
b) the actions that have been taken in between to perform the control.
My training data contains several runs of node A with random initialization of the starting position and movement direction; node B is always at the same position for each training run.
So far I am just putting a tensor of the form [a, b] into the network. My output layer has 4 neurons that correspond to the number of actions the agent can take.
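For illustration, assembling such a state tensor of the form [a, b] might look like this (a sketch with made-up names and history lengths, not the actual code):

```python
import torch

# Sketch only: concatenate the SNR history (category a) and the action
# history (category b) into one integer tensor that feeds the network.
def build_state(snr_history, action_history):
    """snr_history: ints in [-20, 20]; action_history: ints in [0, 3]."""
    a = torch.tensor(snr_history, dtype=torch.long)
    b = torch.tensor(action_history, dtype=torch.long)
    return torch.cat([a, b])  # 1-D LongTensor of length len(a) + len(b)

state = build_state([-3, 0, 2, 5], [1, 3, 2])  # length-7 LongTensor
```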
Before reaching the hidden layers, the input vector goes through an embedding layer to get a better representation of the data.
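Since nn.Embedding(45, 4) expects indices in [0, 45), the 41 possible SNR values and the 4 actions presumably get mapped into disjoint index ranges. A minimal sketch of such a mapping (my guess at what putInEmbeddingForm does, not the actual code):

```python
# Assumed index layout: 41 SNR bins (-20 .. +20 dB) followed by 4 actions,
# giving exactly the 45 embedding slots of nn.Embedding(45, 4).
NUM_SNR = 41
NUM_ACTIONS = 4

def snr_to_index(snr_db):
    """Shift an SNR value in [-20, 20] into embedding indices 0..40."""
    assert -20 <= snr_db <= 20
    return snr_db + 20

def action_to_index(action):
    """Place actions 0..3 after the SNR bins, at indices 41..44."""
    assert 0 <= action < NUM_ACTIONS
    return NUM_SNR + action
```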
Training works fine, but some outliers still occur that I can't get rid of, whether by increasing the amount of training data or the number of neurons in the hidden layers.
My net looks as follows:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.emb1 = nn.Embedding(45, 4).to(device)
        self.fc1 = nn.Linear(num_state * 4, 100).to(device)  # 4 = embedding dim
        self.fc11 = nn.Linear(100, 100).to(device)
        self.fc2 = nn.Linear(100, 100).to(device)
        self.fc3 = nn.Linear(100, num_action).to(device)

    def forward(self, x):
        x = DoubleDQN.putInEmbeddingForm(x)
        x = self.emb1(x).view((-1, num_state * 4))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc11(x))
        x = F.relu(self.fc2(x))
        action_value = self.fc3(x)
        return action_value
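For context, the double-DQN target such a net is trained against can be sketched as follows (assuming an online net `q_net` and a target net `q_target`; names and shapes are illustrative, not my actual code):

```python
import torch

# Double DQN: the online net selects the greedy next action, the target
# net evaluates it, which reduces the overestimation bias of plain DQN.
def double_dqn_target(q_net, q_target, reward, next_state, gamma=0.99):
    with torch.no_grad():
        best_action = q_net(next_state).argmax(dim=1, keepdim=True)  # (B, 1)
        next_q = q_target(next_state).gather(1, best_action).squeeze(1)  # (B,)
    return reward + gamma * next_q
```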

I have the general question: is there a common way to improve the performance of such a net? I have read a lot, e.g. about batch normalization and layer normalization, and also tried some things, which did not work. I just don't have the experience to determine which techniques could help.
It would be great if someone could propose some techniques that I can try.
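For concreteness, layer normalization (one of the techniques I read about) would typically be inserted between each linear layer and its activation, e.g. (a sketch with assumed layer widths, not my actual code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch: a net like the one above with nn.LayerNorm on the
# pre-activations of each hidden layer.
class NetWithLN(nn.Module):
    def __init__(self, in_features, num_action):
        super().__init__()
        self.fc1 = nn.Linear(in_features, 100)
        self.ln1 = nn.LayerNorm(100)
        self.fc2 = nn.Linear(100, 100)
        self.ln2 = nn.LayerNorm(100)
        self.fc3 = nn.Linear(100, num_action)

    def forward(self, x):
        x = F.relu(self.ln1(self.fc1(x)))
        x = F.relu(self.ln2(self.fc2(x)))
        return self.fc3(x)
```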