What's the point of an activation function on the output layer for a regression problem?

For instance, a DQN selects the action with the highest Q-value.
Is there any point in processing the output layer, e.g. with a .softmax()?

Sigmoid maps values into the range (0, 1). This is typically done for probabilities and works well with certain loss functions (e.g. binary cross-entropy). Perhaps the person who built the DQN just did so out of habit. I've also seen dropout used in DQNs fairly often, but I've never seen anyone demonstrate that it actually benefits those networks.

Check your reward function. Q-values are estimates of discounted returns (sums of rewards), so if your rewards, and therefore your targets, can fall outside the range 0 to 1, a sigmoid on the output makes those targets unreachable. In that case you'd do well to remove the final sigmoid.
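For illustration, here's a minimal PyTorch sketch of what that looks like (the layer sizes, names, and dimensions are my own assumptions, not from the original network): the final layer is a plain nn.Linear with no activation, so the predicted Q-values are unbounded, and greedy action selection is just an argmax over the raw outputs.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Q-network with a linear (activation-free) output head."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),  # no sigmoid/softmax: raw Q-values
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # One unbounded Q-value per action.
        return self.net(state)

# Hypothetical dimensions for a small control task.
q_net = QNetwork(state_dim=4, n_actions=2)
state = torch.randn(1, 4)
q_values = q_net(state)          # unbounded regression outputs
action = q_values.argmax(dim=1)  # greedy action selection
```

Since only the argmax matters for picking the action, a monotonic function like sigmoid or softmax wouldn't change which action wins anyway; it would only constrain the regression targets during training.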
