Hello Tony,
Beta is indeed a constant and is not trainable in the standard definition.
I think that this post answers your question. You can define a custom operation where the beta is learnable.
Hello Tony,
Beta is indeed a constant and is not trainable in the standard definition.
I think that this post answers your question. You can define a custom operation where the beta is learnable.