The gradient with respect to the threshold is the sum of output_grad over the elements below the threshold; the other elements contribute zero.
You can see this by looking sternly at the max formulation or, if you prefer, by rewriting it as relu(x-t)+t.
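
If you want to convince yourself numerically, here is a minimal sketch in plain Python (no autograd library). The function names, the sample values, and the choice to treat elements exactly equal to the threshold as "not below" are my own assumptions for illustration, not anything taken from a library.

```python
def clamp_min(xs, t):
    # Elementwise max(x, t), which is the same as relu(x - t) + t.
    return [max(x, t) for x in xs]


def grad_wrt_threshold(xs, t, output_grad):
    # Analytic rule: sum output_grad over the elements strictly below t.
    # (Ties x == t are left out here; that is a convention chosen for this sketch.)
    return sum(g for x, g in zip(xs, output_grad) if x < t)


def numeric_grad_wrt_threshold(xs, t, output_grad, eps=1e-6):
    # Central finite difference of sum(output_grad * clamp_min(xs, t)) in t.
    def weighted_sum(tt):
        return sum(g * y for g, y in zip(output_grad, clamp_min(xs, tt)))

    return (weighted_sum(t + eps) - weighted_sum(t - eps)) / (2 * eps)


# Hypothetical sample data: two elements below the threshold, two above.
xs = [-2.0, 0.5, 1.0, 3.0]
output_grad = [0.1, 0.2, 0.3, 0.4]
t = 0.75

analytic = grad_wrt_threshold(xs, t, output_grad)
numeric = numeric_grad_wrt_threshold(xs, t, output_grad)
print(analytic, numeric)
```

The two values should agree up to finite-difference error, as long as no element sits within eps of the threshold.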
Best regards
Thomas