How do we know that our estimated function is differentiable?

Mahdi_Amrollahi · July 23, 2022, 5:01pm

In deep neural networks, we sometimes end up with a complicated estimated function that involves many multiplications and activation functions. How do we know whether it is differentiable? We just add layers and change activation functions without knowing this.

ptrblck · July 23, 2022, 10:26pm

If you are unsure whether a specific operation is differentiable, you could check if its output has a valid .grad_fn attribute.
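A minimal sketch of that check, assuming PyTorch is installed. The helper name `is_tracked` is made up for illustration and is not part of PyTorch. Note that a valid .grad_fn only tells you autograd can backpropagate through the operation; it does not prove the function is mathematically differentiable everywhere (ReLU at 0, for example).

```python
import torch


def is_tracked(t: torch.Tensor) -> bool:
    """Hypothetical helper: True if autograd recorded an op that produced t."""
    return t.grad_fn is not None


x = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)

y_relu = torch.relu(x)      # autograd records a backward function for this op
y_argmax = torch.argmax(x)  # integer index output, not differentiable

print(is_tracked(y_relu), y_relu.grad_fn)
print(is_tracked(y_argmax), y_argmax.grad_fn)

# Backpropagating through the tracked output fills x.grad.
y_relu.sum().backward()
print(x.grad)
```

If the output of your model (or of any intermediate step) has .grad_fn set to None, gradients will not flow back through that step.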

Mahdi_Amrollahi · July 24, 2022, 4:47am

Thanks, but I still have a problem.
Suppose we have a differentiable loss function of degree 3 (loss = x^3); here we cannot find the minimum of the function with gradient descent.
To elaborate, suppose we are tuning a deep NN model to find a better model. We try adding layers, changing hyperparameters, changing the activation function, adding or removing neurons, and so on, without caring about the final function we are supposed to arrive at. We do not know what the shape of the loss function will be. Is it convex, concave, or neither? So, how can we know that we can find the minimum of the loss function with gradient descent?
I appreciate it.
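To make the x^3 example concrete, here is a rough sketch of plain gradient descent on loss = x^3 in pure Python. The learning rate, the starting points, and the function name `descend_cubic` are arbitrary choices for illustration. From a positive start the iterate creeps toward the flat point at x = 0, which is not a minimum; from a negative start the loss keeps decreasing without bound.

```python
def descend_cubic(x0: float, lr: float = 0.1, max_steps: int = 200) -> float:
    """Run gradient descent on loss(x) = x**3 and return the final x."""
    x = x0
    for _ in range(max_steps):
        grad = 3.0 * x * x   # d/dx of x**3
        x = x - lr * grad
        if x < -1e3:         # stop early once the iterate is clearly diverging
            break
    return x


x_pos = descend_cubic(1.0)    # stalls near the inflection point at 0
x_neg = descend_cubic(-0.1)   # runs off toward minus infinity

print("start  1.0 -> x =", x_pos, "loss =", x_pos ** 3)
print("start -0.1 -> x =", x_neg, "loss =", x_neg ** 3)
```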

smth · July 24, 2022, 5:43am

You could apply the function to toy (or artificially constructed) datasets with known solutions, and test whether descent methods bring you to these solutions.
If the function, plus some effort in tuning hyperparameters, can find the known solutions on these carefully constructed datasets, you might get a better understanding of your function's curvature and of how feasible it is to apply gradient methods.
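As a rough sketch of this idea, assuming PyTorch: build a toy dataset from a known rule and check whether gradient descent recovers it. The rule y = 3x + 0.5, the single linear layer, the learning rate, and the step count are all arbitrary choices for illustration; in practice you would plug in your own model and a dataset whose solution you know.

```python
import torch

torch.manual_seed(0)

# Toy dataset with a known solution (values chosen arbitrarily).
true_w, true_b = 3.0, 0.5
X = torch.linspace(-1.0, 1.0, 50).unsqueeze(1)
y = true_w * X + true_b

# Stand-in for "your function": a single linear layer.
model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for step in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

learned_w = model.weight.item()
learned_b = model.bias.item()
print(f"learned w={learned_w:.4f}, b={learned_b:.4f}, final loss={loss.item():.2e}")
```

If the learned parameters land close to the known ones, that is some evidence that descent works for this setup; if they do not, even after tuning, the loss surface or the gradients deserve a closer look.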

Mahdi_Amrollahi · August 12, 2022, 9:57am

Most of the time our loss function is nonconvex, so it may have local minima. But this is usually not a problem when we are working in a high-dimensional space, because any one of the many parameters can provide a direction that gets us out of the local minimum.