This is a very vague question :). Without knowing what you’re really after, this would be my best shot:
The neural network classifier is a complicated function y = f(x, p) that maps an input x (here: an image) to an output y (here: a class label). The function is parameterized by a set of parameters p. For different values of p, the function f will do a better or worse job of mapping an image to its correct label. Training the classifier then means systematically searching for the best values of p, where “best” means that these values of p map most images in your training data to their correct class labels.
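To make that concrete, here is a rough sketch of the same idea on a toy problem. Everything in it is my own assumption for illustration: x is a single number instead of an image, p is just a pair (w, b), and the names `f`, `prob`, `accuracy`, and `train` are made up. A real network has far more parameters, but the loop has the same shape.

```python
import math

def f(x, p):
    """Toy classifier y = f(x, p): maps a scalar x to a class label 0 or 1."""
    w, b = p
    return 1 if w * x + b > 0 else 0

def prob(x, p):
    """Predicted probability of class 1 (logistic function of w*x + b)."""
    w, b = p
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def accuracy(data, p):
    """Fraction of (x, y) pairs that f maps to the correct label."""
    return sum(f(x, p) == y for x, y in data) / len(data)

def train(data, p, lr=0.5, epochs=200):
    """Gradient descent on the mean cross-entropy loss; returns the updated p."""
    w, b = p
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = prob(x, (w, b)) - y  # derivative of the loss w.r.t. w*x + b
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)

# Hypothetical training data: negative inputs are class 0, positive are class 1.
data = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (2.0, 1)]

p_initial = (-1.0, 0.0)  # deliberately bad starting parameters
p_trained = train(data, p_initial)

print("accuracy before training:", accuracy(data, p_initial))
print("accuracy after training: ", accuracy(data, p_trained))
```

The only thing that changes between the two `accuracy` calls is p; f itself stays the same. The loss function, learning rate, and number of epochs are just choices about how to search for a better p.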
But I assume that’s not what you’re asking, so you might need to clarify your question.
I was asking this in a general context. For example, if we train a deep learning model and want to briefly explain the training steps, do we need to describe our features, labels, number of epochs, and loss function? Should we also describe how the model was trained?