Sunday 10 November 2019

Why Softmax Cost Function?

First, let's consider other cost functions such as mean squared error. Suppose the actual output of a logistic unit is one billionth and the desired output is 1. The unit then sits on an almost flat part of the logistic curve, so there is almost no gradient to fix the error, even though the error is about as large as it can be. Second, when we are dealing with mutually exclusive classes, the probabilities should sum to 1, but the outputs of separate logistic units are not guaranteed to do so.
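Both problems can be checked numerically. The sketch below is only an illustration: it assumes the squared error is written as E = 0.5 * (y - t)^2 with y the logistic output, and the three logits are made-up values, not taken from the lecture.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real-valued logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def mse_grad_wrt_logit(y, t):
    """dE/dz for E = 0.5 * (y - t)**2 with y = sigmoid(z).

    Chain rule: dE/dy = (y - t) and dy/dz = y * (1 - y).
    """
    return (y - t) * y * (1.0 - y)


# Problem 1: output is one billionth, target is 1.
y, t = 1e-9, 1.0
grad = mse_grad_wrt_logit(y, t)
print("squared-error gradient w.r.t. the logit:", grad)

# Problem 2: one independent logistic unit per class (illustrative logits).
logits = [2.0, 1.0, 0.1]
sigmoid_outputs = [sigmoid(z) for z in logits]
sigmoid_total = sum(sigmoid_outputs)
print("sum of independent logistic outputs:", sigmoid_total)
```

The gradient has a magnitude of roughly one billionth although the output is almost completely wrong, and the three logistic outputs add up to more than 2, so they cannot be read as probabilities over mutually exclusive classes.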

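In short, to force the probabilities to sum to 1 we use the softmax activation function. Paired with the cross-entropy cost, it also takes care of the first problem: the gradient with respect to each logit is simply the output minus the target, so an output of one billionth with a target of 1 still gives a gradient of about -1.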
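A minimal sketch of softmax in plain Python shows the outputs summing to 1. The logits are again made-up values, and subtracting the maximum logit is a common numerical-stability trick assumed here rather than something the lecture prescribes.

```python
import math


def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    m = max(logits)  # shifting by the max avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # illustrative values only
probs = softmax(logits)
print("softmax outputs:", probs)
print("sum:", sum(probs))
```

Every output lies between 0 and 1, the largest logit gets the largest probability, and the outputs sum to 1 up to floating-point rounding.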

Reference: https://www.youtube.com/watch?v=PHP8beSz5o4