- Squashes output to range -1 to 1. ReLU often works better than tanh in hidden layers.
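As a quick illustration (a minimal NumPy sketch, not from the article itself), tanh squashes any input into the open interval (-1, 1) and saturates for large magnitudes:

```python
import numpy as np

def tanh(x):
    """Hyperbolic tangent: squashes inputs to the range (-1, 1)."""
    return np.tanh(x)

# Large positive inputs saturate near 1, large negative near -1,
# and zero maps to zero.
print(tanh(np.array([-10.0, 0.0, 10.0])))
```

That saturation is exactly why ReLU often trains faster in deep hidden layers: tanh's gradient vanishes at the extremes, while ReLU's does not for positive inputs.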

🤓 Softmax: f(x)_i = e^{x_i} / ∑_j e^{x_j}

- Used for multi-class classification where the outputs represent class probabilities.
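A minimal sketch of softmax in NumPy (my own example, not from the article). Subtracting the max logit before exponentiating is a standard numerical-stability trick and does not change the result:

```python
import numpy as np

def softmax(x):
    """Softmax: exponentiate each logit, then normalize so the
    outputs sum to 1 and can be read as class probabilities."""
    shifted = x - np.max(x)   # stability: avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)         # three probabilities, largest for the largest logit
print(probs.sum())   # sums to 1
```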

😎 Leaky ReLU: f(x) = max(αx, x), where α is a small positive value such as 0.01

- Mitigates the “dying ReLU” problem, where a unit stops learning because its inputs are always negative and ReLU passes zero gradient there.
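Here is a minimal NumPy sketch of Leaky ReLU (an assumption of mine, not code from the article); note how negative inputs keep a small slope instead of being clipped to zero:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: max(alpha*x, x) for a small positive alpha.

    Negative inputs are scaled by alpha rather than zeroed, so the
    unit still receives a (small) gradient when its input is negative.
    """
    return np.maximum(alpha * x, x)

print(leaky_relu(np.array([-100.0, -1.0, 0.0, 5.0])))
# negative inputs become -1.0 and -0.01; non-negatives pass through
```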

😎 ELU (Exponential Linear Unit): f(x) = x for x > 0, f(x) = α(e^x − 1) for x < 0

- Can perform slightly better than ReLU because it allows small negative outputs, pushing mean activations closer to zero.
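A minimal NumPy sketch of ELU (again my own illustration, not from the article). Negative inputs decay smoothly toward -α instead of being zeroed:

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha*(e^x - 1) for x <= 0.

    Unlike ReLU, negative inputs map to small negative values that
    saturate at -alpha, keeping mean activations closer to zero.
    """
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

print(elu(np.array([-5.0, -1.0, 0.0, 2.0])))
# negative inputs map into (-alpha, 0); positives pass through unchanged
```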

So in summary: ReLU (and its variants) is the usual default for hidden layers, sigmoid and tanh also appear there, and softmax is the standard choice in output layers for multi-class classification.

Join the conversation

Types of Activation Functions that I know of so far in life:
Sigmoid Activation Function: 1/(1+e^-x)
Tanh Activation Function: (e^x - e^-x)/(e^x + e^-x)
ReLU Activation Function: max(0, x)
Leaky ReLU Activation Function: max(0.1x, x)
Parametric ReLU Activation Function: max(ax, x)
Softmax Activation Function: e^{x_i} / ∑_j e^{x_j}

