FAQ
Why sigmoid as an activation function?
Generally, an activation function should:
- Be continuous. (Why? => no jumps => stable derivatives => a stable gradient descent algorithm.)
- Have an easy-to-compute analytical derivative. (For sigmoid, σ'(x) = σ(x)·(1 − σ(x)); see the sketch after this list.)
AND, for the particular problem:
- Have an appropriate output range, e.g. [-1, 1], all of R, or [0, 100].
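A minimal sketch of why sigmoid satisfies these properties, assuming NumPy (the function names here are illustrative, not from any particular library):

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^(-x)): continuous everywhere, output bounded in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # Analytical derivative: sigma'(x) = sigma(x) * (1 - sigma(x)).
    # Cheap to evaluate, since it reuses the forward-pass value sigma(x).
    s = sigmoid(x)
    return s * (1.0 - s)

xs = np.linspace(-6.0, 6.0, 5)
print(sigmoid(xs))             # all values stay inside (0, 1)
print(sigmoid_derivative(xs))  # smooth, no jumps => stable gradient steps
```

Note how the derivative is expressed entirely in terms of the forward value σ(x), which is one reason sigmoid was historically convenient for backpropagation.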