FAQ


Why sigmoid as an activation function?

Generally, an activation function should be:

  1. Continuous. (Why? => no jumps in the output => stable derivatives => stable gradient descent.)
  2. Have an easy-to-compute analytical derivative (see the sketch after this list).
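The sigmoid satisfies both points: it is smooth everywhere, and its derivative can be written in terms of its own output, sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), so backpropagation can reuse the value already computed in the forward pass. A minimal NumPy sketch (function names here are illustrative, not from any particular library):

```python
import numpy as np

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + exp(-x)); continuous and smooth for all x
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(s):
    # Analytical derivative expressed via the forward output s = sigmoid(x):
    # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))
    # => backprop needs only a multiply and a subtract, no extra exp().
    return s * (1.0 - s)

x = np.linspace(-5.0, 5.0, 11)
s = sigmoid(x)
print(sigmoid_derivative(s))  # peaks at 0.25 at x = 0, decays toward the tails
```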

AND, for the particular problem, it should:

  1. Have an appropriate output range, e.g. [-1, 1], all of R, or [0, 100] (see the sketch below).
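For instance, sigmoid outputs lie in (0, 1), tanh in (-1, 1), and the identity covers all of R; a bounded target range like [0, 100] can be reached by rescaling a bounded activation. A short NumPy sketch of the idea (variable names are illustrative):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 5)

out_sigmoid = 1.0 / (1.0 + np.exp(-x))  # range (0, 1), e.g. probabilities
out_tanh = np.tanh(x)                    # range (-1, 1), zero-centered outputs
out_linear = x                           # all of R, unbounded regression

# A target range like [0, 100] can be obtained by rescaling a sigmoid:
out_scaled = 100.0 * out_sigmoid         # range (0, 100)
print(out_scaled)
```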