+ 4

# Why does relu train faster than sigmoid?

2 Answers

+ 2

Efficiency: ReLu is faster to compute than the sigmoid function, and its derivative is faster to compute. This makes a significant difference to training and inference time for neural networks: only a constant factor, but constants can matter.

0

SITHU Nyein does it also have to do with the fact that relu has less noise (deactivates neurons below zero completely, unlike sigmoid)?