1. max(0, x) is linear for x > 0, so can it still introduce nonlinearity?
2. Why is leaky ReLU, as an improved version of ReLU, not more common than ReLU?
ReLU is linear on the interval x > 0 and linear on x <= 0, but it is not linear as a whole, because its graph is not a single straight line.
The composition of multiple linear operations is itself a linear operation. Without a nonlinear activation, the network has only a single hyperplane with which to divide the space. But ReLU is nonlinear; its effect is similar to dividing and folding the space, so stacking multiple (linear operation + ReLU) layers can partition the space in essentially arbitrary ways.
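A minimal sketch of the point above, in plain Python. The two one-dimensional "layers" and their weights are made-up values chosen only for illustration: stacking them without an activation collapses into one linear map, while putting ReLU between layers produces a folded shape (here, the absolute value) that no single straight line can match.

```python
def relu(x):
    # ReLU: max(0, x)
    return max(0.0, x)

# Two hypothetical 1-D linear layers of the form y = w * x + b
def layer1(x):
    return 2.0 * x + 1.0

def layer2(x):
    return -3.0 * x + 4.0

def stacked_linear(x):
    # No activation in between: -3 * (2x + 1) + 4 = -6x + 1
    return layer2(layer1(x))

def collapsed(x):
    # The single linear layer that the stack above is equivalent to
    return -6.0 * x + 1.0

def abs_via_relu(x):
    # Hidden layer with two units (weights +1 and -1), ReLU on each,
    # then a sum: this "folds" the line at 0 and yields |x|
    return relu(x) + relu(-x)

for x in [-2.0, -0.5, 0.0, 1.5, 3.0]:
    print(x, stacked_linear(x), collapsed(x), abs_via_relu(x))
```

The second and third columns always agree, which is the "only one hyperplane" case; the last column is a V shape, the simplest example of the folding effect.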
There are many improved versions of ReLU: leaky ReLU, PReLU, ELU, CReLU. Each has its own effect and performance characteristics, but none is as common as ReLU.
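For reference, here is a rough sketch of how these variants differ from plain ReLU, written as scalar Python functions. The default values (`slope=0.01`, `alpha=1.0`) are common choices assumed here for illustration, not something stated in this thread, and real frameworks apply these element-wise to tensors.

```python
import math

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    # Small fixed slope for x <= 0 instead of a flat zero
    return x if x > 0 else slope * x

def prelu(x, a):
    # Same shape as leaky ReLU, but the slope `a` is meant to be learned
    return x if x > 0 else a * x

def elu(x, alpha=1.0):
    # Smooth exponential curve for x <= 0, approaching -alpha
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def crelu(x):
    # Concatenated ReLU: keeps both relu(x) and relu(-x),
    # so the output has twice as many channels as the input
    return (relu(x), relu(-x))

for x in [-2.0, 0.5]:
    print(x, relu(x), leaky_relu(x), prelu(x, 0.25), elu(x), crelu(x))
```

All of them agree with ReLU for x > 0; they differ only in what they do with negative inputs.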
reference:
I agree with the answer above, and would add one more point:
On linear versus nonlinear: in mathematics, a function is linear when its graph is a straight line; every other function is nonlinear.
By that definition, a broken (piecewise-linear) line is not a linear function, and ReLU is exactly such a broken line, so it is nonlinear.
Linearity means that every variable appears only to the first power: for example, 3x + 5y is linear, while 3x^2 + 5y is not.
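One way to check this numerically is the additivity property f(a + b) = f(a) + f(b), which a straight line through the origin satisfies and ReLU does not. This is a small sketch under that assumption (the stricter textbook sense of "linear"); the helper name `is_additive` and the sample function `line` are made up for the example.

```python
def relu(x):
    return max(0.0, x)

def line(x):
    # A straight line through the origin: f(x) = 3x
    return 3.0 * x

def is_additive(f, a, b):
    # Checks f(a + b) == f(a) + f(b) for one pair of inputs
    return f(a + b) == f(a) + f(b)

# The straight line passes the check even when the inputs straddle 0
print(is_additive(line, 1.0, -1.0))

# ReLU passes when both inputs are on the same piece (both positive) ...
print(is_additive(relu, 1.0, 2.0))

# ... but fails across the bend: relu(0) = 0, while relu(1) + relu(-1) = 1
print(is_additive(relu, 1.0, -1.0))
```

The failure only shows up when the inputs lie on different sides of 0, which matches the picture of ReLU as two straight pieces joined at a bend.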