Calculating the Derivative of a Neural Network


  
Figure: Calculating the derivative of a neural network (a two-level network of sigmoid gates feeding a sigmoid output gate).

Let $z$ be the output of the neural network.
Let $w_{1},\ldots,w_{l}$ be the weights of the inputs from the second level to the output gate.
Let $y_{1},\ldots,y_{l}$ be the outputs of the gates in the second level.
Let $u_{i1},\ldots,u_{ik_{i}}$ be the weights of the inputs to gate $y_{i}$.
Let $x_{1},\ldots,x_{k}$ be the inputs to the neural network.
(See the figure above.)

\begin{displaymath}z = F(\overrightarrow{x}) = \sigma\Big(\sum_{i} w_{i}\,
\sigma(\overrightarrow{u_{i}} \cdot \overrightarrow{x_{i}})\Big)
\end{displaymath}
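As a rough illustration (not part of the original notes), the forward computation can be sketched numerically as follows, assuming for simplicity that every second-level gate sees the full input vector; the names sigmoid, forward, U (whose rows play the role of the $\overrightarrow{u_{i}}$) and w are hypothetical.

\begin{verbatim}
import numpy as np

def sigmoid(t):
    # logistic sigmoid 1 / (1 + e^{-t}), applied elementwise
    return 1.0 / (1.0 + np.exp(-t))

def forward(x, U, w):
    # x: input vector of length k
    # U: l-by-k matrix whose i-th row is u_i, the weights into gate y_i
    # w: length-l vector of weights into the output gate
    y = sigmoid(U @ x)   # y_i = sigma(u_i . x)
    z = sigmoid(w @ y)   # z = sigma(sum_i w_i y_i)
    return z, y
\end{verbatim}

For example, forward(x, U, w) with an input vector x of length k, an l-by-k matrix U and an l-vector w returns the network output $z$ together with the second-level outputs $y_{1},\ldots,y_{l}$.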


\begin{displaymath}\frac{\partial z}{\partial u_{ij}} = \frac{\partial z}{\partial
y_{i}} \cdot \frac{\partial y_{i}}{\partial u_{ij}}
\end{displaymath}

In this case:

\begin{displaymath}\sigma(x) = \frac{1}{1+e^{-x}}
\end{displaymath}


\begin{displaymath}\frac{d}{dx}\sigma(x) =
\frac{e^{-x}}{(1+e^{-x})^{2}} = e^{-x}\cdot\sigma^{2}(x)
\end{displaymath}
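Since $e^{-x}\cdot\sigma(x) = 1-\sigma(x)$, this is equivalent to the more familiar form (a standard identity, added here for completeness):

\begin{displaymath}\frac{d}{dx}\sigma(x) = \sigma(x)\cdot(1-\sigma(x))
\end{displaymath}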


\begin{displaymath}\frac{\partial z}{\partial y_{i}} =
\frac{\partial}{\partial y_{i}}\sigma\Big(\sum_{j} y_{j}w_{j}\Big) =
w_{i}\cdot e^{-\sum_{j} y_{j}w_{j}}\cdot\sigma^{2}\Big(\sum_{j} y_{j}w_{j}\Big)
\end{displaymath}
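Applying the same derivative formula to the second-level gate and combining the two factors via the chain rule (a step left implicit above), and writing $s_{i}=\overrightarrow{u_{i}}\cdot\overrightarrow{x_{i}}$ for the input to gate $y_{i}$, we get

\begin{displaymath}\frac{\partial y_{i}}{\partial u_{ij}} =
x_{j}\cdot e^{-s_{i}}\cdot\sigma^{2}(s_{i}),
\qquad
\frac{\partial z}{\partial u_{ij}} =
w_{i}\cdot e^{-\sum_{j'} y_{j'}w_{j'}}\cdot
\sigma^{2}\Big(\sum_{j'} y_{j'}w_{j'}\Big)\cdot
x_{j}\cdot e^{-s_{i}}\cdot\sigma^{2}(s_{i})
\end{displaymath}

The following self-contained sketch (again with hypothetical names, not taken from the notes) computes this gradient and checks it against a central-difference estimate, assuming every gate sees the full input vector:

\begin{verbatim}
import numpy as np

def sigmoid(t):
    # same helper as in the earlier sketch
    return 1.0 / (1.0 + np.exp(-t))

def grad_u(x, U, w):
    # analytic dz/du_ij for z = sigma(sum_i w_i * sigma(u_i . x))
    a = U @ x                                    # a_i = u_i . x
    y = sigmoid(a)                               # second-level outputs
    s = w @ y                                    # input to the output gate
    dz_dy = w * np.exp(-s) * sigmoid(s) ** 2     # dz/dy_i
    dy_da = np.exp(-a) * sigmoid(a) ** 2         # dy_i/da_i
    return (dz_dy * dy_da)[:, None] * x[None, :] # entry (i, j) = dz/du_ij

# finite-difference check on random data
rng = np.random.default_rng(0)
x, U, w = rng.normal(size=4), rng.normal(size=(3, 4)), rng.normal(size=3)
f = lambda U_: sigmoid(w @ sigmoid(U_ @ x))
eps, num = 1e-6, np.zeros_like(U)
for i in range(U.shape[0]):
    for j in range(U.shape[1]):
        E = np.zeros_like(U)
        E[i, j] = eps
        num[i, j] = (f(U + E) - f(U - E)) / (2 * eps)
print(np.allclose(grad_u(x, U, w), num, atol=1e-8))  # should print True
\end{verbatim}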


