Calculating the Derivative of a Neural Network


  
Figure: Calculating the derivative of a neural network (a two-level network of sigmoid gates feeding a sigmoid output gate).

Let $z$ be the output of the neural network.
Let $w_{1},\ldots,w_{l}$ be the weights of the inputs from the second level to the output gate.
Let $y_{1},\ldots,y_{l}$ be the outputs of the gates in the second level.
Let $u_{i1},\ldots,u_{ik_{i}}$ be the weights of the inputs to gate $y_{i}$.
Let $x_{1},\ldots,x_{k}$ be the inputs to the neural network.
(See the figure above.)

\begin{displaymath}z = F(\overrightarrow{x}) = \sigma\Big(\sum_{i} w_{i}\,
\sigma(\overrightarrow{u_{i}} \cdot \overrightarrow{x_{i}})\Big)
\end{displaymath}
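As a rough illustration (not part of the original notes), the forward computation can be sketched numerically as follows, assuming for simplicity that every second-level gate sees the full input vector; the names sigmoid, forward, U (whose rows play the role of the $\overrightarrow{u_{i}}$) and w are hypothetical.

\begin{verbatim}
import numpy as np

def sigmoid(t):
    # logistic sigmoid 1 / (1 + e^{-t}), applied elementwise
    return 1.0 / (1.0 + np.exp(-t))

def forward(x, U, w):
    # x: input vector of length k
    # U: l-by-k matrix whose i-th row is u_i, the weights into gate y_i
    # w: length-l vector of weights into the output gate
    y = sigmoid(U @ x)   # y_i = sigma(u_i . x)
    z = sigmoid(w @ y)   # z = sigma(sum_i w_i y_i)
    return z, y
\end{verbatim}

For example, forward(x, U, w) with an input vector x of length k, an l-by-k matrix U and an l-vector w returns the network output $z$ together with the second-level outputs $y_{1},\ldots,y_{l}$.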


\begin{displaymath}\frac{\partial z}{\partial u_{ij}} = \frac{\partial z}{\partial
y_{i}} \cdot \frac{\partial y_{i}}{\partial u_{ij}}
\end{displaymath}

In this case:

\begin{displaymath}\sigma(x) = \frac{1}{1+e^{-x}}
\end{displaymath}


\begin{displaymath}\frac{d}{dx}\sigma(x) =
\frac{e^{-x}}{(1+e^{-x})^{2}} = e^{-x}\cdot\sigma^{2}(x)
\end{displaymath}
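Since $e^{-x}\cdot\sigma(x) = 1-\sigma(x)$, this is equivalent to the more familiar form (a standard identity, added here for completeness):

\begin{displaymath}\frac{d}{dx}\sigma(x) = \sigma(x)\cdot(1-\sigma(x))
\end{displaymath}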


\begin{displaymath}\frac{\partial z}{\partial y_{i}} =
\frac{\partial}{\partial y_{i}}\sigma\Big(\sum_{j} y_{j}w_{j}\Big) =
w_{i}\cdot e^{-\sum_{j} y_{j}w_{j}}\cdot\sigma^{2}\Big(\sum_{j} y_{j}w_{j}\Big)
\end{displaymath}
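Applying the same derivative formula to the second-level gate and combining the two factors via the chain rule (a step left implicit above), and writing $s_{i}=\overrightarrow{u_{i}}\cdot\overrightarrow{x_{i}}$ for the input to gate $y_{i}$, we get

\begin{displaymath}\frac{\partial y_{i}}{\partial u_{ij}} =
x_{j}\cdot e^{-s_{i}}\cdot\sigma^{2}(s_{i}),
\qquad
\frac{\partial z}{\partial u_{ij}} =
w_{i}\cdot e^{-\sum_{j'} y_{j'}w_{j'}}\cdot
\sigma^{2}\Big(\sum_{j'} y_{j'}w_{j'}\Big)\cdot
x_{j}\cdot e^{-s_{i}}\cdot\sigma^{2}(s_{i})
\end{displaymath}

The following self-contained sketch (again with hypothetical names, not taken from the notes) computes this gradient and checks it against a central-difference estimate, assuming every gate sees the full input vector:

\begin{verbatim}
import numpy as np

def sigmoid(t):
    # same helper as in the earlier sketch
    return 1.0 / (1.0 + np.exp(-t))

def grad_u(x, U, w):
    # analytic dz/du_ij for z = sigma(sum_i w_i * sigma(u_i . x))
    a = U @ x                                    # a_i = u_i . x
    y = sigmoid(a)                               # second-level outputs
    s = w @ y                                    # input to the output gate
    dz_dy = w * np.exp(-s) * sigmoid(s) ** 2     # dz/dy_i
    dy_da = np.exp(-a) * sigmoid(a) ** 2         # dy_i/da_i
    return (dz_dy * dy_da)[:, None] * x[None, :] # entry (i, j) = dz/du_ij

# finite-difference check on random data
rng = np.random.default_rng(0)
x, U, w = rng.normal(size=4), rng.normal(size=(3, 4)), rng.normal(size=3)
f = lambda U_: sigmoid(w @ sigmoid(U_ @ x))
eps, num = 1e-6, np.zeros_like(U)
for i in range(U.shape[0]):
    for j in range(U.shape[1]):
        E = np.zeros_like(U)
        E[i, j] = eps
        num[i, j] = (f(U + E) - f(U - E)) / (2 * eps)
print(np.allclose(grad_u(x, U, w), num, atol=1e-8))  # should print True
\end{verbatim}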


