Q-learning
Q-learning and SARSA algorithms
In this section we discuss off-line and on-line algorithms for computing
the optimal policy when the exact model is not known.
Q-learning
SARSA
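As a rough illustration of how the two algorithms listed above differ, here is a minimal sketch of the standard tabular update rules. It is not code from these notes: the function names, the dictionary-based table Q, and the step size alpha and discount factor gamma are all assumptions made for the example.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Q-learning step (hypothetical helper): bootstrap from the best action in s_next."""
    # Target uses the maximum estimated value over all actions in the next state,
    # regardless of which action the agent will actually take there.
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """SARSA step (hypothetical helper): bootstrap from the action actually chosen in s_next."""
    # Target uses the value of the next state-action pair that the agent follows.
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Assumed toy table: unseen (state, action) pairs default to 0.0.
Q = defaultdict(float)
Q[(1, "x")] = 1.0
Q[(1, "y")] = 2.0
q_learning_update(Q, 0, "x", 1.0, 1, ["x", "y"], alpha=0.5, gamma=0.5)
```

Both rules need only an observed transition (state, action, reward, next state), which is why neither requires the exact model. The only difference in this sketch is the target: the maximum over next actions for Q-learning, the value of the action actually taken next for SARSA.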
Yishay Mansour
2000-01-07