Linear Function Approximation policy evaluation
Monte Carlo: converges to minimal MSE
$\| Q'_{MC} - Q^{\pi} \|_{\pi} = \min_{w} \| Q'_{w} - Q^{\pi} \|_{\pi} \equiv \varepsilon$
TD(0): converges close to the minimal MSE
$\| Q'_{TD} - Q^{\pi} \|_{\pi} \le \varepsilon / (1 - \gamma)$ [TV]
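The two results above can be checked numerically on a small example. The sketch below is illustrative only: the random chain, rewards, and features are assumptions, not taken from the slide, and each state of the chain stands in for a state-action pair under the fixed policy. The Monte Carlo limit is computed as the weighted least-squares projection of the true values, and the TD(0) limit as the solution of the projected Bellman equation, both weighted by the stationary distribution.

```python
import numpy as np

def weighted_norm(x, d):
    """Norm of x weighted by the distribution d."""
    return float(np.sqrt(np.sum(d * x * x)))

def compare_mc_td(n_states=6, n_features=2, gamma=0.9, seed=0):
    # Hypothetical problem instance (all values are illustrative assumptions).
    rng = np.random.default_rng(seed)
    P = rng.random((n_states, n_states)) + 0.1   # strictly positive, so ergodic
    P /= P.sum(axis=1, keepdims=True)            # make each row a distribution
    r = rng.random(n_states)                     # expected one-step rewards
    Phi = rng.standard_normal((n_states, n_features))  # linear features

    # True values under the fixed policy: v = (I - gamma P)^{-1} r.
    v = np.linalg.solve(np.eye(n_states) - gamma * P, r)

    # Stationary distribution: solve d^T P = d^T together with sum(d) = 1.
    A = np.vstack([P.T - np.eye(n_states), np.ones((1, n_states))])
    b = np.append(np.zeros(n_states), 1.0)
    d = np.linalg.lstsq(A, b, rcond=None)[0]
    D = np.diag(d)

    # Monte Carlo limit: weighted least-squares fit to the true values.
    w_mc = np.linalg.solve(Phi.T @ D @ Phi, Phi.T @ D @ v)

    # TD(0) limit: fixed point of the projected Bellman equation.
    w_td = np.linalg.solve(Phi.T @ D @ (Phi - gamma * P @ Phi), Phi.T @ D @ r)

    eps = weighted_norm(Phi @ w_mc - v, d)       # minimal weighted error
    err_td = weighted_norm(Phi @ w_td - v, d)    # error at the TD(0) solution
    return eps, err_td

eps, err_td = compare_mc_td()
print(f"MC error (eps): {eps:.4f}")
print(f"TD(0) error:    {err_td:.4f}  (bound: {eps / (1 - 0.9):.4f})")
```

On any such ergodic chain the TD(0) error should fall between the minimal error and that error divided by (1 - gamma); how close it sits to either end depends on the chain and the features.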
DP: may diverge
Counterexamples exist [B, TV]
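A minimal sketch of how such divergence can arise, using a well-known two-state construction. Whether it matches the exact examples in the cited works is an assumption, and the function name and default parameters are hypothetical. Both states transition to state 2 with zero reward, the single feature is 1 in state 1 and 2 in state 2, and each DP sweep refits the weight by unweighted least squares to the backed-up values. The weight is then multiplied by 6γ/5 per sweep, so it grows without bound whenever γ > 5/6, even though the true values are all zero.

```python
import numpy as np

def fitted_dp_weights(gamma=0.95, w0=1.0, n_iters=20):
    """Approximate DP sweeps with an unweighted least-squares fit."""
    Phi = np.array([[1.0], [2.0]])           # feature of state 1 and state 2
    P = np.array([[0.0, 1.0], [0.0, 1.0]])   # both states move to state 2
    r = np.zeros(2)                          # all rewards are zero
    w = np.array([w0])
    history = [float(w[0])]
    for _ in range(n_iters):
        # Bellman backup applied to the current approximation Phi @ w.
        target = r + gamma * P @ (Phi @ w)
        # Refit the weight to the backed-up values, all states weighted equally.
        w = np.linalg.lstsq(Phi, target, rcond=None)[0]
        history.append(float(w[0]))
    return history

diverging = fitted_dp_weights(gamma=0.95)
shrinking = fitted_dp_weights(gamma=0.80)
print("gamma=0.95, final weight:", diverging[-1])
print("gamma=0.80, final weight:", shrinking[-1])
```

The equal weighting of the two states is what breaks the contraction here; it differs from the stationary-distribution weighting under which the TD(0) bound holds.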