
Importance Sampling

We have two sources, D1(x) and D2(x), that produce different distributions over the same set of values x. We want to compute the expectation of a function F(x) under one source while sampling from the other. The expectation of F(x) with respect to a distribution D is the sum, over all values x, of F(x) multiplied by the probability D(x) that D assigns to that value. In our case:

$E_{D_2}[F(x)] = \sum_{x} D_2(x) F(x) = \sum_{x} D_1(x) \frac{D_2(x)}{D_1(x)} F(x) = E_{D_1}\left[\frac{D_2(x)}{D_1(x)} F(x)\right]$
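The identity above can be checked numerically. The following is a minimal sketch, not the notes' own procedure: it assumes discrete distributions stored as Python dictionaries, and it assumes D1(x) > 0 wherever D2(x)F(x) is nonzero so that the ratio is well defined. The names `exact_expectation`, `importance_sampling_estimate`, `d1`, `d2` and `f` are hypothetical and chosen only for illustration.

```python
import random


def exact_expectation(dist, f):
    """Return the sum over x of dist[x] * f(x)."""
    return sum(p * f(x) for x, p in dist.items())


def importance_sampling_estimate(d1, d2, f, num_samples, rng):
    """Estimate E_{D2}[F(x)] using samples drawn from D1 only.

    Each sample x contributes (D2(x) / D1(x)) * F(x); the estimate is
    the average of these weighted values.
    Assumes d1[x] > 0 for every x that can be sampled.
    """
    values = list(d1.keys())
    weights = [d1[x] for x in values]
    samples = rng.choices(values, weights=weights, k=num_samples)
    total = 0.0
    for x in samples:
        ratio = d2.get(x, 0.0) / d1[x]  # importance weight D2(x)/D1(x)
        total += ratio * f(x)
    return total / num_samples


# Illustrative (made-up) distributions over the values {0, 1, 2}.
d1 = {0: 0.5, 1: 0.3, 2: 0.2}
d2 = {0: 0.2, 1: 0.3, 2: 0.5}


def f(x):
    return x * x


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print("exact E_D2[F]:", exact_expectation(d2, f))
    print("estimate from D1 samples:",
          importance_sampling_estimate(d1, d2, f, 200_000, rng))
```

With enough samples the estimate should land close to the exact value of the expectation under D2, even though no sample is ever drawn from D2. The spread of the estimate depends on how large the ratios D2(x)/D1(x) become.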

Input: Computation:

Yishay Mansour