Abstract: We consider the continuous-time temporal difference (TD) learning dynamics with nonlinear value function approximation, whose convergence properties are only poorly understood in ...