Abstract: We consider the continuous-time temporal difference (TD) learning dynamics with nonlinear value function approximation, where the convergence properties remain poorly understood in ...