-
Off-policy Learning with Eligibility Traces: A Survey
In the framework of Markov Decision Processes, off-policy learning, that is, the problem of learning a linear approximation of the value function of some fixed policy...
-
Learning near-optimal policies with Bellman-residual minimization based fitte...
International audience
