Fact — claim — Knowledge Tree

Off-policy evaluation (OPE) in reinforcement learning is the problem of estimating the expected long-term payoff of a target policy using experiences generated by a different, potentially unknown, behavior policy.

Authors

Person: Samuel Tesfazgi, Leonhard Sprandl, Sandra Hirche Organization: AISTATS
Track: Poster Session 3 - aistats 2026

Sources

Track: Poster Session 3 - aistats 2026 virtual.aistats.org Samuel Tesfazgi, Leonhard Sprandl, Sandra Hirche · AISTATS via serper

Referenced by nodes (1)

reinforcement learning concept