Relations (1)
related 1.00 — strongly supporting 1 fact
The concepts are related through RL4HS, a framework that applies reinforcement learning to span-level hallucination detection [1].
Facts (1)
Sources
EdinburghNLP/awesome-hallucination-detection - GitHub github.com 1 fact
reference: RL4HS is a reinforcement-learning framework for span-level hallucination detection that couples chain-of-thought reasoning with span-level rewards. It uses Group Relative Policy Optimization (GRPO) and Class-Aware Policy Optimization (CAPO) to address the reward imbalance between hallucinated and non-hallucinated spans.
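
The fact above names three moving parts: a span-level reward, group-relative advantages, and a class-aware correction. The following is a minimal Python sketch of how they might fit together, not the RL4HS implementation. Several pieces are assumptions that the source does not confirm: the reward is taken to be a character-level F1 between predicted and gold spans, the GRPO step is reduced to normalizing rewards within one sampled group, and the class-aware step is modeled as scaling the advantage of "no hallucination" predictions by a hypothetical factor `alpha`. All function names are illustrative.

```python
from statistics import mean, pstdev


def span_f1(pred, gold):
    """Character-level F1 between two lists of (start, end) spans, end exclusive.

    Assumption: the source only says "span-level rewards"; F1 is one plausible choice.
    """
    pred_chars = {i for s, e in pred for i in range(s, e)}
    gold_chars = {i for s, e in gold for i in range(s, e)}
    if not pred_chars and not gold_chars:
        return 1.0  # correctly predicting that nothing is hallucinated
    overlap = len(pred_chars & gold_chars)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(gold_chars)
    return 2 * precision * recall / (precision + recall)


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def class_aware_advantages(rewards, predicted_empty, alpha=0.5):
    """Hypothetical class-aware correction.

    Assumption: advantages of samples that predict no hallucinated span are
    scaled by alpha so that the easier class does not dominate the update.
    """
    advantages = group_relative_advantages(rewards)
    return [a * alpha if empty else a for a, empty in zip(advantages, predicted_empty)]


# Usage: one group of four sampled outputs for the same input.
gold = [(10, 20)]
group = [[(10, 20)], [(10, 15)], [], [(30, 40)]]
rewards = [span_f1(pred, gold) for pred in group]
advantages = class_aware_advantages(rewards, [len(pred) == 0 for pred in group])
```

In this sketch the third sample predicts no span at all, so its (negative) advantage is halved, while the other three keep their group-normalized values. Whether the actual method scales advantages in this way, and with what factor, should be checked against the RL4HS paper.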