claim
Incorporating domain-specific knowledge and adding a 'not sure' option to the answer categories improves precision and F1 scores by up to 38% relative to baselines on the MedHallu benchmark.
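To make the metrics in this claim concrete, the sketch below computes precision and F1 for a hallucination detector that is allowed to answer 'not sure'. It assumes abstained examples are left out of scoring and that coverage is reported separately; that scoring rule, the function name `precision_f1`, the label strings, and the toy data are all illustrative assumptions, not the benchmark's own evaluation code.

```python
def precision_f1(labels, preds, positive="hallucinated", abstain="not sure"):
    """Precision, F1, and coverage for the positive class.

    Assumption: predictions equal to `abstain` are skipped when scoring.
    """
    # Keep only the examples where the model committed to an answer.
    answered = [(y, p) for y, p in zip(labels, preds) if p != abstain]

    tp = sum(1 for y, p in answered if p == positive and y == positive)
    fp = sum(1 for y, p in answered if p == positive and y != positive)
    fn = sum(1 for y, p in answered if p != positive and y == positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Fraction of examples the model actually answered.
    coverage = len(answered) / len(labels) if labels else 0.0
    return precision, f1, coverage


# Toy data, made up for illustration only.
labels = ["hallucinated", "faithful", "hallucinated", "faithful", "hallucinated"]
preds = ["hallucinated", "not sure", "faithful", "hallucinated", "not sure"]

precision, f1, coverage = precision_f1(labels, preds)
print(f"precision={precision:.2f} f1={f1:.2f} coverage={coverage:.2f}")
```

Reporting coverage next to precision and F1 keeps the trade-off visible: a model can raise precision by abstaining on hard cases, but it then answers fewer questions.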
Authors
Sources
- [2502.14302] MedHallu: A Comprehensive Benchmark for Detecting ... (arxiv.org)
Referenced by nodes (3)
- Precision concept
- F1 score concept
- Domain-Specific Knowledge concept