Large Language Models ↔ BLEU

Relations (1)

related 0.30 — supporting 3 facts

Large Language Models and BLEU are related in two ways. BLEU is a metric used to evaluate the quality of text generated by large language models by comparing it to reference texts [1]. At the same time, automatic metrics like BLEU are criticized for failing to capture factual consistency in large language model output [2][3].

Facts (3)

Sources

Survey and analysis of hallucinations in large language models (frontiersin.org, Frontiers; 2 facts)

Claim: Traditional automatic metrics like BLEU, ROUGE, and METEOR are inadequate for assessing factual consistency in large language models, according to Maynez et al. (2020).

Claim: Automatic metrics such as BLEU or ROUGE fail to capture factual consistency and reliability in Large Language Models, according to Maynez et al. (2020).
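A small illustration of why n-gram overlap can miss factual errors, offered as a sketch rather than as the method used in the cited work. The helper `ngram_precision` and the three example sentences are hypothetical, and tokenization is assumed to be plain whitespace splitting. The factually wrong sentence differs from the reference by one token, so it keeps most of its overlap, while a correct paraphrase loses overlap because it is worded differently.

```python
from collections import Counter


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference.

    Hypothetical helper for illustration; tokens are whitespace-separated.
    """
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0


# Illustrative sentences (assumptions, not taken from the cited sources).
reference = "the eiffel tower is located in paris"
wrong = "the eiffel tower is located in rome"    # factually incorrect
paraphrase = "paris is home to the eiffel tower"  # factually correct

for name, sentence in [("wrong", wrong), ("paraphrase", paraphrase)]:
    p1 = ngram_precision(sentence, reference, 1)
    p2 = ngram_precision(sentence, reference, 2)
    print(f"{name}: unigram precision {p1:.3f}, bigram precision {p2:.3f}")
```

In this toy case the incorrect sentence scores 6/7 on unigrams and 5/6 on bigrams, while the correct paraphrase scores 5/7 and 2/6, which is the kind of mismatch the claims above describe.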

A survey on augmenting knowledge graphs (KGs) with large ... (link.springer.com, Springer; 1 fact)

Formula: BLEU (Bilingual Evaluation Understudy) is a metric used to evaluate text quality in large language models integrated with knowledge graphs by comparing generated text to human-written reference texts. It is calculated as BLEU = BP * exp(sum_{n=1}^{N} w_n * log(p_n)), where BP is the brevity penalty, w_n is the weight for n-grams of order n, and p_n is the n-gram precision of order n.
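The formula can be sketched in a few lines of Python. This is a minimal sentence-level version under stated assumptions, not a reference implementation: a single reference, whitespace tokenization, uniform weights w_n = 1/N with N = 4, clipped n-gram counts for p_n, no smoothing, and the commonly used brevity penalty BP = 1 when the candidate is longer than the reference and exp(1 - r/c) otherwise (r and c being reference and candidate lengths). The names `ngrams` and `bleu` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU = BP * exp(sum_n w_n * log(p_n)), single reference.

    Assumptions: whitespace tokens, uniform weights, no smoothing.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    weight = 1.0 / max_n  # uniform w_n
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: credit each n-gram at most as often as the reference has it.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0:
            # log(0) is undefined; without smoothing the score collapses to 0.
            return 0.0
        log_sum += weight * math.log(overlap / total)

    # Brevity penalty: penalize candidates that are not longer than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_sum)


print(bleu("the cat sat on the mat", "the cat sat on the mat"))
print(bleu("the cat sat on the", "the cat sat on the mat"))
```

The second call shows the brevity penalty at work: every n-gram in the shortened candidate appears in the reference, so all p_n equal 1 and the score is reduced only by BP = exp(1 - 6/5). Production evaluations usually rely on an established library and corpus-level statistics instead of a hand-rolled function like this one.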