Relations (1)

related 2.58 — strongly supporting 5 facts

Question answering and summarization are both core natural language processing tasks frequently evaluated together in benchmarks like RAGTruth [1] and HaluEval [2], supported by the same encoder-decoder architectures [3], and performed by large language models like GPT-3 [4] and other modern LLMs [5].

Facts (5)

Sources
A survey on augmenting knowledge graphs (KGs) with large ... (link.springer.com, Springer) · 2 facts
reference: Encoder-decoder architectures, such as T5 or BART (Bidirectional and Auto-Regressive Transformers), use an encoder to create a context-rich representation of the input sequence, which the decoder then uses to generate an output sequence, making them flexible for tasks like translation, summarization, and question answering.
measurement: OpenAI's GPT-3 model contains 175 billion parameters and is known for high-quality text generation, translation, question answering, and summarization.
The Hallucinations Leaderboard, an Open Effort to Measure ... (huggingface.co, Hugging Face) · 1 fact
measurement: HaluEval includes 5,000 general user queries with ChatGPT responses and 30,000 task-specific examples across three tasks: question answering (HaluEval QA), knowledge-grounded dialogue (HaluEval Dialogue), and summarisation (HaluEval Summarisation).
Detecting hallucinations with LLM-as-a-judge: Prompt ... (datadoghq.com, Aritra Biswas, Noé Vernier · Datadog) · 1 fact
reference: RAGTruth is a human-labeled benchmark for hallucination detection that covers three tasks: question answering, summarization, and data-to-text writing.
Combining Knowledge Graphs and Large Language Models (arxiv.org, arXiv) · 1 fact
claim: Current large language models have a wide range of applications, including question answering, code generation, text recognition, summarization, translation, and prediction.
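The encoder-decoder fact above describes a two-stage data flow: the encoder compresses the input into a context representation, and the decoder generates the output sequence from that context. A toy pure-Python sketch of that flow (token frequency counts stand in for the learned contextual vectors a model like T5 or BART would actually produce; all function names are illustrative, not any library's API):

```python
from collections import Counter

def encode(tokens):
    """Toy 'encoder': compress the input into a context representation.
    Here the context is just token frequencies; a real encoder
    (e.g. T5 or BART) produces contextual vectors per token."""
    return Counter(tokens)

def decode(context, length):
    """Toy 'decoder': produce an output sequence conditioned only on
    the context. Here we emit the most frequent tokens as a naive
    'summary'; a real decoder generates tokens autoregressively."""
    return [tok for tok, _ in context.most_common(length)]

def summarize(text, length=3):
    """Chain the two stages, mirroring the encoder-then-decoder flow."""
    tokens = text.lower().split()
    return " ".join(decode(encode(tokens), length))

# e.g. summarize("the cat sat on the mat the cat", 2) -> "the cat"
```

Because the decoder sees only the context, not the raw input, the same encoder output can feed decoders for different tasks, which is the flexibility across translation, summarization, and question answering that the fact notes.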