concept

Amazon Bedrock Agents

Facts (14)

Sources
Reducing hallucinations in large language models with custom ... aws.amazon.com Amazon Web Services Nov 26, 2024 14 facts
procedure: The cleanup process for the Amazon Bedrock Agents hallucination detection infrastructure follows this specific order: disable the action group, delete the action group, delete the alias, delete the agent, delete the Lambda function, empty the S3 bucket, delete the S3 bucket, delete AWS Identity and Access Management (IAM) roles and policies, delete the vector database collection policies, and delete the knowledge bases.
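
The cleanup order above can be encoded so that steps cannot run out of sequence. This is a minimal sketch, not the source's teardown script: the step names and the `run_cleanup` helper are hypothetical, and each callable is assumed to wrap whatever console action or SDK call actually performs that step.

```python
from typing import Callable, Dict, List

# Step names mirror the order stated in the text. The names themselves are
# illustrative and do not correspond to any AWS API.
CLEANUP_ORDER: List[str] = [
    "disable_action_group",
    "delete_action_group",
    "delete_alias",
    "delete_agent",
    "delete_lambda_function",
    "empty_s3_bucket",
    "delete_s3_bucket",
    "delete_iam_roles_and_policies",
    "delete_vector_db_collection_policies",
    "delete_knowledge_bases",
]


def run_cleanup(steps: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every cleanup step in the documented order.

    Raises KeyError before doing anything if a step is missing, and lets any
    exception from a step abort the remaining steps.
    """
    missing = [name for name in CLEANUP_ORDER if name not in steps]
    if missing:
        raise KeyError(f"missing cleanup steps: {missing}")
    completed: List[str] = []
    for name in CLEANUP_ORDER:
        steps[name]()
        completed.append(name)
    return completed
```

Checking for missing steps up front avoids a half-finished teardown, for example deleting the agent and then discovering there is no step to empty the bucket.
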
procedure: To customize the RAGAS metrics used for hallucination detection in the Amazon Bedrock Agents implementation, modify the measure_hallucination() method within the lambda_hallucination_detection() Lambda function.
measurement: In an example scenario using Amazon Bedrock Agents, a generated answer received a hallucination score of 0.4, which triggered an SNS notification because it was below the custom hallucination threshold of 0.9.
claim: The combination of Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, and RAGAS evaluation metrics allows for the construction of a custom hallucination detector that remediates hallucinations using human-in-the-loop processes.
claim: Using Amazon Bedrock Agents can increase overall latency compared to using Amazon Bedrock Guardrails and Amazon Bedrock Prompt Flows. Amazon Bedrock Agents generates the workflow orchestration in real time using available knowledge bases, tools, and APIs, whereas prompt flows and guardrails require offline design and orchestration.
reference: Amazon Bedrock Agents is a service for creating agents that orchestrate workflows.
claim: The custom hallucination detector implemented with Amazon Bedrock Agents uses two RAGAS metrics, "answer correctness" and "answer relevancy", to determine the hallucination score that is checked against the custom threshold for triggering human intervention.
claim: The hallucination detection Lambda function implemented in the Amazon Bedrock Agents workflow is modular, allowing developers to swap the RAGAS evaluation framework for other frameworks.
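
One way to picture the modularity described above is a small scoring interface that the Lambda code depends on, so the evaluation framework behind it can be replaced. This is a sketch under stated assumptions: only the name `measure_hallucination` comes from the source, and its signature here is assumed. `HallucinationScorer`, `MeanOfMetricsScorer`, and the choice to average metrics are hypothetical. The metric callables stand in for RAGAS "answer correctness" and "answer relevancy"; no RAGAS API is called.

```python
from statistics import mean
from typing import Callable, Protocol, Sequence

# A metric maps (question, answer) to a value in [0, 1].
Metric = Callable[[str, str], float]


class HallucinationScorer(Protocol):
    """Anything that can score an answer; the swap point for other frameworks."""

    def score(self, question: str, answer: str) -> float: ...


class MeanOfMetricsScorer:
    """Combines several metrics by simple averaging (an assumed combination rule)."""

    def __init__(self, metrics: Sequence[Metric]) -> None:
        if not metrics:
            raise ValueError("at least one metric is required")
        self._metrics = list(metrics)

    def score(self, question: str, answer: str) -> float:
        return mean(m(question, answer) for m in self._metrics)


def measure_hallucination(
    scorer: HallucinationScorer, question: str, answer: str
) -> float:
    """Return the scorer's value clamped to [0, 1]."""
    return max(0.0, min(1.0, scorer.score(question, answer)))
```

Swapping frameworks then means supplying a different object with a `score` method, with no change to the code that compares the score against the threshold.
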
procedure: In the Amazon Bedrock Agents hallucination detection workflow, if an agent's response fails to meet the custom hallucination score threshold, the system starts a human-in-the-loop process: it sends Amazon Simple Notification Service (SNS) notifications to customer service representative queues, or to Amazon Simple Queue Service (SQS) queues, for email and text alerts.
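
The notification step might look like the following sketch. It uses the boto3 SNS `publish` call with `TopicArn`, `Subject`, and `Message`. The helper names `build_alert` and `notify_reviewers`, the message layout, and the topic ARN in the usage note are assumptions, not taken from the source.

```python
import json
from typing import Any, Dict, Optional


def build_alert(
    question: str, answer: str, score: float, threshold: float
) -> Dict[str, str]:
    """Build the Subject and Message for a review request (layout is illustrative)."""
    return {
        "Subject": "Possible hallucination: human review requested",
        "Message": json.dumps(
            {
                "question": question,
                "answer": answer,
                "score": score,
                "threshold": threshold,
            }
        ),
    }


def notify_reviewers(
    topic_arn: str, alert: Dict[str, str], sns_client: Optional[Any] = None
) -> Dict[str, Any]:
    """Publish the alert to an SNS topic and return the publish response."""
    if sns_client is None:
        # Imported lazily so the helper can be exercised without AWS access.
        import boto3

        sns_client = boto3.client("sns")
    return sns_client.publish(TopicArn=topic_arn, **alert)
```

A call such as `notify_reviewers("arn:aws:sns:us-east-1:123456789012:example-topic", build_alert(q, a, 0.4, 0.9))` would publish one message; the ARN shown is a placeholder. Email, text, or SQS delivery is then a matter of how the topic's subscriptions are configured.
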
claim: The Amazon Bedrock Agents implementation for hallucination reduction incurs no separate charges for building resources using Amazon Bedrock Knowledge Bases or Amazon Bedrock Agents, but users are charged for embedding model and text model invocations on Amazon Bedrock, as well as for Amazon S3 and vector database usage.
procedure: The Amazon Bedrock Agents hallucination detection workflow proceeds as follows: (1) Relevant answer chunks are retrieved from the knowledge base. (2) A knowledge base response is generated from the retrieved chunks. (3) The user query and knowledge base response are passed to a Lambda function. (4) The Lambda function calculates a hallucination score. (5) If the score is lower than a custom threshold, an SNS notification is sent to a customer service queue for human intervention. (6) If the score is higher than the threshold, the system returns the knowledge base response to the user.
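
The six steps above reduce to a generate, score, and route decision, sketched below. All names (`Outcome`, `handle_query`, and the three callables) are hypothetical. The text does not say what happens when the score equals the threshold or what the user sees after escalation; this sketch assumes an equal score passes and that no answer is returned on escalation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    answer: Optional[str]  # None when the response was escalated (assumption)
    score: float
    escalated: bool


def handle_query(
    question: str,
    generate_answer: Callable[[str], str],
    score_fn: Callable[[str, str], float],
    escalate: Callable[[str, str, float], None],
    threshold: float = 0.9,
) -> Outcome:
    """Route a knowledge base response based on its hallucination score."""
    # Steps 1-2: retrieve chunks and generate the knowledge base response.
    answer = generate_answer(question)
    # Steps 3-4: pass query and response to the scorer.
    score = score_fn(question, answer)
    # Step 5: below the threshold, hand off to a human reviewer.
    if score < threshold:
        escalate(question, answer, score)
        return Outcome(answer=None, score=score, escalated=True)
    # Step 6: otherwise return the response to the user.
    return Outcome(answer=answer, score=score, escalated=False)
```

With a score of 0.4 and the threshold of 0.9 from the example scenario, `handle_query` takes the escalation branch.
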
claim: Amazon Bedrock Agents allows organizations to implement scalable, customizable hallucination detection that adjusts to specific needs without requiring a complete restructuring of the existing workflow.
procedure: Amazon Bedrock Agents orchestrates multistep tasks by using the reasoning capabilities of large language models (LLMs) to break down user-requested tasks into steps, create an orchestration plan, and execute that plan by invoking company APIs or accessing knowledge bases via Retrieval-Augmented Generation (RAG).
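
From the caller's side, the orchestration described above happens behind a single request. The sketch below uses the boto3 `bedrock-agent-runtime` client's `invoke_agent` call, which returns the reply as an event stream under `completion`. The helper names `collect_completion` and `ask_agent` are hypothetical, and the agent and alias IDs must come from your own deployment.

```python
import uuid
from typing import Any, Dict, Iterable, Optional


def collect_completion(events: Iterable[Dict[str, Any]]) -> str:
    """Join the byte chunks of an agent event stream into one string."""
    parts = []
    for event in events:
        chunk = event.get("chunk")
        if chunk and "bytes" in chunk:
            parts.append(chunk["bytes"])
    # Decode once at the end so a multibyte character split across
    # two chunks is still decoded correctly.
    return b"".join(parts).decode("utf-8")


def ask_agent(
    agent_id: str,
    agent_alias_id: str,
    text: str,
    client: Optional[Any] = None,
    session_id: Optional[str] = None,
) -> str:
    """Send one user request to an agent and return its full text reply."""
    if client is None:
        # Imported lazily so the helper can be exercised without AWS access.
        import boto3

        client = boto3.client("bedrock-agent-runtime")
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id or str(uuid.uuid4()),
        inputText=text,
    )
    return collect_completion(response["completion"])
```

Reusing the same `session_id` across calls keeps them in one conversation; generating a fresh one per call, as the default does here, treats each request independently.
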
claim: Amazon Bedrock Agents enables dynamic workflow orchestration, addressing the limitations of static workflows where updating hallucination detection logic requires modifying the entire workflow.