AWS brings anti-hallucination Guardrails to Bedrock, promises 'grounding' in reality
Cloud giant wants to make sure model responses are useful and founded on a firm foundation of enterprise data, rather than wild flights of fancy
AWS has debuted anti-hallucination technology for its Guardrails toolkit and extended the AI safety package to include apps and models outside its own Bedrock environment.
The cloud giant unwrapped the expansion of Guardrails for Amazon Bedrock at its AWS Summit in New York.
The expanded toolkit includes contextual grounding checks, which the firm described in a blog post as “a new policy type to detect hallucinations”.
It describes the anti-hallucination measures as “a new and fifth safeguard that enables hallucination detection in model responses that are not grounded in enterprise data or are irrelevant to the users’ query”, which can be used to “improve response quality in use cases such as RAG, summarization, or information extraction”.
It relies on a combination of “grounding” and “relevance” parameters. Grounding involves setting a confidence score that information is factually correct “based on the information provided in the reference source and does not contain new information beyond the reference source”. Relevance represents the “minimum confidence score for a model response to be relevant to the user’s query.”
In both cases, responses with a score below the defined threshold will be blocked. So, customers will need to gauge the quality of their own source enterprise data underlying the models, and the “accuracy tolerance” of their specific use case.
“For example, a customer-facing application in the finance domain may need a high threshold due to lower tolerance for inaccurate content,” AWS wrote.
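For those wiring this up, the sketch below shows roughly what setting those two thresholds looks like through boto3, based on the filter shape AWS describes in its announcement. The guardrail name, blocked-message strings, and threshold values here are purely illustrative, so check the Bedrock documentation for the exact parameters before leaning on it.

```python
import boto3

# Create a guardrail with the new contextual grounding checks.
# Thresholds are illustrative; tune them to your own "accuracy tolerance".
bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_guardrail(
    name="rag-grounding-guardrail",
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't answer that from the reference material.",
    contextualGroundingPolicyConfig={
        "filtersConfig": [
            # Block responses whose grounding confidence (faithfulness to the
            # reference source) falls below the threshold.
            {"type": "GROUNDING", "threshold": 0.8},
            # Block responses scoring below the threshold for relevance
            # to the user's query.
            {"type": "RELEVANCE", "threshold": 0.5},
        ]
    },
)
print(response["guardrailId"], response["version"])
```

A finance chatbot might push both thresholds up towards the maximum; a brainstorming assistant could afford to set them lower.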
Can AWS beat hallucinations?
So, this is in no way a silver bullet. But that’s the thing about AI. It’s stats, not magic.
It can help enterprises prevent their own systems from serving up flights of fancy, whether internally or externally. But it won’t prevent employees, or even worse, leaders, from acting on hallucinations – or, indeed, good old misinformation – served up by systems beyond their control.
A May study by Stanford researchers of two bespoke AI legal services found that both still served up hallucinations. One service “hallucinated more than 34% of the time.”
Which is why the Turing Institute warned back in April that it was critical for national security decision makers, including politicians, to be fully aware of the limits of AI.
AWS also said that Guardrails now extends beyond the confines of Bedrock, with the release of an ApplyGuardrail API, which allows the evaluation of user inputs and model responses from self-managed or third-party foundation models.
It said this includes other AWS services such as SageMaker or EC2, but also on-prem deployments, and even other, unnamed third-party FMs.
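As a rough sketch of how that looks in practice, the snippet below calls the ApplyGuardrail API from boto3 to check a response generated outside Bedrock against an existing guardrail. The guardrail identifier, version, and text content are placeholders, and the content qualifiers reflect our reading of AWS's announcement rather than a definitive recipe.

```python
import boto3

# Evaluate a model response produced elsewhere (SageMaker, EC2, on-prem, or a
# third-party FM) against an existing guardrail via the ApplyGuardrail API.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

result = runtime.apply_guardrail(
    guardrailIdentifier="rag-grounding-guardrail-id",  # placeholder ID
    guardrailVersion="1",
    source="OUTPUT",  # checking a model response; use "INPUT" for user prompts
    content=[
        # Reference material the response should be grounded in.
        {"text": {"text": "The reference document text...", "qualifiers": ["grounding_source"]}},
        # The user's original question, used for the relevance check.
        {"text": {"text": "What is our refund policy?", "qualifiers": ["query"]}},
        # The model output to be assessed.
        {"text": {"text": "The model's answer to be checked...", "qualifiers": ["guard_content"]}},
    ],
)

if result["action"] == "GUARDRAIL_INTERVENED":
    # The guardrail blocked or masked the content; serve its safe output instead.
    print(result["outputs"])
```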
The company claims that Guardrails can block “as much as 85%” more harmful content than FMs’ native protections alone. That covers undesirable content, as well as prompt attacks and sensitive information.