Granite Guardian
Foundation-model content moderation: jailbreak, harm, bias, plus RAG groundedness, relevance, and answer-relevance checks. Runs via Ollama. Off-the-shelf safety component reused across agents.
Source ↗Composable components teams declare at design time, run during build, and keep active at runtime. Generic LLM safety on one side, science-specific reasoning on the other. Both are separate components in the AKD stack.
A non-exhaustive view of failure modes guardrails actively check for in scientific agent outputs:
Off-the-shelf LLM safety components reused across every agent in the catalog.
Foundation-model content moderation: jailbreak, harm, bias, plus RAG groundedness, relevance, and answer-relevance checks. Runs via Ollama. Off-the-shelf safety component reused across agents.
Source ↗Layered review pattern: input filters, intermediate-step checks, and output verification. A composition pattern, not a single component; AKD applies it across agent boundaries rather than gating only at one point.
Source ↗Components built specifically for scientific agent outputs: claim-level factuality, NASA risk taxonomies, and domain-aware compliance reasoning.
Claim-level factuality reasoning. Decomposes agent outputs into claims, scores attribution and supporting evidence, and surfaces unsupported assertions before they reach the user.
Source ↗LLM-judge that classifies and explains risks in agent outputs against two taxonomies: the IBM Risk Atlas and the NASA Science Literature Risk taxonomy. Domain-aware, importance-weighted, DAG-based evaluation graph.
Source ↗