Groundlens¶
Geometric LLM hallucination detection. No second LLM. Deterministic. Auditable.¶
Groundlens is a Python library that detects hallucinations in LLM outputs using embedding geometry rather than a second language model. It provides two complementary scoring methods:
- SGI (Semantic Grounding Index) --- measures whether a response engaged with provided source context, using distance ratios in embedding space.
- DGI (Directional Grounding Index) --- evaluates response grounding without any context, using directional statistics on displacement vectors.
Both methods are deterministic, sub-second, and produce auditable numeric scores --- no black-box LLM-as-judge.
Key Features¶
| Feature | Description |
|---|---|
| Two scoring modes | SGI (with context) and DGI (context-free) cover all verification scenarios |
| Deterministic | Same inputs always produce the same score --- no sampling variance |
| Sub-second | Sentence-transformer inference, not LLM generation |
| Domain calibration | Generic AUROC ~0.8; domain-specific calibration reaches 0.90--0.99 |
| EU AI Act ready | No opaque second LLM --- fully auditable decision pipeline |
| Provider wrappers | OpenAI, Anthropic, Google Gemini with automatic scoring |
| Framework integrations | LangChain, CrewAI, Semantic Kernel, AutoGen |
Quick Install¶
For provider-specific extras:
pip install "groundlens[openai]" # OpenAI provider
pip install "groundlens[langchain]" # LangChain integration
pip install "groundlens[all]" # Everything
3-Line Example¶
from groundlens import evaluate
score = evaluate(
question="What is the capital of France?",
response="The capital of France is Paris.",
context="France is in Western Europe. Its capital is Paris.",
)
print(score.flagged) # False
print(score.method) # 'sgi'
print(score.value) # 1.23 (example)
Setup in 30 Seconds¶
- Embed the question, response, and (optionally) context into \(\mathbb{R}^n\) using a sentence transformer.
- Compute a geometric score:
- With context: SGI = ratio of distances --- is the response closer to context or to the question?
- Without context: DGI = cosine similarity of the question-to-response displacement against a calibrated reference direction.
- Flag responses that fall below empirically-derived thresholds.
No token generation. No prompt engineering. No stochastic sampling. Pure geometry.
Research Papers¶
| Paper | Index | arXiv |
|---|---|---|
| Semantic Grounding Index for LLM Hallucination Detection | SGI | 2512.13771 |
| A Geometric Taxonomy of Hallucinations in LLMs | DGI | 2602.13224 |
| Rotational Dynamics of Factual Constraint Processing | Confabulation benchmark | 2603.13259 |
Author¶
Javier Marin --- javier@groundlens.dev and javier@jmarin.info
Groundlens is part of the CERT Framework for verification triage --- helping teams prioritize which LLM outputs need human review.