EU AI Act Compliance¶
The EU AI Act (Regulation (EU) 2024/1689) imposes obligations on providers and deployers of AI systems, with the strictest requirements applying to systems classified as high-risk. groundlens is designed to help organizations meet several of these requirements through its deterministic, auditable architecture.
Why groundlens Helps with Compliance¶
The EU AI Act requires that high-risk AI systems be:
- Transparent: Users must be able to understand how the system makes decisions.
- Auditable: The decision-making process must be reproducible and inspectable.
- Monitored: Ongoing quality assurance must be in place.
- Documented: Technical documentation must describe the system's capabilities and limitations.
groundlens supports all four requirements by design.
No Black-Box Second LLM¶
The most common alternative to groundlens --- "LLM-as-judge" --- uses a second LLM to evaluate the first. This creates serious compliance problems:
| Issue | LLM-as-Judge | groundlens |
|---|---|---|
| Determinism | Non-deterministic (sampling) | Deterministic (same inputs = same score) |
| Auditability | Opaque (why did the judge say "correct"?) | Transparent (distance ratio or cosine similarity) |
| Reproducibility | Varies across runs, model versions | Exact reproduction given same model and inputs |
| Cost | Requires LLM inference per evaluation | Sentence-transformer inference only |
| Circular risk | The judge LLM can itself hallucinate | No generative model in the evaluation loop |
Key compliance advantage
groundlens removes the generative model from the evaluation loop entirely. The score is computed via deterministic mathematical operations on embeddings --- no sampling, no temperature, no prompt sensitivity.
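This determinism can be checked directly. A minimal sketch, assuming the `evaluate` API shown throughout this page (the question/response/context strings are purely illustrative):

```python
from groundlens import evaluate

inputs = dict(
    question="What is the refund window?",
    response="Refunds are accepted within 30 days of purchase.",
    context="Our policy allows refunds within 30 days of purchase.",
)

# No sampling, temperature, or prompt anywhere in the loop:
# two evaluations of identical inputs must agree exactly.
first = evaluate(**inputs)
second = evaluate(**inputs)
assert first.value == second.value
assert first.flagged == second.flagged
```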
Article 9: Risk Management¶
The EU AI Act requires a risk management system that identifies and mitigates risks throughout the AI system's lifecycle.
How groundlens helps: Deploy groundlens as a continuous monitoring layer that flags high-risk outputs for human review. The flagging rate provides a quantitative risk metric that can be tracked over time.
```python
# Example: risk monitoring pipeline
from groundlens import evaluate

def risk_monitor(question, response, context=None):
    score = evaluate(question=question, response=response, context=context)
    return {
        "risk_level": "high" if score.flagged else "low",
        "score": score.value,
        "method": score.method,
        "explanation": score.explanation,
        "deterministic": True,
        "reproducible": True,
    }
```
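The per-output results from `risk_monitor` can then be aggregated into the flagging-rate metric mentioned above. A minimal sketch using the function defined just above (the batch contents are hypothetical):

```python
# Hypothetical batch of (question, response, context) triples drawn
# from production traffic during one monitoring window.
batch = [
    ("What is the warranty period?", "Two years from purchase.",
     "All products carry a two-year warranty."),
    ("When was the company founded?", "It was founded in 1987.",
     "The company was founded in 1998."),
]

results = [risk_monitor(q, r, context=c) for q, r, c in batch]
rate = sum(1 for r in results if r["risk_level"] == "high") / len(results)
print(f"High-risk rate this window: {rate:.1%}")  # track this over time
```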
Article 13: Transparency¶
High-risk AI systems must provide "sufficient transparency to enable deployers to interpret the system's output."
How groundlens helps: Every groundlens score comes with:
- A numeric value with clear geometric meaning
- A human-readable explanation
- The method used (SGI or DGI)
- Intermediate values (distances, normalized scores) for full traceability
```python
from groundlens import evaluate

score = evaluate(question="...", response="...", context="...")

# Full transparency chain
print(f"Method: {score.method}")
print(f"Raw score: {score.value}")
print(f"Normalized: {score.normalized}")
print(f"Flagged: {score.flagged}")
print(f"Explanation: {score.explanation}")

# For SGI, additional detail
if score.method == "sgi":
    print(f"Distance to question: {score.detail.q_dist}")
    print(f"Distance to context: {score.detail.ctx_dist}")
```
Article 14: Human Oversight¶
The Act requires that high-risk AI systems include measures for effective human oversight.
How groundlens helps: groundlens is explicitly designed as a triage tool --- it identifies which outputs need human review, not which outputs are "correct." This keeps humans in the loop while reducing the volume they need to review.
| Without groundlens | With groundlens |
|---|---|
| Review 100% of outputs | Review only the flagged subset (~20% of outputs) |
| Random or no prioritization | Prioritized by geometric risk score |
| No quantitative risk signal | Numeric score for risk ranking |
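As a sketch of this triage workflow, flagged outputs can be ranked so reviewers see the riskiest items first. Note that `review_queue` is a hypothetical helper, and the sort direction is an assumption; adjust it to match your configured scoring method:

```python
from groundlens import evaluate

def review_queue(items):
    """Score a batch and return only the flagged items for human review.

    `items` is assumed to be an iterable of dicts with 'question',
    'response', and 'context' keys.
    """
    queue = []
    for item in items:
        score = evaluate(question=item["question"],
                         response=item["response"],
                         context=item["context"])
        if score.flagged:
            queue.append({**item, "score": score.value,
                          "explanation": score.explanation})
    # Assumption: lower scores indicate weaker grounding, so they are
    # reviewed first; flip the sort if your method ranks the other way.
    return sorted(queue, key=lambda r: r["score"])
```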
Article 17: Quality Management¶
Organizations deploying high-risk AI must maintain a quality management system.
How groundlens helps: Use batch evaluation in CI/CD pipelines to gate deployments:
```bash
# In your CI pipeline
groundlens evaluate test_outputs.csv --output scored.csv

# Fail the deployment if the flagged rate exceeds the threshold
python -c "
import csv

with open('scored.csv') as f:
    rows = list(csv.DictReader(f))

flagged = sum(1 for r in rows if r['groundlens_flagged'] == 'True')
rate = flagged / len(rows)
print(f'Flagged rate: {rate:.1%}')
if rate > 0.10:
    print('FAIL: Flagged rate exceeds 10% threshold')
    raise SystemExit(1)
"
```
Audit Trail¶
For regulatory audits, log every groundlens evaluation:
```python
import datetime
import json

from groundlens import evaluate

def auditable_evaluate(question, response, context=None, **kwargs):
    score = evaluate(question=question, response=response, context=context, **kwargs)
    audit_record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "inputs": {
            "question": question,
            "response": response,
            "context": context,
        },
        "outputs": {
            "method": score.method,
            "value": score.value,
            "normalized": score.normalized,
            "flagged": score.flagged,
            "explanation": score.explanation,
        },
        "config": {
            "model": kwargs.get("model", "all-MiniLM-L6-v2"),
            "reference_csv": kwargs.get("reference_csv"),
        },
    }
    # Write to the append-only audit log
    with open("groundlens_audit.jsonl", "a") as f:
        f.write(json.dumps(audit_record) + "\n")
    return score
```
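Because scoring is deterministic, an auditor can replay any logged record and confirm the stored outputs. A minimal sketch assuming the log format above, that `evaluate` accepts `model` as in the earlier example, and ignoring `reference_csv` for brevity:

```python
import json

from groundlens import evaluate

def replay_audit_record(line):
    """Re-run one logged evaluation and verify the stored outputs."""
    record = json.loads(line)
    score = evaluate(**record["inputs"], model=record["config"]["model"])
    # Same model + same inputs = same score, so any mismatch indicates
    # a corrupted or tampered audit record.
    assert score.value == record["outputs"]["value"]
    assert score.flagged == record["outputs"]["flagged"]

with open("groundlens_audit.jsonl") as f:
    for line in f:
        replay_audit_record(line)
```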
Known Limitations for Compliance¶
Be transparent about what groundlens does not guarantee:
- Not a factual truth detector: groundlens measures geometric grounding, not factual accuracy. It cannot determine if "Paris is the capital of France" is true.
- Confabulation boundary: Deliberately crafted false statements that mimic grounded patterns are not detectable (see Confabulation Boundary).
- Threshold sensitivity: The flagging thresholds are empirically derived and may need tuning for specific use cases.
Documentation requirement
When documenting groundlens for regulatory purposes, include these limitations explicitly. The EU AI Act values honest documentation of capabilities and limitations over claims of perfection.