Skip to content

SGI: Semantic Grounding Index

The Semantic Grounding Index (SGI) measures whether an LLM response engaged with provided source context or stayed semantically anchored to the question. It is the primary scoring method for RAG verification --- any scenario where you have a question, retrieved context, and a generated response.

Paper: Marin (2025). Semantic Grounding Index for LLM Hallucination Detection. arXiv:2512.13771.

Formula

\[ \text{SGI} = \frac{d(\phi(r), \phi(q))}{d(\phi(r), \phi(\text{ctx}))} \]

where:

  • \(\phi(\cdot)\) is the sentence embedding function (default: all-MiniLM-L6-v2)
  • \(d(\cdot, \cdot)\) is Euclidean distance in \(\mathbb{R}^n\)
  • \(r\) is the LLM response
  • \(q\) is the input question
  • \(\text{ctx}\) is the source context

Geometric Interpretation

SGI is a relative proximity measure. It compares how far the response embedding is from the question versus how far it is from the context:

  • SGI > 1: The response is closer to the context than to the question. This suggests the LLM engaged with the source material and incorporated its content.
  • SGI = 1: The response is equidistant from both. Ambiguous.
  • SGI < 1: The response is closer to the question than to the context. This suggests the LLM may have generated an answer from its parametric memory rather than the provided context.

Geometrically, the SGI = 1 boundary is the perpendicular bisector hyperplane between the question and context embeddings. Responses on the context side of this hyperplane score above 1; responses on the question side score below 1.

Threshold Zones

Zone SGI Range Interpretation Action
Strong pass SGI >= 1.20 Response strongly engaged with context Accept
Partial 0.95 <= SGI < 1.20 Some context influence detected Review recommended
Flagged SGI < 0.95 Weak context engagement Human review required

Normalization

The raw SGI score is normalized to [0, 1] using a tanh mapping:

\[ \text{SGI}_{\text{norm}} = \tanh\bigl(\max(0,\; \text{SGI} - 0.3)\bigr) \]

Reference points:

Raw SGI Normalized
0.30 0.000
0.95 0.457
1.20 0.604
2.00 0.885

When to Use SGI

SGI is the right choice when you have all three inputs:

  • A question or prompt
  • Retrieved context, source documents, or reference text
  • The LLM's response

Common scenarios:

  • RAG pipelines: Verify the LLM used the retrieved chunks
  • Document Q&A: Confirm answers cite the source material
  • Summarization: Check the summary reflects the input document
  • Grounded generation: Any task where you provide context and expect the model to use it

Limitations

SGI measures engagement, not correctness

SGI detects whether the response is semantically similar to the context. A response that paraphrases the context incorrectly might still score well on SGI if it uses similar vocabulary and concepts. SGI catches the case where the LLM ignores the context entirely --- it does not verify that the response accurately represents the context.

Context quality matters

If the retrieved context is irrelevant to the question, a high SGI score means the response is close to irrelevant material. SGI assumes the context is appropriate --- retrieval quality is a separate concern.

Short text sensitivity

Very short texts (1--3 words) produce embeddings with less discriminative power. SGI works best with texts of at least one full sentence.

API Reference

from groundlens import compute_sgi, SGI

# Function API
result = compute_sgi(
    question="What is the capital of France?",
    context="France is in Western Europe. Its capital is Paris.",
    response="The capital of France is Paris.",
    model="all-MiniLM-L6-v2",  # optional
)

# Class API (reusable)
sgi = SGI(model="all-MiniLM-L6-v2")
result = sgi.score(
    question="What is X?",
    context="X is Y.",
    response="X is Y.",
)

The SGIResult contains:

Field Type Description
value float Raw SGI score
normalized float Score in [0, 1]
flagged bool True if below review threshold
q_dist float Euclidean distance to question embedding
ctx_dist float Euclidean distance to context embedding
method str Always "sgi"
explanation str Human-readable interpretation