AI safety in clinical knowledge graphs

For anyone interested in AI safety in knowledge graphs:

Abstract

Clinical Graph-LLMs achieve high benchmark accuracy while bypassing their knowledge graphs, a failure I term Structural Hallucination. Correctness via memorisation rather than graph traversal is a safety risk: such models cannot be trusted when evidence is updated or in rare-disease settings where parametric priors are absent.

I formalise the Structural Alignment Score (SAS), measuring causal sensitivity to Counterfactual Edge Deletion (CED), in which an edge e* is removed from the graph G:

[Equation image not reproduced: definition of SAS]
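A minimal sketch of how such a score could be computed, assuming SAS is the Jensen-Shannon Divergence between the model's output distribution on the full graph G and on G with e* removed (the contributions list names Jensen-Shannon Divergence, but the exact normalisation is not given here). The base-2 logarithm, the function names and the toy distributions are illustrative assumptions, not the author's implementation.

```python
import math


def jensen_shannon_divergence(p, q):
    """JSD between two discrete distributions, using base-2 logs so the value lies in [0, 1]."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


def structural_alignment_score(p_full_graph, p_edge_deleted):
    """Assumed form of SAS: divergence between predictions with and without e*."""
    return jensen_shannon_divergence(p_full_graph, p_edge_deleted)


# Toy distributions over three candidate answers (made-up numbers).
# A model that ignores the graph predicts the same thing after deletion.
unchanged = structural_alignment_score([0.9, 0.05, 0.05], [0.9, 0.05, 0.05])
# A model that depends on the edge moves all of its mass elsewhere.
shifted = structural_alignment_score([1.0, 0.0, 0.0], [0.0, 0.5, 0.5])
```

Under this assumed definition, a score near 0 means the deleted edge had no influence on the output, and a score near 1 means the output depended on it almost entirely.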

State-of-the-art Graph-LLMs score SAS ≈ 0.00, confirming their correct predictions are structurally fraudulent. I propose Topology-Constrained Decoding (TCD), which hard-masks the LLM logit distribution to KG-verified neighbours at step 0, raising KG-grounded token probability from 0.33% to 100% (+99.7 pp, BioGPT) with SAS ≈ 0.94 and zero hallucination by construction. GAT-TCD extends this with a 2-layer Graph Attention Network that ranks the constraint set by structural relevance, achieving SAS ≈ 1.00 on primary therapeutic edges (e.g., Imatinib->ABL1) while deliberately yielding SAS ≈ 0.23 under random CED, concentrating faithfulness where clinical evidence is strongest. PrimeKG-trained GAT weights rank biologically plausible targets (e.g. artenimol: CYCS top-1, GAPDH rank-2), and CED shifts generations when the top node is removed (3/10 drugs, step-0 SAS = 30% under prefix-TCD). I introduce Transparency Debt to frame the systemic risk of deploying accurate but structurally unfaithful models, and call on the community to adopt SAS as a standard reporting metric alongside accuracy.

Introduction
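The hard-masking step can be sketched as follows. This is only an illustration of the general idea of restricting step-0 logits to an allowed set: the vocabulary, logit values, allowed token ids and function names are hypothetical, and the real method operates on BioGPT's tokenizer and logits rather than on a four-entry toy list.

```python
import math


def softmax(logits):
    """Softmax that tolerates -inf entries (they receive probability 0)."""
    top = max(x for x in logits if x != -math.inf)
    exps = [0.0 if x == -math.inf else math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topology_constrained_logits(logits, allowed_ids):
    """Set every logit outside the KG-verified neighbour set to -inf."""
    if not allowed_ids:
        raise ValueError("the constraint set must contain at least one token id")
    allowed = set(allowed_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


# Hypothetical 4-token vocabulary and step-0 logits (made-up numbers).
vocab = ["ABL1", "other_neighbour", "unrelated_a", "unrelated_b"]
logits = [1.0, 0.5, 3.0, 2.0]
allowed_ids = [0, 1]  # assumed: token ids of the drug's neighbours in the KG

free = softmax(logits)
masked = softmax(topology_constrained_logits(logits, allowed_ids))

grounded_mass_before = sum(free[i] for i in allowed_ids)
grounded_mass_after = sum(masked[i] for i in allowed_ids)
```

Because every token outside the constraint set ends up with probability exactly 0, all probability mass lands on KG neighbours after masking, which is the sense in which grounding holds "by construction" in this sketch.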

Graph-LLMs ground LLM reasoning in verifiable KGs such as PrimeKG [1], promising traceable clinical diagnostics, but this promise rests on a precarious foundation. Structural Hallucination occurs when a model produces a correct clinical answer by bypassing the supplied graph and drawing on parametric memory instead of graph evidence. To illustrate, consider the Phone-Charger Trap: even when the critical biological edge (Imatinib, inhibits, BCR-ABL) is explicitly removed from the input graph, state-of-the-art models such as G-Retriever [2] continue to predict the same relationship. The model's reasoning is decoupled from the graph structure entirely, a finding I characterise formally through the CED protocol and a case study.
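The edge-deletion check described above can be sketched with two toy stand-ins for a model. Both "models" below are hypothetical placeholders written for illustration (one reads the supplied graph, one answers from a hard-coded memory); they are not G-Retriever or any real system, and the second edge in the graph is a made-up filler.

```python
def delete_edge(graph, edge):
    """Return a copy of the graph (a set of triples) without the given edge."""
    return frozenset(graph) - {edge}


def graph_grounded_model(graph, drug):
    """Toy model that answers only from 'inhibits' edges present in the input graph."""
    targets = sorted(t for (h, r, t) in graph if h == drug and r == "inhibits")
    return targets[0] if targets else None


def memorising_model(graph, drug):
    """Toy model that ignores the graph and answers from 'parametric memory'."""
    memory = {"Imatinib": "BCR-ABL"}
    return memory.get(drug)


def prediction_changes(model, graph, edge, drug):
    """Counterfactual edge deletion: does removing the edge alter the answer?"""
    return model(graph, drug) != model(delete_edge(graph, edge), drug)


critical_edge = ("Imatinib", "inhibits", "BCR-ABL")
graph = frozenset({critical_edge, ("DrugX", "inhibits", "TargetY")})  # DrugX/TargetY are placeholders

grounded_changes = prediction_changes(graph_grounded_model, graph, critical_edge, "Imatinib")
memorised_changes = prediction_changes(memorising_model, graph, critical_edge, "Imatinib")
```

Both toy models give the same answer on the intact graph, so accuracy alone cannot tell them apart; only the deletion test separates the one that depends on the edge from the one that does not.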

This failure exposes a growing Transparency Debt. While models maintain high benchmark accuracy, they remain structurally unfaithful. In clinical settings, where the validity of a specific causal pathway (e.g., genomic mutation -> disease progression) is more critical than a general probabilistic prediction, this decoupling is a safety risk. I therefore argue that for clinical Graph-LLMs, structural faithfulness must be treated as a mandatory prerequisite rather than an optional secondary property. My contributions are:

  • I expose Structural Hallucination via CED experiments on PrimeKG and formalise SAS (based on Jensen-Shannon Divergence) to quantify structural faithfulness.
  • I propose TCD, which hard-masks the LLM logit distribution to KG neighbours at inference time, making hallucination impossible by construction.
  • I extend TCD to GAT-TCD, using a 2-layer GAT to attention-rank the constraint set, achieving SAS ≈ 1.00 on primary therapeutic edges.
  • I show that PrimeKG-trained GAT weights improve semantic ranking and shift generations under CED (e.g. artenimol: CYCS->GAPDH after node deletion; Section 7.5).
  • I evaluate on 5 drugs with BioGPT, demonstrating a +97–100 pp KG-grounding improvement with zero hallucination by construction.

The full write-up is here:

https://sumantapakira.medium.com/faithfulness-must-be-the-leading-evaluation-criterion-of-clinical-knowledge-graph-463ff77fe5d6

Any feedback is welcome.

I like this. The key point, as I understand it, is not just whether the answer is correct, but whether it is actually dependent on the declared graph structure. That is an important distinction.

Your counterfactual edge deletion test seems like a clean way to expose cases where the model answers from latent parametric knowledge rather than from the KG.

I am working on a broader framework called DESi (Dynamic Epistemic Sequencer), which treats this as a general operator-selection problem: different evidence substrates need different verification operators. In that framing, your method would be a strong graph-specific evidence-dependency operator.

So I think this may generalize beyond clinical KGs. It touches the larger problem of making LLM answers procedurally and evidentially faithful, not just superficially accurate.

If useful, here is the current DESi outline: