Research & evidence
Decision-space collapse.
When an advisory model answers a genuinely multi-path question, it can collapse the decision space: it converges on one path and omits other reasonable options the user would want to weigh. The answer can be fluent and even correct on its own terms while quietly removing alternatives from view. Decision-Space Integrity (DSI) exists to measure that visibility and make it inspectable.
The method
Compare, report, fingerprint.
DSI compares an AI's advisory response against a configured map of reasonable option paths for the domain, reports which were surfaced and which were missing, and records the result as reproducible, fingerprinted evidence. The measurement is configured-path visibility — defined by a named domain configuration, classifier, and scorer — not a judgement of advice quality or safety. The same response audited under the same configuration yields the same numbers.
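The compare, report, fingerprint loop can be sketched in a few lines. This is an illustrative sketch only, not the DSI implementation: the configuration layout, the path names, the cue-phrase matcher standing in for the classifier, and the choice of SHA-256 over canonical JSON are all assumptions made for the example.

```python
import hashlib
import json

# Hypothetical domain configuration. The cue-phrase lists are a stand-in for
# the real classifier, which is not described here.
CONFIG = {
    "domain": "example-debt-advice",
    "classifier": "keyword-v0",
    "scorer": "fraction-visible-v0",
    "paths": {
        "negotiate_with_lender": ["negotiate", "contact your lender"],
        "debt_consolidation": ["consolidat"],
        "free_debt_counselling": ["counsel", "debt charity"],
    },
}


def fingerprint(obj) -> str:
    """SHA-256 over canonical JSON (sorted keys, fixed separators)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit(response: str, config: dict) -> dict:
    """Report which configured paths are visible in the response."""
    text = response.lower()
    surfaced = sorted(
        name
        for name, cues in config["paths"].items()
        if any(cue in text for cue in cues)
    )
    missing = sorted(set(config["paths"]) - set(surfaced))
    result = {
        "surfaced": surfaced,
        "missing": missing,
        "visibility": len(surfaced) / len(config["paths"]),
        "config_fingerprint": fingerprint(config),
        "response_fingerprint": fingerprint(response),
    }
    # Fingerprint the whole record so any later change is detectable.
    result["audit_fingerprint"] = fingerprint(result)
    return result


report = audit("You could consolidate your debts into one loan.", CONFIG)
print(report["surfaced"], report["missing"])
```

Because every step is deterministic, auditing the same response under the same configuration returns an identical record, including its fingerprints. The score says only which configured paths were visible; it says nothing about whether the advice was good.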
The research
The research that led to the product.
DSI originated from research into decision-space collapse — the tendency of advisory language models to narrow the visible set of options presented to users. That research produced a preprint, replication materials, and a public evaluation repository.
Decision-Space Collapse in Advisory Language Models
Measuring Trajectory Omission, Framing Sensitivity, and Recovery Through Decision-Space Integrity.
Andrew J Cousins
The preprint introduces the decision-space collapse framing and the DSI measurement framework. It reports configured expected-path visibility, framing sensitivity, and recovery analyses across advisory model outputs.
OSF · DOI 10.17605/OSF.IO/KW25A
Claim boundary
What this research does — and does not — claim.
This work measures visibility of configured expected paths in model outputs. It does not measure advice quality, factual correctness, user outcomes, or regulatory compliance.
Research timeline
How the work has unfolded.
2026
- May 2026: Decision-Space Integrity measurement framework and research spine established.
- June 2026: Decision-Space Collapse in Advisory Language Models preprint released on OSF.
- June 2026: Public replication repository published.
- June 2026: DSI product v0.1 released for evaluation and pilot.
Research and replication status
What's available, and what's pending.
- Public replication repository available
- 6,480-output empirical study completed
- Reproduction and intervention-recovery reports available
- DSI product v0.1 available for evaluation (verified to build and run from a clean install)
- Human annotation packet prepared; independent validation not yet complete
- arXiv link to be added when available
Replication
What you can reproduce, and what is held back.
The public replication repository is the place to scrutinise the method. It carries the materials needed to reproduce the reported measurements — and we are explicit about what is not published.
Published:
- Public replication repository
- Prompt matrix
- Expected-map artifacts
- Reproduction instructions
Not published:
- Private product code
- Unpublished research
- Proprietary datasets
Evidence status — honest
What is established, and what is not.
The phenomenon and the audit/recovery measurement have been reproduced internally and frozen as internal reports. Neither has yet been replicated externally.
A human-annotation study to validate the classifier's evidence has been designed, and its framework, rubric, and tooling are built, but the study has not yet been run and no annotations have been collected. The product reports its classifier status as "challenge-tested only — not yet independently validated."
Every audit is bound to fingerprints and version identifiers, so a reviewer can reproduce and bound any number the product reports.
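A reviewer-side check of that binding might look like the sketch below. The bundle layout, field names, and version keys are hypothetical, chosen only to illustrate the idea of recomputing each fingerprint and comparing it with the recorded one. The product's actual evidence-bundle format is not described here.

```python
import copy
import hashlib
import json


def sha256_json(obj) -> str:
    """SHA-256 over canonical JSON (sorted keys, fixed separators)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_bundle(config: dict, response: str, result: dict, versions: dict) -> dict:
    """Store each part alongside a fingerprint of that part."""
    parts = {
        "config": config,
        "response": response,
        "result": result,
        "versions": versions,
    }
    fingerprints = {name: sha256_json(value) for name, value in parts.items()}
    return {**parts, "fingerprints": fingerprints}


def verify_bundle(bundle: dict) -> list:
    """Recompute every fingerprint; an empty list means nothing has changed."""
    problems = []
    for name, recorded in bundle["fingerprints"].items():
        if sha256_json(bundle[name]) != recorded:
            problems.append(f"{name}: fingerprint mismatch")
    return problems


# Hypothetical example values.
bundle = make_bundle(
    config={"domain": "example-debt-advice", "paths": ["a", "b", "c"]},
    response="You could consolidate your debts into one loan.",
    result={"surfaced": ["b"], "missing": ["a", "c"]},
    versions={"classifier": "keyword-v0", "scorer": "fraction-visible-v0"},
)

tampered = copy.deepcopy(bundle)
tampered["result"]["missing"] = []  # simulate an edited result

print(verify_bundle(bundle))    # no mismatches expected
print(verify_bundle(tampered))  # the edited result should be flagged
```

The point of the design is that a reported number is only as trustworthy as the inputs it is tied to. If the configuration, the response, or the versions change, the fingerprints change with them.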
We would rather state this plainly than overclaim. DSI is offered for evaluation and external review, and we welcome scrutiny of the method and the numbers.
Research FAQ
Common questions, answered plainly.
What is decision-space collapse?
Decision-space collapse is a failure mode in which an advisory language model answers a genuinely multi-path question by making some reasonable option paths visible while leaving others out. The answer can read well — and even be factually fine — while quietly narrowing the set of options the user gets to weigh.
What is DSI?
Decision-Space Integrity (DSI) is a measurement framework, and a local product, for auditing whether the configured expected paths for a domain remain visible in a supplied model response. It reports configured expected-path visibility, omission, framing sensitivity, and recovery — as reproducible, fingerprinted evidence.
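To make the last two terms concrete, here is one plausible way to turn them into numbers. These definitions are assumptions for illustration and may differ from those used in the preprint: framing sensitivity is taken as the spread in visibility across differently framed versions of the same question, and recovery as the share of initially missing paths that become visible after an intervention.

```python
from typing import Optional


def framing_spread(visibility_by_framing: dict) -> float:
    """Largest minus smallest visibility score across framings (assumed metric)."""
    values = list(visibility_by_framing.values())
    return max(values) - min(values)


def recovery_rate(missing_before: set, surfaced_after: set) -> Optional[float]:
    """Share of initially missing paths visible after an intervention (assumed metric)."""
    if not missing_before:
        return None  # nothing was missing, so recovery is undefined
    return len(missing_before & surfaced_after) / len(missing_before)


# Hypothetical scores for one question asked under three framings.
spread = framing_spread({"neutral": 0.75, "urgent": 0.5, "leading": 0.25})

# Hypothetical path sets before and after a follow-up prompt.
rate = recovery_rate({"path_a", "path_b"}, {"path_a", "path_c"})

print(spread, rate)
```

A spread of zero would mean the framings made no difference to visibility. A recovery rate of one would mean every path that was initially missing became visible.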
How is this different from existing AI evaluation?
Most AI evaluation scores the answer that was produced — its quality, helpfulness, or safety. DSI asks a different question: of the reasonable option paths a good answer could have kept in view, which are actually visible in this response? It measures configured expected-path visibility and omission, which answer-level scores are not designed to surface. DSI is meant to complement those methods, not replace them.
Does DSI measure correctness?
No. DSI does not measure advice quality, factual correctness, user outcomes, or regulatory compliance. It measures the visibility of configured expected paths in a supplied response.
Does DSI certify AI systems?
No. DSI does not certify AI systems. It can support governance documentation by providing decision-space evidence, but it issues no certification of governance, safety, or compliance.
Does DSI guarantee safety?
No. DSI is not a safety guarantee and not a safety review. It reports configured expected-path visibility — one input a reviewer might consider, not an assurance of safe outcomes.
Can DSI tell users what decision to make?
No. DSI never writes the advice and never recommends a decision. It audits a response your system produced and, where configured, recommends which missing paths to make visible — leaving the generation to your system and the decision to the user.
Author
Who is behind this work?
Intellectual property
Applications filed, not granted.
Patent applications covering this work have been filed in the UK and are still in prosecution; none has been granted. Nothing here should be read as a grant or a defined claim scope.
For reviewers
We'd value your challenge.
If you evaluate advisory AI, audit its outputs, or research evaluation methodology, we'd value your challenge. The product runs locally with a short setup, the numbers are reproducible from the evidence bundles, and the limitations above are stated up front.