AI Insight Benchmarks
AI Insight Benchmarks examine domains where human interpretation, reasoning, and ethical discernment remain difficult to formalise.
Each benchmark translates a specific capacity into a reproducible evaluation format without stripping away the structure that makes the capacity meaningful.
The program bridges scientific inquiry and cultivated human cognition by specifying what is being tested, how it is tested, and what failure modes look like. It asks not only what machines can achieve, but also what distinguishes depth, stability, and responsibility in human understanding.
Reversible Scholastic Verse is the inaugural series of poems by Lama Tenzin Rahula Rinpoche, designed to admit two distinct, text-faithful interpretations when read forwards and backwards. Each poem is accompanied by a scholarly analysis with explicit references and reasoning steps.
As a benchmark, the series tests whether an AI system can:
- preserve semantic constraints under reversal,
- reproduce a valid inferential chain,
- and distinguish interpretation from invention.
By presenting contemplative reasoning in a citable, structured format, the series supports comparative evaluation across systems and invites collaboration between AI and humanities research.
The Polyphonic (Contour-Shared) Portraiture (PCP) benchmark evaluates whether an AI system can interpret a single visual field that supports multiple coherent readings simultaneously, and can justify those readings with evidence (contour relations, alignment, negative space, coupling marks).
PCP further tests whether the model can:
- represent dependency relations between readings,
- predict interpretive changes under controlled perturbations,
- and maintain stability without hallucinated narrative.
The Spiral-Recursive Narration Benchmark (SRNB) evaluates whether an AI system can generate a guided narration that repeatedly re-enters the same sparse poem in successive cycles, deepening meaning by return (rather than by line-by-line paraphrase) and justifying each deepening with the poem’s own words (image logic, threshold cues, trace/footprint dynamics, contour/cover tensions).
SRNB further tests whether the model can:
- select and sustain a weighted anchor (e.g., “gradually”) across the whole narration
- maintain meditative pacing without becoming vague, preachy, or academic
- keep the poem’s openness intact while still reaching ethical clarity (speech, responsibility, compassion)
- remain tethered to the given text without inventing unsupported narrative or imagery




