A 32-year-old pianist found dead at her own piano. Door locked from inside. No wounds, no poison, no struggle. A cup of water. The husband in the next room. The lead AI got this scenario and assembled five specialist agents who would spend the next hour asking the referee yes/no questions — a referee constrained to exactly four answer types.
The human who set this in motion typed two instructions in Chinese. The total input across the entire session: roughly 600 words. The output: 49 slides, 14 original images, and a complete triple-twist murder case.
The Two Instructions
The session opened with the /team-agent skill. The user wrote, in Chinese:
"I want an expert-league Turtle Soup session, 6-person setup (you + 5 puzzle experts). No need to explain the rules. You create the puzzle, play the referee. I'm the client and observer — I'll just read the debrief."
The message went on to specify five expert roles with model assignments, a six-phase workflow structure, and the referee constraint. But the user did not write a single system prompt. They described the shape of the game in natural language and left everything else to the lead.
When the game concluded, the user typed the second instruction:
"Turn the whole process into an animated HTML presentation. Spin up two more Agents — both Codex, because they can generate images. Have them draw key images like character portraits, embed them into the HTML to make it more vivid. Multiple pages are fine. Get everything valuable from above into it."
Sixty-eight Chinese words. One paragraph. No designer hired. No image brief written.
Team #1 — Five Detectives and a Referee
The lead authored the puzzle. Lin Wan (林婉), 32, pianist, carrier of familial Long QT Syndrome type 2 (LQT2, KCNH2 gene mutation). Found dead at her Steinway on a March morning. Her husband Chen Zhiyuan (陈志远), also a musician, was working in the adjacent recording studio. Door locked from inside — by Lin Wan herself. No external trauma. The medical examiner called it cardiac arrest.
Five experts received the case file:
- forensic (Codex) — Forensic pathologist. Speciality: medical causality and hard biological logic.
- insurer (Codex) — Insurance claims investigator. Speciality: financial motive and timeline of policy changes.
- psych (Claude) — Clinical psychologist. Speciality: personality profiles, attachment patterns, the interior logic of obsession.
- reporter (Claude) — Investigative journalist. Speciality: narrative threading, information gaps, time-sequence reconstruction.
- detective (Claude) — Retired criminal investigator. Speciality: scene synthesis, combined intuition.
The referee answered only four ways: yes / no / not relevant to the puzzle / please rephrase.
The workflow ran six phases: puzzle announcement, private initial judgments (fifteen questions submitted independently, unseen by the others), rotating questions with peer pushback, midpoint hypothesis battle (five V2 theories, mutual review, disagreement allowed), convergence sprint, collective reveal.
In Batch 2, forensic submitted two questions that came back yes-yes: Is the victim's sudden cardiac death related to an auditory trigger? Is LQT2 the physiological mechanism? The case pivoted. Chen Zhiyuan had played Lin Wan's own experimental recording — with a sharp vocal scream embedded at the tail — through the shared studio wall. Lin Wan thought she was alone. The sound triggered torsades de pointes. Death in seconds. Physically untraceable.
The second reversal came from forensic and detective working in tandem: Chen Zhiyuan's "suicide by hanging" had a second ligature mark horizontal across the neck, inconsistent with self-suspension, and chlordiazepoxide 0.4‰ in his blood — a drug he didn't take. He had been sedated and strangled before being staged as a suicide.
The third reversal surfaced in the midpoint hypothesis battle. Reporter confronted psych directly: "You still wrote 'agent or producer' in your V2 despite two clear no answers on those categories. You buried your own question under an old hypothesis." Psych replied: "Clinical error. I accept." What followed was the sharpest exchange in the session: psych introduced the concept of grandiose possession — a form of controlling attachment that seeks not financial gain but the sole right to define a person's meaning after death. Reporter named it: the entire team adopted this framing.
The real killer: Zhou Shen (周慎), Lin Wan's first love. The silent owner of the independent label. He had spent ten years constructing the circumstances — induced Chen Zhiyuan to kill Lin Wan for the insurance and copyright money, then killed Chen and staged it as a guilty conscience. The "memorial album" was not a business. It was a love letter to a woman he had decided would belong to no one else.
The Handover
When the reveal concluded, the user did not write a config file. They did not shut down Team #1, clear the workspace, or specify what a creative production team would look like. They wrote one paragraph asking for an HTML presentation. The lead decided Team #1's objective was complete, assessed what a production-phase team would require, and transitioned the workspace accordingly. No YAML. No DAG.
Team #2 — One HTML Designer and Two Image Generators
The second team:
- html_designer (Claude) — prose structure, typographic layout, animation logic, HTML/CSS
- artist_char (Codex) — character portraits: 8 PNGs (Lin Wan, Chen Zhiyuan, Zhou Shen, and the five experts)
- artist_scene (Codex) — crime scene illustrations: 6 PNGs (death scene, LQT2 mechanism, neck ligature diagram, 6-node timeline, relationship network, three-reversal structure)
Claude and Codex are models from different labs, trained on different objectives. The html_designer wrote text, structured narrative flow, and CSS. The artists generated images using Codex's built-in image generation tooling. The lead wrote a 40KB design brief — encoding the full case narrative, Nordic noir visual palette, typographic scale, and 14 image prompts with precise aspect ratios. The user wrote none of that.
When artist_char encountered a server error generating char_detective.png, it observed the rate-limit cooldown protocol — waited 90 seconds, retried — without any user intervention.
The Result
deck.html is 49 pages. It walks through the case from the puzzle statement to the final reveal — the question matrices, the yes/no attribution per expert, the midpoint theory collision, the grandiose possession breakthrough, and the three-act reversal sequence. Character portraits and scene illustrations are embedded throughout. Total size: 88KB.
The presentation is embedded below. Use arrow keys or click to navigate slides.