# AI Interactive Fiction Specification

This is the single architecture and behavior specification for the project. Usage and changelog live in `README.md`; actionable work items live in `TODO.md`; authoring conventions live in `MARKUP_GUIDELINES.md`.

## Product Goal

AI Interactive Fiction is a shared book-style web client plus interchangeable game engine servers. The client renders interactive fiction as animated, carefully typeset illustrated prose with optional speech, music, sound effects, images, choices, and command input. Game engines own game state and emit a shared structured protocol.

The production client must tolerate speech being unavailable. The safe TTS provider default is `none`; a game or player preference may select another provider.

## Repository Layout

- `public/`: shared browser UI, assets, fonts, client modules, third-party browser libraries.
- `src/`: TypeScript servers, shared protocol types, engine implementations, YAML world model, CLI support.
- `config/engines/`: per-engine configuration files.
- `data/ink-src/`: Ink source files.
- `data/ink/`: compiled Ink JSON output.
- `data/worlds/`: YAML world files.
- `data/z-code/`: Z-machine story files such as `zork1.bin`.
- `data/zcode-prompts/`: prompt templates used by the current LLM-mediated Z-code narrator.
- `scripts/`: project utility scripts. Currently used: `check-node-version.js` and `run-engine.js`.
- `templates/`: not present in the current repository and not used.

## Text Encoding

Ink source files and game UI localization files must be saved as UTF-8 and must contain the real written characters. German text uses full umlauts and special characters directly, for example `ä`, `ö`, `ü`, `Ä`, `Ö`, `Ü`, `ß`, and German quotation marks `„…“`. Do not transliterate German into `ae`, `oe`, `ue`, or `ss` as an encoding workaround.

## Ink Authoring State

Use Ink's built-in visit state for simple facts such as "this knot has been shown". Do not create parallel boolean flags for knot visits.

Use a separate `LIST` with the `state_*` helpers whenever a tracker expresses a linear process, even if it has only two states such as "begun" and "completed". A later state in such a list is a high-watermark and implies the earlier states. Prefer several small parallel progress lists over one overpacked encounter state when that is cleaner for authoring, knowledge modelling, or NPC reasoning. This matches the Inkle-style knowledge-base pattern: independent lines of knowledge and progress advance separately, then content queries the combination.

Use `state_reach(first_state)` to begin a progress chain. Use `state_reach_if_started(later_state)` when a normal action can advance or complete a chain only if that chain is already active. This prevents generic actions such as washing hands or inspecting an object from retroactively starting a task they merely could have fulfilled.
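
The high-watermark rule can be modelled outside Ink to make the intended semantics concrete. The TypeScript sketch below is an illustration only: the real helpers are Ink functions, and the `ProgressChain` class, its method names, and the example states are assumptions made for this sketch.

```ts
// Illustrative model of a linear progress chain with high-watermark semantics.
// Hypothetical: the real `state_*` helpers are Ink functions operating on a LIST.
class ProgressChain {
  // Index of the furthest state reached so far; -1 means "not started".
  private watermark = -1;

  constructor(private readonly states: readonly string[]) {}

  private indexOf(state: string): number {
    const index = this.states.indexOf(state);
    if (index < 0) throw new Error(`Unknown state: ${state}`);
    return index;
  }

  // Mirrors the intent of state_reach: raise the watermark, never lower it.
  reach(state: string): void {
    this.watermark = Math.max(this.watermark, this.indexOf(state));
  }

  // Mirrors the intent of state_reach_if_started: advance only an active chain.
  reachIfStarted(state: string): boolean {
    if (this.watermark < 0) return false;
    this.reach(state);
    return true;
  }

  // A later state implies every earlier state.
  reached(state: string): boolean {
    return this.watermark >= this.indexOf(state);
  }
}

// Hypothetical two-state task: washing hands only counts once the task was given.
const handWashing = new ProgressChain(["begun", "completed"]);
const advancedTooEarly = handWashing.reachIfStarted("completed"); // false: chain not started
handWashing.reach("begun");
const advancedLater = handWashing.reachIfStarted("completed"); // true: chain was active
```

Storing a single watermark instead of one flag per state is what makes "a later state implies the earlier states" hold without extra bookkeeping.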

Use `mark`, `has`, and `lacks` only for a coherent group of independent facts that can be true separately and do not imply one another.

Eibenreith authored content uses a mandatory bucket architecture. Rooms are installed through `enter_room(location, entry, look, exits, bucket)`. The active choice surface collects choices in this order: moment, room entry/look, exits, episode, game. Chosen atomic content ends with `-> TURN`; bucket/provider knots end with `-> DONE`. Authored chapter files must not call the internal `provide_choices` implementation directly.

`helpers.ink` owns global helper variables, helper functions, `TURN`, and active choice-surface dispatch. `buckets.ink` owns the game-wide bucket. Even when empty, `game_bucket` remains a real content bucket and must stay available for cross-episode game material.

Companion-aware dialogue should use Ink helpers instead of repeating location checks. `present(character)` checks whether an NPC is in the current room. `alone()` is true when no tracked NPC is present. `alone_with(character)` is true when exactly that tracked NPC is present, and is intended for private dialogue options.
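
The three presence helpers differ only in how they count the tracked NPCs in the current room. A minimal TypeScript model of that logic follows; the `PresenceModel` shape, the function signatures, and the room and character names in the usage comment are assumptions for illustration, not the Ink implementation.

```ts
// Hypothetical data model: which room the player and each tracked NPC are in.
interface PresenceModel {
  playerRoom: string;
  npcRooms: Record<string, string>; // character name -> room name
}

// Names of all tracked NPCs in the player's current room.
function npcsPresent(model: PresenceModel): string[] {
  return Object.keys(model.npcRooms).filter(
    (npc) => model.npcRooms[npc] === model.playerRoom
  );
}

// present(character): the NPC is in the current room.
function present(model: PresenceModel, character: string): boolean {
  return npcsPresent(model).indexOf(character) >= 0;
}

// alone(): no tracked NPC is present.
function alone(model: PresenceModel): boolean {
  return npcsPresent(model).length === 0;
}

// alone_with(character): exactly that tracked NPC is present, and nobody else.
function aloneWith(model: PresenceModel, character: string): boolean {
  const here = npcsPresent(model);
  return here.length === 1 && here[0] === character;
}
```

With a hypothetical model where only Viktor shares the player's room, `present(model, "Viktor")` and `aloneWith(model, "Viktor")` both hold; as soon as a second tracked NPC enters, only `present` remains true, which is why private dialogue should gate on `alone_with`.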

Player-choice impact uses three distinct mechanisms. Cascades use semantic state chains when a choice changes the route, episode outcome, or later structure. Callbacks use named facts for exact remembered choices. Heuristics use route counters and relationship-matrix queries to color tone or summarize repeated patterns. Do not use a route heuristic when the later text needs to remember one specific earlier line.

When multiple choices from one prioritized family can appear on the same choice surface, use `claim_choice_gate(gate)` to allow only the first valid item in source order. This is mainly for `#auto` families such as Viktor return comments. The helper is transient and resets at the start of every `provide_choices`; it must not be used as story memory.
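
The gate behaves like a per-pass "first claimant wins" set. The sketch below models that behavior in TypeScript under stated assumptions: `ChoiceGates`, `Candidate`, and `collectGated` are hypothetical names, and the real helper is an Ink function called from choice conditions.

```ts
// Hypothetical model of a transient choice gate; the real helper lives in Ink.
class ChoiceGates {
  private claimed: Record<string, boolean | undefined> = Object.create(null);

  // Called at the start of every choice-surface pass: gates carry no story memory.
  reset(): void {
    this.claimed = Object.create(null);
  }

  // Returns true only for the first caller per gate within one pass.
  claim(gate: string): boolean {
    if (this.claimed[gate] === true) return false;
    this.claimed[gate] = true;
    return true;
  }
}

// Hypothetical candidate choice: `valid` stands for its ordinary Ink conditions.
interface Candidate {
  text: string;
  gate: string;
  valid: boolean;
}

// Keeps, per gate, only the first valid candidate in source order.
function collectGated(candidates: Candidate[], gates: ChoiceGates): string[] {
  gates.reset();
  return candidates
    .filter((c) => c.valid && gates.claim(c.gate)) // invalid items never claim
    .map((c) => c.text);
}
```

Checking validity before claiming matters: an item whose own conditions fail must not consume the gate, otherwise a later valid item of the same family would be suppressed.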

## Choice Text Perspective

Choice text must describe the player character's intention before the action is taken. Do not write choices from a post-hoc author perspective that reveals what the branch will discover. For example, use "try the door" before the destination is known, not "go to the second-class cars"; use automatic or hidden events for things the player character cannot control, such as the train entering a tunnel.

## Engine Selection And Commands

`DEFAULT_GAME_ENGINE` in `.env` selects the engine used by:

```text
npm run dev
npm run start
```

Supported values are `ink`, `yaml`, and `zcode`.
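
A launcher could validate the configured value before starting a server. The following sketch assumes a plain record standing in for the parsed `.env` contents; `resolveDefaultEngine` is a hypothetical helper, and treating a missing or unsupported value as an error (instead of falling back silently) is an assumption of this sketch.

```ts
type EngineName = "ink" | "yaml" | "zcode";

const SUPPORTED_ENGINES: readonly EngineName[] = ["ink", "yaml", "zcode"];

// Resolves the default engine from an environment-like record.
// Assumption: a missing or unsupported value is an error, not a silent fallback.
function resolveDefaultEngine(env: Record<string, string | undefined>): EngineName {
  const requested = env["DEFAULT_GAME_ENGINE"];
  for (const engine of SUPPORTED_ENGINES) {
    if (engine === requested) return engine;
  }
  throw new Error(
    `DEFAULT_GAME_ENGINE must be one of ${SUPPORTED_ENGINES.join(", ")}; got ${String(requested)}`
  );
}

// Example with a literal record standing in for the parsed `.env` file.
const selectedEngine = resolveDefaultEngine({ DEFAULT_GAME_ENGINE: "ink" }); // "ink"
```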

Engine-specific commands bypass the default:

```text
npm run dev:ink
npm run dev:yaml
npm run dev:zcode
npm run start:ink
npm run start:yaml
npm run start:zcode
```

`dev:*` runs TypeScript through `ts-node` and `nodemon`. `start:*` runs compiled JavaScript from `dist/` and builds first through `prestart:*`. `*:debug` enables the engine's debug environment flag. `*:inspect` starts Node inspector and currently also enables debug for that engine.

The CLI path is YAML-only and uses `src/index.ts --cli`. It is useful for testing the YAML `GameRunner` without the browser UI. The old `test-server-yaml.ts` is a legacy static/YAML harness and should be removed once no workflow depends on it.

## Shared Server Protocol

All engines communicate with the browser through Socket.IO and the same game API:

```text
newGame()
loadGame(slot)
saveGame(slot)
hasSaveGame(slot)
getSaveGames()
isGameRunning()
chooseChoice(index)
```

The Ink engine additionally supports browser-owned session recovery:

```text
resumeGame(savedInkState)
exportGameState()
```

`exportGameState()` returns the current Ink state without creating a server-side save slot. The client stores that state with story history, choices, input mode, and media state in IndexedDB. `resumeGame(savedInkState)` rehydrates a fresh server-side InkEngine after a socket reconnect or browser reload without emitting duplicate narrative. This keeps durable player-specific state client-side for hosted multi-client Ink deployments.

Line-input engines also use `playerCommand` for free text.

Every engine emits `TurnResult` objects:

```ts
interface TurnResult {
  turnId: number;
  paragraphs: Array<{ text: string; tags?: StoryTag[] }>;
  choices: ChoiceResult[];
  inputMode: 'text' | 'choice' | 'end' | 'none';
  globalTags?: StoryTag[];
  gameState?: {
    score?: number;
    endState?: { type: 'intended' | 'error'; message?: string };
  };
  suggestions?: string[];
}
```

The browser consumes structured `TurnResult` data only. YAML and Z-code servers must parse or synthesize the same tag objects that Ink exposes through native tags.
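
Because every engine must produce the same structure, a client could check incoming payloads before handing them to the renderer. The sketch below is one possible runtime guard; the `StoryTag` and `ChoiceResult` shapes are not defined in this specification, so the minimal versions used here are assumptions, as is the `isTurnResult` helper itself.

```ts
// Assumed minimal shapes: this specification does not define StoryTag or ChoiceResult.
interface StoryTag { key: string; value?: string; }
interface ChoiceResult { index: number; text: string; }

type TurnInputMode = "text" | "choice" | "end" | "none";

interface TurnResult {
  turnId: number;
  paragraphs: Array<{ text: string; tags?: StoryTag[] }>;
  choices: ChoiceResult[];
  inputMode: TurnInputMode;
  globalTags?: StoryTag[];
  gameState?: {
    score?: number;
    endState?: { type: "intended" | "error"; message?: string };
  };
  suggestions?: string[];
}

const INPUT_MODES: readonly string[] = ["text", "choice", "end", "none"];

// Runtime check of the required TurnResult fields (optional fields are not inspected).
function isTurnResult(value: unknown): value is TurnResult {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const paragraphs = candidate["paragraphs"];
  const inputMode = candidate["inputMode"];
  return (
    typeof candidate["turnId"] === "number" &&
    Array.isArray(paragraphs) &&
    paragraphs.every(
      (p) =>
        typeof p === "object" &&
        p !== null &&
        typeof (p as Record<string, unknown>)["text"] === "string"
    ) &&
    Array.isArray(candidate["choices"]) &&
    typeof inputMode === "string" &&
    INPUT_MODES.indexOf(inputMode) >= 0
  );
}

// A hypothetical choice turn as any engine might emit it.
const sampleTurn: unknown = {
  turnId: 1,
  paragraphs: [{ text: "The train enters a tunnel." }],
  choices: [{ index: 0, text: "Try the door" }],
  inputMode: "choice",
};
const accepted = isTurnResult(sampleTurn); // true
```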

## Game Engines

### YAML Engine

- Config: `config/engines/yaml.json`
- Server: `src/server-yaml.ts`
- World model: `data/worlds/*.yml`
- CLI entry: `src/index.ts --cli`

The YAML engine is no longer the architectural default; it is one engine beside Ink and Z-code. It uses `GameRunner`, `GameEngine`, and `YamlWorldParser`, emits `inputMode: 'text'`, and remains the best test bed for deterministic world-model plus LLM command interpretation.

### Ink Engine

- Config: `config/engines/ink.json`
- Server: `src/server-ink.ts`
- Engine: `src/engine/ink-engine.ts`
- Source: `data/ink-src/eibenreith/main.ink` plus included chapter files.
- Compiled output: `data/ink/eibenreith.ink.json`

The Ink server compiles source at startup using `inkjs/full`, then runs the compiled story with `inkjs`. Ink choices become `ChoiceResult` objects. Ink tags become shared `StoryTag` objects. Choice preview tags support `#key`, `#letter`, `#optional`, `#action`, `#gated`, `#sort`, and `#auto`.

The server keeps only ephemeral per-socket InkEngine instances. Browser IndexedDB owns durable Ink saves and the current autosave. If the socket reconnects or the page reloads, the browser sends the autosaved Ink state to `resumeGame()` and restores rendered history locally.

Ink does not provide arbitrary string input as a native async primitive comparable to choices. Future text-input turns should be implemented through a tag such as `#input[name](prompt)`: the server returns `inputMode: 'text'`, the UI shows command input for one round, then the server stores the submitted string into an Ink variable and continues.
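
Since the `#input[name](prompt)` tag is a proposal and not an implemented feature, the following is only a sketch of how a server might recognize it. `InputRequest`, `parseInputTag`, `storeInput`, and the variable name `player_name` are hypothetical, and the plain record stands in for the Ink variable store.

```ts
// Hypothetical parsed form of a `#input[name](prompt)` tag.
interface InputRequest {
  variable: string;
  prompt: string;
}

// Assumption: the name is an identifier and the prompt contains no ")".
const INPUT_TAG = /^#input\[([A-Za-z_][A-Za-z0-9_]*)\]\(([^)]*)\)$/;

// Returns the requested variable and prompt, or null for any other tag.
function parseInputTag(tag: string): InputRequest | null {
  const match = INPUT_TAG.exec(tag.trim());
  if (match === null) return null;
  const [, variable = "", prompt = ""] = match;
  return { variable, prompt };
}

// Stand-in for writing the submitted string into the named Ink variable.
function storeInput(
  variables: Record<string, string>,
  request: InputRequest,
  submitted: string
): void {
  variables[request.variable] = submitted;
}

// One round trip: parse the tag, then store what the player typed.
const nameRequest = parseInputTag("#input[player_name](What is your name?)");
const inkVariables: Record<string, string> = {};
if (nameRequest !== null) storeInput(inkVariables, nameRequest, "Ada");
```

In this flow the server would answer the turn containing the tag with `inputMode: 'text'`, wait for one submission, store it as shown, and then continue the story.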

### Z-code Engine

- Config: `config/engines/zcode.json`
- Server: `src/server-zcode.ts`
- Engine: `src/engine/zcode-llm-engine.ts`
- Story file: `data/z-code/zork1.bin` by default.
- Prompt templates: `data/zcode-prompts/*.yml`

The engine name is Z-code. Zork I is only the current game file and prompt target. The current implementation runs a Z-machine story through `ifvms`, keeps Z-machine state authoritative, and uses an LLM to translate natural-language input into parser commands and rewrite raw Z-machine output into prose.

Future work should separate Z-code-generic logic from Zork-specific prompt content more clearly.

## Client Module System

The browser client uses native ES modules without a bundler. The loader imports modules, analyzes dependency declarations, initializes modules in dependency order, tracks state/progress, and hides the loading overlay only when initialization and progress exit animations are complete.

Rules:

- Every app module extends `BaseModule`.
- Every app module registers with `moduleRegistry`.
- Required dependencies must be listed in `dependencies`.
- Modules should use authoritative dependencies instead of local fallbacks.
- Do not add fallback paths to hide bad dependency declarations or ordering bugs.
- `setTimeout` must not paper over initialization races. It is acceptable for animation, debounce, throttle, and browser rendering timing when locally justified.
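
Initializing in dependency order amounts to a topological sort over the declared `dependencies`. The sketch below shows one way to compute such an order and to fail loudly on bad declarations, in line with the rules above. It is not the actual `loader.js` logic: `ModuleInfo`, `initializationOrder`, and the example dependency edges are assumptions.

```ts
// Hypothetical module descriptor: only `dependencies` is named in the specification.
interface ModuleInfo {
  id: string;
  dependencies: string[];
}

// Returns module ids so that every module comes after all of its dependencies.
// Throws on unknown or cyclic dependencies instead of falling back silently.
function initializationOrder(modules: ModuleInfo[]): string[] {
  const byId: Record<string, ModuleInfo | undefined> = Object.create(null);
  for (const mod of modules) byId[mod.id] = mod;

  const state: Record<string, "visiting" | "done" | undefined> = Object.create(null);
  const order: string[] = [];

  // Depth-first visit; `path` records the chain of dependents for diagnostics.
  const visit = (id: string, path: string[]): void => {
    const mod = byId[id];
    if (mod === undefined) {
      throw new Error(`Unknown dependency "${id}" required by ${path.join(" -> ")}`);
    }
    if (state[id] === "done") return;
    if (state[id] === "visiting") {
      throw new Error(`Dependency cycle: ${path.concat(id).join(" -> ")}`);
    }
    state[id] = "visiting";
    for (const dep of mod.dependencies) visit(dep, path.concat(id));
    state[id] = "done";
    order.push(id);
  };

  for (const mod of modules) visit(mod.id, []);
  return order;
}

// Hypothetical dependency edges between three client modules.
const exampleOrder = initializationOrder([
  { id: "ui-controller", dependencies: ["socket-client", "text-processor"] },
  { id: "socket-client", dependencies: ["text-processor"] },
  { id: "text-processor", dependencies: [] },
]);
// exampleOrder is ["text-processor", "socket-client", "ui-controller"]
```

Throwing on a cycle or an undeclared module surfaces the declaration bug directly, which is the behavior the rules ask for instead of timeouts or fallback paths.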

Core modules:

- `loader.js`: module script loading, progress UI, dependency diagnostics.
- `module-registry.js`: registration and readiness promises.
- `base-module.js`: lifecycle, progress, state, event cleanup.

Primary client responsibilities:

- Text and typography: `text-processor`, `paragraph-layout`, `layout-renderer`.
- Markup: `markup-parser`.
- Queue/playback: `text-buffer`, `sentence-queue`, `playback-coordinator`, `animation-queue`.
- Audio/TTS: `audio-manager`, `tts-factory`, provider modules.
- UI: `ui-controller`, `ui-display-handler`, `ui-input-handler`, `choice-display`, `options-ui`, `ui-effects`.
- Persistence/history: `persistence-manager`, `story-history`.
- Networking: `socket-client`.

Known cleanup candidates: `debug-utils-module.js` is not loaded; `game-loop-module.js` still contains high-level glue from older architecture and should be audited before removal.

## Text Pipeline

Processing order:

1. Receive structured blocks and tags from a game engine.
2. Parse inline story markup and remove media markers from display/TTS text.
3. Apply Markdown emphasis.
4. Apply locale-aware SmartyPants typography.
5. Apply Hyphenopoly for the game metadata language.
6. Measure text using the exact page font settings.
7. Run Knuth-Plass line breaking.
8. Render absolutely positioned words into the page line-coordinate model.
9. Animate words in sync with measured TTS duration or estimated duration.

The external Knuth-Plass library should not be locally modified. Adaptation belongs in our modules.

## Right Page Layout And History

The right page is a virtual line-addressed content pane:

- `#page_right` does not use native scrolling.
- Page height is divided into `PAGE_LINE_COUNT = 25`.
- All block heights, margins, image spacing, and chapter/section spacing are exact line multiples.
- Stored block positions are line coordinates, not pixels.
- Window resize recalculates pixels from line coordinates.
- New content appends at the live bottom.
- Manual scrolling moves the active line and keeps a window of nearby blocks loaded.
- The custom scrollbar represents virtual line history, not DOM scroll state.

Portrait images may overlap line ranges with text next to them, but edges must still land on line boundaries.
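
Storing line coordinates and deriving pixels on demand keeps a resize cheap: only the line height changes. A minimal sketch of that conversion follows, assuming a hypothetical `LineBlock` shape and helper names; only `PAGE_LINE_COUNT = 25` comes from this specification.

```ts
const PAGE_LINE_COUNT = 25;

// Pixel height of one virtual line for the current page height.
function lineHeightPx(pageHeightPx: number): number {
  return pageHeightPx / PAGE_LINE_COUNT;
}

// Hypothetical stored block: position and size are whole line counts, never pixels.
interface LineBlock {
  startLine: number; // absolute line index in the virtual history
  lineSpan: number;  // block height in lines
}

// Converts a block to pixel geometry relative to the first visible line.
function blockToPixels(
  block: LineBlock,
  firstVisibleLine: number,
  pageHeightPx: number
): { top: number; height: number } {
  const unit = lineHeightPx(pageHeightPx);
  return {
    top: (block.startLine - firstVisibleLine) * unit,
    height: block.lineSpan * unit,
  };
}

// The same stored block before and after a resize from 1000px to 750px page height.
const storedBlock: LineBlock = { startLine: 30, lineSpan: 3 };
const beforeResize = blockToPixels(storedBlock, 25, 1000); // { top: 200, height: 120 }
const afterResize = blockToPixels(storedBlock, 25, 750);   // { top: 150, height: 90 }
```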

## Markup And Tags

Canonical tag syntax:

```text
#key
#key[value]
#key[value](options)
#key:value
```
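
The four canonical forms can be recognized with a single pattern. The sketch below is an assumption about how such parsing might look, not the `markup-parser` implementation: `ParsedTag`, `TAG_PATTERN`, and `parseTag` are hypothetical names, and the pattern also accepts combined forms such as `#key(options)` and `#key:value(options)`.

```ts
// Hypothetical parsed representation of one tag.
interface ParsedTag {
  key: string;
  value?: string;
  options?: string;
}

// "#key", then optionally ":value" or "[value]", then optionally "(options)".
// Assumption: keys start with a letter and may contain letters, digits, "_" and "-".
const TAG_PATTERN = /^#([A-Za-z][\w-]*)(?::([^\s\[\]()]+)|\[([^\]]*)\])?(?:\(([^)]*)\))?$/;

// Parses one tag string; returns null when it matches none of the canonical forms.
function parseTag(source: string): ParsedTag | null {
  const match = TAG_PATTERN.exec(source.trim());
  if (match === null) return null;
  const parsed: ParsedTag = { key: match[1] ?? "" };
  // The colon form and the bracket form both populate `value`.
  const value = match[2] !== undefined ? match[2] : match[3];
  if (value !== undefined) parsed.value = value;
  if (match[4] !== undefined) parsed.options = match[4];
  return parsed;
}

const imageTag = parseTag("#image[tunnel.jpg](landscape pause=2)");
// imageTag is { key: "image", value: "tunnel.jpg", options: "landscape pause=2" }
```

Option strings are returned unparsed here; splitting them into flags and `name=value` pairs would be a separate step.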

Supported story tags include:

- `#chapter[Title]`
- `#section` / `#textblock`
- `#image[file](landscape|portrait|square pause=2)`
- `#sfx[file](max=8 fade fade-duration=2)`
- `#music[file](crossfade loop lead=4)`
- `#gloss[term](definition)`
- `#tts[instruction]`
- `#tts(instruction)`
- `#tts[provider](instruction)` / `#tts-openai[instruction]`
- `#score[...]`
- `#error[...]`
- `#achievement[...]`
- `#alert[...]`

Choice tags:

- `#key:x` or `#key[x]`
- `#letter[x]`
- `#optional`
- `#action[name]`
- `#auto`, `#auto(2)`, `#auto:keyword`, `#auto:keyword(2)`

The active choice UI is one list. Explicit keys are reserved first, then remaining choices receive `1` through `9` and `0`, then `A` through `Z`.
Before key assignment, choices are ordered by invisible `#action` groups. The first appearance of each action group in the authored list determines group order. Choices inside each group are randomized for presentation. Choices without an action group form one final group shown last. Group labels are not displayed.
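
The ordering and key assignment rules can be sketched as two small functions. This is an illustration under stated assumptions, not the `choice-display` implementation: all names are hypothetical, the final ungrouped group is assumed to be shuffled like the others, and explicit keys are assumed to be matched case-insensitively against the automatic keys.

```ts
// Hypothetical choice shape: `key` and `action` mirror the #key and #action choice tags.
interface ChoiceInput { text: string; key?: string; action?: string; }
interface KeyedChoice { text: string; key: string; }

// Automatic keys in assignment order: digits in keyboard order, then letters.
const AUTO_KEYS: string[] = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ".split("");

// Unbiased Fisher-Yates shuffle on a copy of the input.
function shuffled<T>(items: T[]): T[] {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    const held = copy[i] as T;
    copy[i] = copy[j] as T;
    copy[j] = held;
  }
  return copy;
}

// Orders choices by action group (first appearance wins), with ungrouped choices last.
// `shuffle` is injectable so tests can replace the random in-group order.
function orderByActionGroup(
  choices: ChoiceInput[],
  shuffle: (group: ChoiceInput[]) => ChoiceInput[] = shuffled
): ChoiceInput[] {
  const groupOrder: string[] = [];
  const groups: Record<string, ChoiceInput[] | undefined> = Object.create(null);
  const ungrouped: ChoiceInput[] = [];
  for (const choice of choices) {
    if (choice.action === undefined) {
      ungrouped.push(choice);
      continue;
    }
    let group = groups[choice.action];
    if (group === undefined) {
      group = [];
      groups[choice.action] = group;
      groupOrder.push(choice.action);
    }
    group.push(choice);
  }
  const ordered: ChoiceInput[] = [];
  for (const action of groupOrder) {
    ordered.push(...shuffle(groups[action] ?? []));
  }
  // Assumption: the final ungrouped group is shuffled like any other group.
  ordered.push(...shuffle(ungrouped));
  return ordered;
}

// Reserves explicit keys first, then hands out the remaining automatic keys in order.
function assignKeys(ordered: ChoiceInput[]): KeyedChoice[] {
  const reserved: Record<string, boolean | undefined> = Object.create(null);
  for (const choice of ordered) {
    // Assumption: explicit keys are matched case-insensitively against automatic keys.
    if (choice.key !== undefined) reserved[choice.key.toUpperCase()] = true;
  }
  const free = AUTO_KEYS.filter((key) => reserved[key] !== true);
  let next = 0;
  return ordered.map((choice) => {
    if (choice.key !== undefined) return { text: choice.text, key: choice.key };
    const key = free[next];
    if (key === undefined) throw new Error("More choices than available keys");
    next += 1;
    return { text: choice.text, key };
  });
}
```

Calling `assignKeys(orderByActionGroup(choices))` yields the list in display order; passing an identity function as `shuffle` makes the result deterministic for testing.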

`#auto` marks an ordinary Ink choice that should not be rendered as a visible button. Auto choices still need a developer-facing bracket choice text, for example `[AUTO: Tunnelspiegelung]`, so the Ink remains testable in Inky. The browser selects the first ready auto choice when the choice surface becomes ready. Ink still owns availability and once-only behavior through normal choice syntax and conditions. A numeric parameter delays the trigger by UI choice turns since the last matching auto trigger. Without a keyword the delay is global; with a keyword it applies only to that keyword. Use global `#auto(n)` when different auto events must not happen back-to-back, and keyworded `#auto:name(n)` when only repeated events of the same class should be spaced out. Use the colon form for keyed auto tags on choice lines.
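
The delay rule can be modelled as a small tracker that remembers when auto choices last fired. The sketch below is an interpretation, not the client implementation: the names are hypothetical, "global" is read as "since the last auto trigger of any kind", and a choice is treated as ready once at least `delay` UI choice turns have passed.

```ts
// Hypothetical view of an auto choice: keyword and delay come from the #auto tag.
interface AutoChoice {
  index: number;
  keyword?: string;
  delay?: number;
}

class AutoTriggerTracker {
  private turn = 0; // counts UI choice turns
  private lastGlobal: number | null = null; // turn of the last auto trigger of any kind
  private lastByKeyword: Record<string, number | undefined> = Object.create(null);

  // Call once per UI choice turn.
  advanceTurn(): void {
    this.turn += 1;
  }

  // Ready when never triggered before, or when enough turns have passed (assumed rule).
  isReady(choice: AutoChoice): boolean {
    const delay = choice.delay ?? 0;
    const last =
      choice.keyword === undefined
        ? this.lastGlobal
        : (this.lastByKeyword[choice.keyword] ?? null);
    return last === null || this.turn - last >= delay;
  }

  // Selects the first ready auto choice in source order and records the trigger.
  selectFirstReady(choices: AutoChoice[]): AutoChoice | null {
    for (const choice of choices) {
      if (this.isReady(choice)) {
        this.lastGlobal = this.turn;
        if (choice.keyword !== undefined) this.lastByKeyword[choice.keyword] = this.turn;
        return choice;
      }
    }
    return null;
  }
}
```

Availability and once-only behavior are deliberately absent from this model because Ink owns them; the tracker only decides whether an already offered auto choice may fire now.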

TTS instruction tags are paragraph/block metadata. They are ignored by renderers and by providers that do not support per-request reading instructions. Providerless `#tts[...]` and `#tts(...)` are the default authoring forms; provider-specific forms are optional filters for provider overrides. OpenAI consumes matching instructions only for `gpt-4o-mini-tts`, where they are sent as the Speech API `instructions` field. Instructions should describe delivery, such as tone, emotion, intonation, pace, accent, whispering, humming, or singing style.

Markdown emphasis:

```text
*italic* or _italic_
**bold** or __bold__
***bold italic*** or ___bold italic___
```

## Audio, TTS, And Media

TTS providers currently include `none`, Browser Speech, Kokoro, ElevenLabs, OpenAI, and local OpenAI-compatible servers. Provider modules exist, but Browser Speech and Kokoro need focused validation before being considered production-ready.

TTS cache keys include provider, voice, provider speed value, language, and exact normalized TTS string. Fast-forward must accelerate visible animation and fade/stop active TTS without cancelling background generations unless the foreground block has been waiting long enough.
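
A cache key built from those five components might look like the sketch below. The `TtsRequest` shape and function names are hypothetical, and the normalization shown (trimming and collapsing whitespace) is an assumption, since the exact normalization rule is not defined in this section.

```ts
// Hypothetical request shape covering the five components of the cache key.
interface TtsRequest {
  provider: string;
  voice: string;
  speed: number; // provider-specific speed value
  language: string;
  text: string;
}

// Assumed normalization: trim and collapse whitespace runs into single spaces.
function normalizeTtsText(text: string): string {
  return text.trim().replace(/\s+/g, " ");
}

// Serializes the components as a JSON array so field boundaries stay unambiguous.
function ttsCacheKey(request: TtsRequest): string {
  return JSON.stringify([
    request.provider,
    request.voice,
    request.speed,
    request.language,
    normalizeTtsText(request.text),
  ]);
}
```

Using a JSON array instead of joining with a separator avoids collisions when a voice name or text happens to contain the separator character.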

Music and sound effects are preloaded when requested. Music can queue, crossfade, cut, loop, play once, and lead into following text. Music ducks by a persisted percentage during TTS playback.

## Documentation Source Of Truth

- `README.md`: usage, commands, changelog, concise feature summary.
- `SPECIFICATION.md`: architecture and behavior.
- `TODO.md`: active status and backlog.
- `MARKUP_GUIDELINES.md`: writing/authoring rules for story files.
- `THIRD_PARTY_NOTICES.md` and `public/THIRD_PARTY_NOTICES.md`: license/credits material.