# AI Interactive Fiction

AI Interactive Fiction is a web and CLI text adventure prototype that combines a deterministic world model with LLM-assisted command interpretation and narrative output. The web client presents the story as an animated, novel-like book page with synchronized text animation, optional TTS, music, and sound effects.

## Quick Start

Use Node.js 22 LTS for development. The project accepts Node >= 18.17, but current development has been done on Node 22.

```powershell
nvm install 22
nvm use 22
npm install
npm run build
npm run dev
```

`npm run dev` and `npm run start` use `DEFAULT_GAME_ENGINE` from `.env` to choose the active engine. Supported values are `ink`, `yaml`, and `zcode`. The engine-specific scripts remain available when you want to bypass the default.

Set `PORT` to choose a port; the server will try the next few ports if the requested one is already in use. Current engine defaults are YAML `3001`, Z-code `3002`, and Ink `3003` before port fallback.

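
The port selection described above can be sketched as a pure function that lists the ports to try in order. This is a minimal sketch, not the server's actual code: the names `candidatePorts` and `DEFAULT_PORTS` and the `attempts` count are assumptions, and only the three engine defaults come from this README.

```typescript
// Engine defaults are taken from the text above; everything else is assumed.
type Engine = "ink" | "yaml" | "zcode";

const DEFAULT_PORTS: Record<Engine, number> = { yaml: 3001, zcode: 3002, ink: 3003 };

// Returns the ports to try, in order: the preferred port, then the next few.
// `attempts` is a hypothetical knob; the real retry count is not documented here.
function candidatePorts(engine: Engine, envPort: string | undefined, attempts = 5): number[] {
  const parsed = envPort === undefined ? NaN : Number(envPort);
  const valid =
    isFinite(parsed) && Math.floor(parsed) === parsed && parsed > 0 && parsed <= 65535;
  // Fall back to the engine default when PORT is missing or not a usable port number.
  const first = valid ? parsed : DEFAULT_PORTS[engine];
  const ports: number[] = [];
  for (let i = 0; i < attempts && first + i <= 65535; i++) {
    ports.push(first + i);
  }
  return ports;
}
```

A server could walk this list and bind to the first port that is free; how the real server detects a busy port is not covered here.
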
## Commands

```powershell
npm run dev        # Start the web UI through ts-node/nodemon
npm run start      # Build/run the configured default engine from dist/
npm run dev:ink    # Start the Ink engine server, watch ink source, compile on restart
npm run dev:yaml   # Start the YAML engine server
npm run dev:zcode  # Start the Z-code engine server
npm run start:ink  # Build and run the compiled Ink engine server
npm run build      # Compile TypeScript
npm run test       # Run Jest tests
npm run lint       # Run ESLint on src/
npm run start:cli  # Run the CLI interface
npm run dev:cli    # Run the CLI interface through ts-node/nodemon
```

Each game engine also has `:debug` and `:inspect` variants. `:debug` enables engine-specific diagnostic logging. `:inspect` starts Node with the inspector and currently also enables that engine's debug flag, so it is the combined debug-plus-inspector mode.

## Docker / Coolify Ink Deployment

The included `Dockerfile` builds and serves the Ink engine only. Coolify can use the repository Dockerfile directly.

Set the Coolify environment variables from `coolify.env.example`; at minimum:

```text
NODE_ENV=production
DEFAULT_GAME_ENGINE=ink
PORT=3000
INK_CONFIG_FILE=./config/engines/ink.json
```

The container compiles TypeScript during image build and compiles the configured Ink source to JSON when the server starts.

## Configuration

Environment variables are loaded from `.env`.

- `PORT`: preferred web server port.
- `DEFAULT_GAME_ENGINE`: engine used by `npm run dev` and `npm run start`; one of `ink`, `yaml`, or `zcode`.
- `DEFAULT_WORLD_FILE`: YAML world file to load. Defaults to `./data/worlds/example_world.yml`.
- `OPENROUTER_API_KEY`: API key for LLM command interpretation.
- `OPENROUTER_MODEL`: OpenRouter model name.

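
As an illustration of how the variables listed above might be validated at startup, here is a small sketch. The `AppConfig` shape and the `readConfig` and `isGameEngine` helpers are hypothetical, and throwing on an unknown engine is an assumption; only the variable names, the allowed engine values, and the world-file default come from this README.

```typescript
type GameEngine = "ink" | "yaml" | "zcode";

// Hypothetical config shape; the project's real types may differ.
interface AppConfig {
  engine: GameEngine;
  worldFile: string;
  port: number | undefined;
  openRouterApiKey: string | undefined;
  openRouterModel: string | undefined;
}

function isGameEngine(value: string): value is GameEngine {
  return value === "ink" || value === "yaml" || value === "zcode";
}

// Takes a plain key/value map so it can be fed process.env or a test fixture.
function readConfig(env: Record<string, string | undefined>): AppConfig {
  const engine = (env["DEFAULT_GAME_ENGINE"] || "").trim();
  if (!isGameEngine(engine)) {
    // Assumption: an unknown engine is treated as a hard configuration error.
    throw new Error('DEFAULT_GAME_ENGINE must be "ink", "yaml", or "zcode"');
  }
  const port = Number(env["PORT"]);
  return {
    engine,
    worldFile: env["DEFAULT_WORLD_FILE"] || "./data/worlds/example_world.yml",
    port: env["PORT"] && isFinite(port) ? port : undefined,
    openRouterApiKey: env["OPENROUTER_API_KEY"],
    openRouterModel: env["OPENROUTER_MODEL"],
  };
}
```
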
TTS provider settings are configured in the browser options menu and persisted in browser storage. Providers currently include `none`, browser speech synthesis, Kokoro, ElevenLabs, OpenAI, and local OpenAI-compatible servers. Production should not assume a universal TTS default; the game or player state selects the active mode, and `none` is the safe fallback.

## Starting A Game

The web client no longer starts the game automatically. Browsers require a user gesture before audio playback, so the right page initially shows a start prompt and the command input is hidden. Use `new game` or `load` in the top bar to start.

The placeholder server API supports:

- `newGame()`
- `loadGame(slot)`
- `saveGame(slot)`
- `hasSaveGame(slot)`
- `getSaveGames()`
- `isGameRunning()`

Save slots are positive integers. Save behavior is engine-specific. The Ink client/server path persists Ink state, client history, choices, media state, and playback position for browser save/load. YAML and Z-code persistence still need regression testing and cleanup.

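
To make the shape of the placeholder API concrete, the sketch below types the six calls listed above and backs them with an in-memory store. The method names come from this README; the return types, the synchronous style, the `InMemoryGameApi` class, and the rule that saving fails when no game is running are all assumptions for illustration.

```typescript
// Method names are from the README; signatures are assumed.
interface GameApi {
  newGame(): void;
  loadGame(slot: number): boolean;
  saveGame(slot: number): boolean;
  hasSaveGame(slot: number): boolean;
  getSaveGames(): number[];
  isGameRunning(): boolean;
}

// Save slots are positive integers, so reject anything else early.
function requireValidSlot(slot: number): void {
  if (!isFinite(slot) || Math.floor(slot) !== slot || slot < 1) {
    throw new RangeError("Save slot must be a positive integer, got " + slot);
  }
}

// Hypothetical stand-in: a turn counter plays the role of real engine state.
class InMemoryGameApi implements GameApi {
  private running = false;
  private turn = 0;
  private saves: Record<number, number | undefined> = {};

  newGame(): void {
    this.running = true;
    this.turn = 0;
  }

  saveGame(slot: number): boolean {
    requireValidSlot(slot);
    if (!this.running) return false; // nothing to save yet
    this.saves[slot] = this.turn;
    return true;
  }

  loadGame(slot: number): boolean {
    requireValidSlot(slot);
    const saved = this.saves[slot];
    if (saved === undefined) return false;
    this.turn = saved;
    this.running = true;
    return true;
  }

  hasSaveGame(slot: number): boolean {
    requireValidSlot(slot);
    return this.saves[slot] !== undefined;
  }

  getSaveGames(): number[] {
    return Object.keys(this.saves).map(Number).sort((a, b) => a - b);
  }

  isGameRunning(): boolean {
    return this.running;
  }
}
```
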
## Web Client

The browser app is built from native ES modules in `public/js/`. The loader dynamically imports modules, applies a cache-busting query string during development, resolves declared dependencies, and awaits module initialization in dependency order before the UI becomes usable.

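
Initializing modules in dependency order is a topological sort over the declared dependencies. The following is a generic sketch of that idea, not the code in `loader.js`: `ModuleSpec` and `resolveInitOrder` are hypothetical names, and the real loader also handles dynamic imports and progress reporting.

```typescript
// Hypothetical description of a module and the modules it declares as dependencies.
interface ModuleSpec {
  name: string;
  dependencies: string[];
}

// Depth-first topological sort: every module appears after all of its dependencies.
function resolveInitOrder(modules: ModuleSpec[]): string[] {
  const byName: Record<string, ModuleSpec | undefined> = Object.create(null);
  for (const mod of modules) byName[mod.name] = mod;

  const state: Record<string, "visiting" | "done" | undefined> = Object.create(null);
  const order: string[] = [];

  function visit(name: string, path: string[]): void {
    if (state[name] === "done") return;
    const trail = path.concat(name);
    if (state[name] === "visiting") {
      throw new Error("Dependency cycle: " + trail.join(" -> "));
    }
    const mod = byName[name];
    if (mod === undefined) throw new Error('Unknown module "' + name + '"');
    state[name] = "visiting";
    for (const dep of mod.dependencies) visit(dep, trail);
    state[name] = "done";
    order.push(name);
  }

  for (const mod of modules) visit(mod.name, []);
  return order;
}
```

A loader built on this would await each module's initialization in the returned order, so no module starts before its dependencies are ready.
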
Major modules:

- `module-registry.js`, `base-module.js`, `loader.js`: module lifecycle, dependency graph, progress overlay, state reporting.
- `text-processor-module.js`, `paragraph-layout-module.js`, `layout-renderer-module.js`: SmartyPants, language-aware hyphenation, Knuth-Plass line breaking, DOM rendering.
- `markup-parser-module.js`: story markup fallback for chapters, sections, Markdown emphasis, right-page glossary notes, images, SFX, and music.
- `sentence-queue-module.js`, `playback-coordinator-module.js`, `animation-queue-module.js`: sentence preparation, synchronized playback, timing, fast-forward.
- `tts-factory-module.js` plus provider modules: TTS provider selection, voice settings, speed mapping, caching, and playback.
- `audio-manager-module.js`: master, speech, music, and sound-effect volume; music playback; sound effects; and music ducking.
- `ui-controller-module.js`, `ui-display-handler-module.js`, `ui-input-handler-module.js`, `options-ui-module.js`: book UI, command input, options, top-bar controls, and game API calls.
- `choice-display-module.js`: choice-mode UI, click selection, keyboard-letter assignment, and future choice-template routing.

The static server sends no-cache headers for local development so stale ES modules do not mask changes. If the browser console shows `onpage-dialog.preload.js:121 Uncaught ReferenceError: browser is not defined`, ignore it; that comes from the installed ad blocker, not this project.

## Story Markup

Plain text is rendered paragraph by paragraph. Ordinary follow-on paragraphs get a first-line indent and no blank line between them. Special block markers change the treatment of the next paragraph.

Inline Markdown emphasis:

```text
*italic* or _italic_
**bold** or __bold__
***bold italic*** or ___bold italic___
```

Right-page glossary notes:

```text
The train stops at Eibenreith.
#gloss[Eibenreith](A fictional alpine town in the Kaiserpunk setting.)
```

Glossary markup is a normal story tag scoped to the paragraph/block it is attached to. The UI finds every matching visible instance of the term in that right-page block and adds a hover/focus note. The tag itself is not displayed, is not sent to TTS, and is ignored by choices and command history. Avoid raw Ink control characters in the explanation; `|`, `{`, and `}` must be escaped in Ink as `\|`, `\{`, and `\}` if they are needed literally.

TTS reading instructions:

```text
„Ich habe nichts gesehen“, sagt Viktor.
#tts[Read softly, with controlled unease.]
```

`#tts[...]` is scoped to the paragraph/block it is attached to and is sent only to providers that support per-request reading instructions. This providerless form is the normal authoring style; `#tts(...)` is equivalent if parentheses read better. Provider-specific forms are also accepted for overrides, for example `#tts[openai](Read softly.)` or `#tts-openai[Read softly.]`. Currently only OpenAI `gpt-4o-mini-tts` consumes the instruction.

Write TTS instructions as concise performance direction: tone, emotion, intonation, pace, accent, or whispering/singing style. Keep the spoken words in the paragraph itself and use the tag only to guide delivery.

Canonical block/media/control tags use Ink-style `#` syntax. In Ink these are real Ink tags. In YAML and Z-code narrative output, leading `#...` lines are parsed by the server into the same structured `StoryTag` objects before reaching the client. The browser only consumes structured `TurnResult` objects.

Tag format:

```text
#key
#key[value]
#key[value](options)
#key:value
```

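
The four tag forms above are regular enough to parse with one pattern. The sketch below is illustrative only: `ParsedTag` and `parseTagLine` are hypothetical names, the result is not the project's `StoryTag` type, and variants mentioned elsewhere in this README (such as `#tts(...)` or `#auto:keyword(2)`) are not given special handling.

```typescript
// Hypothetical result shape; the project's StoryTag may carry different fields.
interface ParsedTag {
  key: string;
  value?: string;
  options?: string;
}

// Matches "#key", "#key[value]", "#key[value](options)", and "#key:value".
// Whitespace after "#" is tolerated, as in "# letter[x]".
const TAG_PATTERN = /^#\s*([A-Za-z][\w-]*)(?:\[([^\]]*)\](?:\(([^)]*)\))?|:(.*))?$/;

function parseTagLine(line: string): ParsedTag | null {
  const match = TAG_PATTERN.exec(line.trim());
  if (!match) return null; // not a tag line
  const [, key, bracketValue, options, colonValue] = match;
  if (!key) return null;
  const tag: ParsedTag = { key };
  const value = bracketValue !== undefined ? bracketValue : colonValue;
  if (value !== undefined) tag.value = value.trim();
  if (options !== undefined) tag.options = options.trim();
  return tag;
}
```

For example, `parseTagLine("#image[seal.png](square lead=1.5)")` yields the key `image`, the value `seal.png`, and the raw option string `square lead=1.5`; interpreting the options is left to the consumer of each tag.
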
For Ink choices, put choice-local tags under the choice they belong to. Explicit keyboard letters are supported with `# letter[x]`, `#letter[x]`, or the colon form `#key:x`. The client reserves those keys first, then assigns the remaining visible choices from `1` through `0`, then `A` through `Z`, in visible order. `#optional` renders the choice in italic. `# action[name]` or `#action:name` assigns an invisible action group. Group order follows the first appearance of each action tag in the authored list, entries inside each group are randomized, and choices without an action tag are grouped last.

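
The key assignment described above is a two-pass process: reserve the explicit letters, then hand out the remaining keys in visible order. A minimal sketch follows. The names `ChoiceInput` and `assignChoiceKeys` are hypothetical, and three details are assumptions: the key pool is read as `1` to `9`, then `0`, then `A` to `Z`; explicit letters are uppercased; and a duplicate explicit letter falls back to the pool.

```typescript
// Hypothetical input shape: `letter` is the explicit key from a letter tag, if any.
interface ChoiceInput {
  text: string;
  letter?: string;
}

// Assumed reading of "1 through 0, then A through Z".
const KEY_POOL = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Returns one key per choice, in the same order; "" if the pool runs out.
function assignChoiceKeys(choices: ChoiceInput[]): string[] {
  const used: Record<string, boolean | undefined> = Object.create(null);
  const keys: string[] = choices.map(() => "");

  // Pass 1: reserve explicit letters (first claim wins; an assumption).
  choices.forEach((choice, i) => {
    const wanted = (choice.letter || "").trim().toUpperCase();
    if (wanted.length === 1 && !used[wanted]) {
      used[wanted] = true;
      keys[i] = wanted;
    }
  });

  // Pass 2: give every remaining choice the next free key from the pool.
  let next = 0;
  choices.forEach((_choice, i) => {
    if (keys[i] !== "") return;
    while (next < KEY_POOL.length && used[KEY_POOL.charAt(next)]) next++;
    if (next < KEY_POOL.length) {
      const key = KEY_POOL.charAt(next);
      keys[i] = key;
      used[key] = true;
      next++;
    }
  });

  return keys;
}
```
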
`#auto` marks an Ink choice that the browser should choose automatically instead of rendering as a visible button. Auto choices still need a developer-facing bracket choice text so they remain testable in Inky. `#auto(2)` waits two UI choice turns since the last global auto trigger. `#auto:keyword(2)` waits two UI choice turns only since the last auto trigger with the same keyword. Use the colon form for keyed auto tags on choice lines.

Chapter:

```text
#chapter[The Mysterious Mansion]

The first paragraph uses a drop cap and no first-line indent.

Following paragraphs use the normal paragraph indent.
```

The heading is centered, italic, and uses the same text face as the body. The first paragraph after a chapter marker is unindented and receives the drop cap treatment.

Section or text block:

```text
#section

The first paragraph starts a separated block without horizontal indent.

The following paragraph returns to the normal indent.
```

`#textblock` is treated the same way. The first paragraph after the marker is separated from previous content by one line of vertical space.

Images are story blocks:

```text
#image[mansion-rain.jpg](landscape)
#image[portrait-letter.jpg](portrait pause=2)
#image[seal.png](square lead=1.5)
```

Image file names are relative to `public/images/`. `landscape`/`widescreen` and `square` images are centered, near full page width, and line-snapped. `portrait` images sit beside prose at half page width. Image pauses (`pause=`, `delay=`, `lead=`, or a bare `2s`) are skippable and do not block background TTS preparation.

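
The image options above combine a layout keyword with an optional pause written in several equivalent ways. The sketch below shows one way to normalize them. `ImageOptions` and `parseImageOptions` are hypothetical names, and three behaviors are assumptions: `widescreen` is folded into `landscape`, the layout defaults to `landscape` when none is given, and unknown tokens are ignored.

```typescript
type ImageLayout = "landscape" | "square" | "portrait";

// Hypothetical normalized form of the text inside the parentheses.
interface ImageOptions {
  layout: ImageLayout;
  pauseSeconds: number;
}

function parseImageOptions(options: string): ImageOptions {
  // Assumed defaults: landscape layout and no pause.
  const result: ImageOptions = { layout: "landscape", pauseSeconds: 0 };
  const tokens = options.toLowerCase().split(/[\s,]+/);
  for (const token of tokens) {
    if (token === "landscape" || token === "widescreen") {
      result.layout = "landscape";
    } else if (token === "square") {
      result.layout = "square";
    } else if (token === "portrait") {
      result.layout = "portrait";
    } else {
      // pause=, delay=, and lead= are treated as aliases; "2s" is the bare form.
      const keyed = /^(?:pause|delay|lead)=(\d+(?:\.\d+)?)$/.exec(token);
      const bare = /^(\d+(?:\.\d+)?)s$/.exec(token);
      const seconds = keyed ? keyed[1] : bare ? bare[1] : undefined;
      if (seconds !== undefined) result.pauseSeconds = Number(seconds);
    }
  }
  return result;
}
```
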
Sound effects are story tags:

```text
#sfx[squeaky-door.ogg]
#sfx[church-bells.ogg](max=8 fade fade-duration=2)
The door opens and the hall exhales.
```

The tag is parsed by the server into a `StoryTag` object. Sound effect paths are relative to `public/sounds/`. Optional parameters can limit playback (`max=`, `duration=`, `stop-after=`, `fade-after=`), choose the end mode (`fade` or `stop`/`cut`), and set `fade-duration=`.

Music can be placed as a block:

```text
#music[rain-theme.ogg](crossfade, loop, lead=4)
```

Music paths are relative to `public/music/`. Supported modes are `queue`, `crossfade`, and `cut`. Use `loop` or `once` to control repetition. `lead=<seconds>` delays the following text/TTS paragraph so the music can play alone before narration continues. To place that pause between a chapter heading and the dropcapped first paragraph, put the music tag after the chapter tag and before the first prose paragraph; TTS generation for the next spoken paragraph continues during the lead pause.

Game-state and player-message tags:

```text
#score[You found the quiet ending.]
#error[Ink story ended without an explicit ending tag.]
#achievement[First Steps]
#alert[Try examining objects before using them.]
```

`#score[...]` marks an intended ending and opens a localized ending popup when the turn reaches `inputMode: end`. `#error[...]` marks an unrecoverable ending and opens an error popup. If an Ink story runs out of content without an explicit `#score[...]` or `#error[...]`, the Ink engine emits an `#error[...]` tag. `#achievement[...]` and `#alert[...]` open localized queued popups while the game continues.

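
The ending rules above can be condensed into one decision function. This is a simplified sketch under stated assumptions: `SimpleTag`, `EndingPopup`, and `resolveEnding` are hypothetical names, an `#error` tag is assumed to take precedence if both tags are present, and the fallback that the Ink engine performs on the server is folded into the same function for brevity.

```typescript
// Hypothetical minimal tag shape for this example.
interface SimpleTag {
  key: string;
  value?: string;
}

interface EndingPopup {
  kind: "score" | "error";
  message: string;
}

// Fallback text borrowed from the example tag shown above.
const FALLBACK_ERROR = "Ink story ended without an explicit ending tag.";

function findTag(tags: SimpleTag[], key: string): SimpleTag | undefined {
  for (const tag of tags) {
    if (tag.key === key) return tag;
  }
  return undefined;
}

// Returns the popup to show for a turn, or null while the game continues.
function resolveEnding(inputMode: string, tags: SimpleTag[]): EndingPopup | null {
  if (inputMode !== "end") return null;
  const error = findTag(tags, "error");
  if (error) return { kind: "error", message: error.value || "" };
  const score = findTag(tags, "score");
  if (score) return { kind: "score", message: score.value || "" };
  // The story ended with neither tag: treat it as an unrecoverable ending.
  return { kind: "error", message: FALLBACK_ERROR };
}
```
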
## Architecture Documentation

`SPECIFICATION.md` is the canonical architecture and implementation specification. `TODO.md` is the canonical progress and remaining-work list. The former loose Ink and Z-code inclusion notes have been folded into those two files.

## Assets

- `public/sounds/`: sound effects referenced by `#sfx[file]` tags.
- `public/music/`: background music referenced by `#music[file](...)` tags.
- `public/images/`: story images referenced by `#image[file](...)`.
- `public/fonts/`: font assets used by the book UI.

Keep third-party assets licensed for local redistribution, and document source and license in the folder README or alongside the file.

## Typography And Playback Behavior

The renderer is designed to behave like a scaled static book page. The page keeps its aspect ratio, and text sizes and word positions scale relative to the page instead of reflowing unpredictably at small browser sizes.

Text processing order:

1. Parse story markup and remove non-display media markers.
2. Apply Markdown emphasis spans and right-page glossary annotations.
3. Run SmartyPants for typographic punctuation.
4. Apply Hyphenopoly for the selected language.
5. Calculate line breaks with the Knuth-Plass algorithm.
6. Render absolutely positioned word spans and animate them in sync with audio or estimated duration.

When real TTS audio is available, animation duration is driven by measured audio length. With TTS disabled or unavailable, duration is estimated from text length and the persisted speed setting.

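
The choice between measured and estimated duration can be illustrated as follows. The function name and both constants are assumptions made for this sketch; the README does not state the estimation formula, so the characters-per-second rate here is a placeholder and not the project's tuning.

```typescript
// Assumed values for illustration only.
const ASSUMED_CHARS_PER_SECOND = 15;
const ASSUMED_MIN_DURATION_MS = 300;

// Prefers the measured audio length; otherwise estimates from text length and speed.
// `speed` is a multiplier where 1 is normal pace and 2 is twice as fast.
function animationDurationMs(text: string, speed: number, measuredAudioMs?: number): number {
  if (measuredAudioMs !== undefined && measuredAudioMs > 0) {
    return measuredAudioMs;
  }
  const safeSpeed = speed > 0 ? speed : 1; // guard against zero or negative speed
  const seconds = text.trim().length / (ASSUMED_CHARS_PER_SECOND * safeSpeed);
  return Math.round(Math.max(seconds * 1000, ASSUMED_MIN_DURATION_MS));
}
```

With these placeholder constants, a 15-character sentence animates over one second at speed 1 and half a second at speed 2, while any positive measured audio length overrides the estimate.
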
Fast-forwarding by page click or space completes the active animation and fades/stops current TTS playback so queued content can proceed.

The right page history is line-addressed rather than natively scrolled. The page has a fixed line count, all block heights snap to whole lines, and the custom scrollbar represents virtual history line position. The DOM keeps a moving window of history blocks around the active line instead of paginating the story.

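
Because every block is a whole number of lines tall, the moving window reduces to interval arithmetic on line numbers. The sketch below shows one way to pick which blocks to keep in the DOM. `HistoryBlock`, `BlockWindow`, `blockWindow`, and the `overscanLines` margin are hypothetical; the real renderer may choose its window differently.

```typescript
// Hypothetical block record: height is already snapped to whole lines.
interface HistoryBlock {
  id: string;
  lineCount: number;
}

// Inclusive index range of the blocks that should be present in the DOM.
interface BlockWindow {
  first: number;
  last: number;
}

// Keeps every block that overlaps the visible page plus a margin of extra lines.
function blockWindow(
  blocks: HistoryBlock[],
  topLine: number,
  pageLines: number,
  overscanLines: number
): BlockWindow | null {
  const windowStart = topLine - overscanLines;
  const windowEnd = topLine + pageLines + overscanLines; // exclusive
  let first = -1;
  let last = -1;
  let start = 0; // virtual line where the current block begins
  blocks.forEach((block, i) => {
    const end = start + block.lineCount; // exclusive
    if (start < windowEnd && end > windowStart) {
      if (first === -1) first = i;
      last = i;
    }
    start = end;
  });
  return first === -1 ? null : { first, last };
}
```
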
## Changelog

### 2026-05-17

- Added Ink engine support with source compilation, engine config files, game metadata, locale-driven UI text, choice mode, keyboard choice letters, and one-list choice rendering.
- Added line-addressed right-page history, save/load reconstruction, image restoration, custom scrollbar plumbing, and virtual block-window rendering.
- Added story image rendering for landscape, portrait, and square images, including line-snapped sizing and portrait text exclusion.
- Added localized popups for endings, errors, achievements, and alerts through the tag channel.
- Added credits and third-party license UI.
- Added per-volume mute toggles and configurable music ducking amount.
- Added German typography handling for dialogue guillemets based on game metadata language.

### 2026-05-14

- Consolidated usage, markup, and architecture documentation into `README.md` and `TODO.md`.
- Added no-cache static serving and module URL cache busting so browser reloads pick up JS changes reliably during development.
- Fixed module loader dependency ordering so modules are initialized only after their declared dependencies are ready.
- Added the placeholder game API for `newGame`, `loadGame`, `saveGame`, `hasSaveGame`, `getSaveGames`, and `isGameRunning`.
- Changed the web UI to require a manual game start before showing the command input, which keeps browser audio autoplay restrictions manageable.
- Implemented story markup for chapters, text blocks, Markdown emphasis, image placeholders, sound effects, and music cues.
- Added music block parameters for playback mode, loop/once behavior, and lead-in delay.
- Added sound and music asset folders and playback plumbing for sound effects and background music.
- Added music ducking while TTS is active.
- Reworked book typography around Knuth-Plass line breaking, Hyphenopoly hyphenation, SmartyPants, paragraph indentation rules, drop caps, and responsive page scaling.
- Reworked TTS provider behavior, speed mapping, persistence, caching keys, top-bar/options synchronization, and OpenAI voice validation.
- Added development notes for ignoring the unrelated ad-blocker console error.

### Earlier Prototype Work

- Established the original animated fiction prototype with inkjs, SmartyPants, Hyphenopoly, Knuth-Plass line breaking, custom animation scheduling, save/load concepts, and media tags.
- Split the client from a monolithic prototype into focused modules for text processing, layout, animation, audio, persistence, TTS, and UI control.