Code Guidelines
1. Asynchronous Programming Principles:
- Primary Mechanism: Use async/await and Promises for handling asynchronous operations.
- Non-Blocking: Ensure the main thread remains responsive. Long-running operations (like Kokoro loading) should be handled in a way that doesn't block UI updates or animations (e.g., using requestIdleCallback if appropriate, or careful yielding).
- Event-Driven Communication: Use a dedicated event system (like the ModuleEvent class created) for communication between the loader and modules (e.g., for progress updates, state changes, messages) instead of injecting callbacks directly from the loader into module methods.
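The event-driven rule above can be sketched as follows. This is a minimal illustration, not the project's actual ModuleEvent class: the event name 'module:progress', the detail fields, and the shared moduleEvents bus are assumptions made for the example.

```javascript
// Hypothetical sketch: event name, detail fields, and the bus are assumptions.
class ModuleEvent extends Event {
  constructor(type, detail) {
    super(type);
    this.detail = detail; // e.g. { moduleId, progress, message }
  }
}

// One shared bus; in a browser this could also be `document` or `window`.
const moduleEvents = new EventTarget();

// Module side: report progress without knowing who is listening.
function reportProgress(moduleId, progress, message = '') {
  moduleEvents.dispatchEvent(
    new ModuleEvent('module:progress', { moduleId, progress, message })
  );
}

// Loader side: subscribe once and keep its own view of module progress.
const progressByModule = new Map();
moduleEvents.addEventListener('module:progress', (event) => {
  progressByModule.set(event.detail.moduleId, event.detail.progress);
});

reportProgress('tts', 0.4, 'Loading voice data');
```

The loader never hands a callback to the module; both sides only share the event name and the shape of the detail object.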
2. Module System Standards & Dependency Management:
- Native ES Modules: Utilize the browser's native ES Module system (import/export, <script type="module">) without relying on build tools.
- Lean Loader: The loader.js file should be focused only on:
  - Orchestrating the loading of module scripts.
  - Monitoring module initialization progress and state via the event system.
  - Displaying the loading status UI.
  - Hiding the overlay and potentially starting the main application loop after all modules are finished.
- Module Responsibility: All module-specific logic, configuration, resource loading (like CSS, images, or specific libraries like Kokoro), and detailed progress reporting should reside within the respective module file, not in loader.js.
- Dependency Declaration: Modules must declare their dependencies (e.g., ui-controller depends on tts and animation-queue).
- Loader Enforces Order: The loader is responsible for ensuring that a module's init phase only begins after all its declared dependencies have reached the FINISHED state.
- Rely on Dependency Management: Modules should assume their dependencies will be loaded and ready before their init function is called by the loader. There should be no conditional checks within a module like if (dependencyModule) with fallbacks for when the dependency isn't ready.
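One way the loader could enforce this ordering is to memoize a promise per module and chain each init behind its dependencies. The sketch below is an assumption about how that might look; the module table, the dependencies and init field names, and the initOrder log are invented for the example.

```javascript
// Hypothetical module table: `dependencies` and `init` are assumed field names.
const initOrder = [];
const modules = new Map([
  ['tts', { dependencies: [], init: async () => { initOrder.push('tts'); } }],
  ['animation-queue', { dependencies: [], init: async () => { initOrder.push('animation-queue'); } }],
  ['ui-controller', {
    dependencies: ['tts', 'animation-queue'],
    init: async () => { initOrder.push('ui-controller'); },
  }],
]);

// One memoized promise per module, so every init runs exactly once.
const finished = new Map();

function initializeModule(id, path = []) {
  if (path.includes(id)) {
    throw new Error(`Circular dependency: ${[...path, id].join(' -> ')}`);
  }
  if (!finished.has(id)) {
    const mod = modules.get(id);
    if (!mod) throw new Error(`Unknown module: ${id}`);
    // init() starts only after every declared dependency has finished.
    const promise = Promise.all(
      mod.dependencies.map((dep) => initializeModule(dep, [...path, id]))
    ).then(() => mod.init());
    finished.set(id, promise);
  }
  return finished.get(id);
}

// Async so that configuration errors surface as a rejected promise.
async function initializeAll() {
  await Promise.all([...modules.keys()].map((id) => initializeModule(id)));
}
```

Because the ordering lives in the loader, ui-controller can use tts and animation-queue inside its init without any existence checks.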
3. Module Interface & Code Sharing:
- Base Class: Use a BaseModule class that all modules extend. This enforces a consistent interface (e.g., initializeInterface, getState) and provides shared functionality (e.g., changeState, reportProgress, event dispatching).
- Module Registry: Use a central moduleRegistry to register modules and facilitate dependency checking and management.
- Preserve Functionality: When adapting existing modules (like ui-controller) to the new BaseModule interface, all original functionality must be preserved and integrated correctly, not replaced with placeholders.
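A base class along these lines might look like the sketch below. Only getState, changeState, and reportProgress are named in the guidelines; the method bodies, the emit helper, the event names, and the initialize/init split are assumptions for illustration.

```javascript
// Hypothetical sketch: method names follow the guideline text, bodies are assumed.
const ModuleState = Object.freeze({
  PENDING: 'PENDING', LOADING: 'LOADING', WAITING: 'WAITING',
  INITIALIZING: 'INITIALIZING', FINISHED: 'FINISHED', ERROR: 'ERROR',
});

class BaseModule extends EventTarget {
  constructor(id, dependencies = []) {
    super();
    this.id = id;
    this.dependencies = dependencies;
    this.state = ModuleState.PENDING;
    this.progress = 0;
  }

  getState() { return this.state; }

  changeState(state) {
    this.state = state;
    this.emit('statechange', { state });
  }

  reportProgress(progress, message = '') {
    this.progress = Math.min(1, Math.max(0, progress)); // clamp to 0..1
    this.emit('progress', { progress: this.progress, message });
  }

  emit(type, detail) {
    const event = new Event(type);
    event.detail = { moduleId: this.id, ...detail };
    this.dispatchEvent(event);
  }

  // Called by the loader once all dependencies are FINISHED.
  async initialize() {
    this.changeState(ModuleState.INITIALIZING);
    try {
      await this.init();
      this.reportProgress(1);
      this.changeState(ModuleState.FINISHED);
    } catch (error) {
      this.changeState(ModuleState.ERROR);
      throw error;
    }
  }

  // Subclasses override this with their own setup logic.
  async init() {
    throw new Error(`${this.id} must implement init()`);
  }
}

// Example subclass; a real module would keep all of its original behaviour here.
class AnimationQueueModule extends BaseModule {
  constructor() { super('animation-queue'); }
  async init() { this.queue = []; }
}
```

Keeping the state bookkeeping in the base class means an adapted module only has to move its existing setup code into init.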
4. State Management:
- Defined States: Modules must adhere to the defined states: PENDING, LOADING (script loading), WAITING (waiting for dependencies), INITIALIZING (running init logic), FINISHED, ERROR.
- Accurate Reporting: Modules must accurately report their state transitions via the event system. A module (like tts) should not report FINISHED until all its critical internal operations (including background loading like Kokoro) are complete. The loader's UI must display these states correctly.
5. Handling setTimeout and Fallbacks:
- setTimeout for Flow Control/Synchronization: Strictly prohibited. Using setTimeout to wait for asynchronous operations to complete, fix timing issues, or manage dependencies is considered a hack and indicates a flaw in the asynchronous architecture. Proper use of async/await, Promises, and the loader's dependency management should make this unnecessary.
- setTimeout for Delays: Acceptable only within well-encapsulated components for specific, justifiable reasons (like debouncing, throttling, or potentially very short delays if absolutely unavoidable after direct DOM manipulation, though this should also be minimized). It must not be used to paper over asynchronous race conditions or timing problems. The AnimationQueue is an acceptable place for internal scheduling timeouts, but application code calling it should rely on its event-driven nature.
- Fallbacks for Missing Dependencies: Strictly prohibited. Code within a module should not check if a dependency exists and provide a fallback path. The module loader's responsibility is to guarantee dependencies are met before initializing the module. Errors should be handled for actual failures during initialization, not for unmet dependencies (which indicates a loader bug).
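As an illustration of the first rule, waiting for another component can be expressed as a Promise that resolves on an event, rather than as a setTimeout polling loop. The helper name waitForEvent, the queueEvents bus, and the 'animation:complete' event name are hypothetical.

```javascript
// Hypothetical helper: resolves on the first matching event, no timers involved.
function waitForEvent(target, type, predicate = () => true) {
  return new Promise((resolve) => {
    const onEvent = (event) => {
      if (!predicate(event)) return;
      target.removeEventListener(type, onEvent);
      resolve(event);
    };
    target.addEventListener(type, onEvent);
  });
}

// Usage: application code awaits the queue's event instead of guessing a delay.
const queueEvents = new EventTarget();

async function showNextLine() {
  await waitForEvent(queueEvents, 'animation:complete');
  return 'next line shown';
}
```

The caller resumes exactly when the animation reports completion, so there is no delay value to tune and no race to hide.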
Adhering to these principles will lead to a cleaner, more robust, and maintainable asynchronous module loading system.
Module Loader System Architecture
The module loader system is designed to manage the loading and initialization of modular components in a structured, dependency-aware manner with visual progress reporting.
Overall Architecture
- Module Registry Pattern: Uses a centralized registry to track and manage all modules and their states.
- Event-Driven Communication: Modules communicate with the loader and each other through custom events.
- Progress Visualization: Provides a visual loading overlay with per-module progress tracking.
- State Management: Tracks each module through defined states (PENDING, LOADING, WAITING, INITIALIZING, FINISHED, ERROR).
- Dependency Resolution: Handles module dependencies to ensure proper initialization order.
Core Components
- ModuleRegistry: Central repository for all modules
  - Tracks registration and availability of modules
  - Manages promises for module readiness
  - Provides dependency resolution through waitForModule and waitForModules
- BaseModule: Abstract base class that all modules extend
  - Implements standard lifecycle methods
  - Handles progress reporting and state changes
  - Provides consistent interface for the loader
- ModuleLoader: Main orchestrator of the loading process
  - Dynamically loads module scripts
  - Creates and manages the visual loading interface
  - Initializes modules in the correct order
  - Tracks and displays overall loading progress
- ModuleEvent: Custom event system for inter-module communication
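The registry's promise handling could be sketched as below. Only register, waitForModule, and waitForModules come from the description above; markReady, deferredFor, and the internal maps are assumptions about how readiness might be tracked.

```javascript
// Hypothetical sketch; markReady and deferredFor are assumed helper names.
class ModuleRegistry {
  constructor() {
    this.modules = new Map(); // id -> module instance
    this.waiters = new Map(); // id -> { promise, resolve }
  }

  // Lazily create one readiness promise per module id.
  deferredFor(id) {
    if (!this.waiters.has(id)) {
      let resolve;
      const promise = new Promise((res) => { resolve = res; });
      this.waiters.set(id, { promise, resolve });
    }
    return this.waiters.get(id);
  }

  register(module) {
    if (this.modules.has(module.id)) {
      throw new Error(`Duplicate module: ${module.id}`);
    }
    this.modules.set(module.id, module);
  }

  // Assumed to be called when the module reaches FINISHED.
  markReady(id) {
    this.deferredFor(id).resolve(this.modules.get(id));
  }

  waitForModule(id) {
    return this.deferredFor(id).promise;
  }

  waitForModules(ids) {
    return Promise.all(ids.map((id) => this.waitForModule(id)));
  }
}
```

Because the readiness promise is created on first use, a caller may wait for a module before its script has even registered.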
Loading Sequence
- HTML page loads and includes the loader script as a module
- DOMContentLoaded triggers the loader initialization
- Loader creates the loading UI and registers event listeners
- Module scripts are loaded dynamically in parallel
- Each module registers itself with the registry
- Modules are initialized with dependency checking
- Progress is reported and visualized throughout
- When all modules reach FINISHED state, loading overlay is hidden
Module Lifecycle
- PENDING: Initial state before loading begins
- LOADING: Module script is being loaded
- WAITING: Module is waiting for dependencies to be ready
- INITIALIZING: Module's initialize() method is executing
- FINISHED: Module is fully initialized and ready
- ERROR: Module encountered an error during initialization
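The lifecycle above can be guarded with a small transition table. The states come from the list; which transitions are legal is not spelled out in this document, so the table below is an assumption (a strictly linear path, with ERROR reachable from any non-terminal state).

```javascript
// Assumed transition table: the document lists the states, not the legal moves.
const TRANSITIONS = Object.freeze({
  PENDING: ['LOADING', 'ERROR'],
  LOADING: ['WAITING', 'ERROR'],
  WAITING: ['INITIALIZING', 'ERROR'],
  INITIALIZING: ['FINISHED', 'ERROR'],
  FINISHED: [],
  ERROR: [],
});

function canTransition(from, to) {
  return (TRANSITIONS[from] ?? []).includes(to);
}

// Returns the new state, or throws if a module tries to skip a step.
function nextState(from, to) {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```

A check like this makes it harder for a module to report FINISHED without having passed through INITIALIZING.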
Integration Pattern
Modules follow a consistent registration pattern:
// Create the singleton instance
const ModuleName = new ModuleNameClass();
// Register with the module registry
moduleRegistry.register(ModuleName);
// Export the module
export { ModuleName };
// Keep a reference in window for loader system
window.ModuleName = ModuleName;
This design creates a flexible, maintainable system for loading complex applications with multiple interdependent components, prioritizing both user experience and performance.
TTS System Structure & Kokoro Loading
After reviewing our chat history, here's a summary of the TTS system structure and how we decided to load the Kokoro TTS engine:
Overall TTS System Architecture
- Modular Design: The TTS system uses a modular architecture with multiple handler classes, each implementing a different TTS approach.
- Three TTS Providers:
  - BrowserTTSHandler - Uses the built-in Web Speech API
  - KokoroHandler - Uses Kokoro.js neural TTS for high-quality voices
  - ApiTTSHandler - Uses external TTS services like ElevenLabs
- Factory Pattern: TTSFactory manages the handlers, provides a unified interface, and handles provider switching.
- Module System: TTSPlayer module is registered with the moduleRegistry as part of the modular loading system.
Loading Sequence
- The module loader first loads tts-player.js, which in turn loads tts-factory.js.
- The factory initializes providers in order of preference:
  - First loads the BrowserTTSHandler for immediate low-quality TTS
  - Then loads the ApiTTSHandler if configured
  - Finally attempts to load KokoroHandler in the background with low priority
- The system uses the best available provider, with a preference for Kokoro when available.
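The "best available provider" rule could be sketched as a preference list consulted whenever a provider becomes ready. The provider ids, the ProviderSelector class, and the relative ranking of the API and browser providers are assumptions; the document only states that Kokoro is preferred when available.

```javascript
// Assumed ranking, best first; only the Kokoro preference is stated in the text.
const PREFERENCE = ['kokoro', 'api', 'browser'];

// Hypothetical selector; a real factory would hold handler instances instead of ids.
class ProviderSelector {
  constructor() {
    this.ready = new Set();
    this.manual = null;
  }

  markReady(id) { this.ready.add(id); }

  // A manual choice from the UI wins as long as that provider is ready.
  selectManually(id) { this.manual = id; }

  current() {
    if (this.manual && this.ready.has(this.manual)) return this.manual;
    return PREFERENCE.find((id) => this.ready.has(id)) ?? null;
  }
}
```

With this shape, the switch from browser TTS to Kokoro happens simply because current() starts returning a different id once Kokoro is marked ready.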
Kokoro TTS Loading Strategy
After consulting the documentation (https://www.npmjs.com/package/kokoro-js), we made these decisions:
- Low-Priority Loading: Kokoro is loaded with requestIdleCallback to avoid impacting page performance.
- Kokoro npm package integration: Load Kokoro directly from the local server: '/js/kokoro-js.js' contains the minified complete code of the kokoro npm package copied from the node_modules folder to the public directory. Do not try to read or change it, it is too big!
- Pipeline Creation: Per documentation, we use the pipeline pattern:
    this.kokoro = await window.kokoroTTS.pipeline('text-to-speech', {
      quantized: true,
      progress_callback: this.progressCallback
    });
- Voice List: We hardcoded the available voices rather than querying them dynamically.
- Audio Playback: Synthesis returns an audio element which we play:
    const audio = await this.kokoro(processedText, {
      voice: this.voiceOptions.voice,
      speed: this.voiceOptions.speed
    });
    audio.play();
User Experience Flow
- User sees page immediately with browser TTS enabled (fast startup)
- Kokoro loads in background without blocking the interface
- Once Kokoro is ready, TTS switches to higher quality neural TTS
- User can manually switch between providers via the UI if desired
This design prioritizes performance and user experience, making the TTS system both flexible and resource-efficient.
Important practices
- Always ignore the following error when debugging console output: onpage-dialog.preload.js:121 Uncaught ReferenceError: browser is not defined. This is produced by the installed adblocker and has nothing to do with our project.
Text-to-Speech Synchronization Architecture
The TTS system needs to be synchronized with text animations to create a cohesive user experience. This section outlines the requirements and implementation approach.
Transition to Game
The overlay fades away as the first scheduled animation.
- This fade animation is handled by the animation scheduler module
- Only after successful fade-out does the game loop start
- Socket connection is established and begins receiving text
Text Buffering & Sentence Processing
- Text Buffer Collection: Incoming text from sockets is collected in a buffer.
  - System can receive fragments of any size (single letters, words, sentences, paragraphs)
  - All text is accumulated in the buffer regardless of fragment size
  - Buffer handles partial/incomplete text gracefully
- Sentence Detection: The buffer identifies complete sentences.
  - When full sentences are detected, they are extracted from the buffer
  - If multiple sentences arrive simultaneously, they are split and processed individually
  - Remaining partial sentences stay in the buffer until completion
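The buffering rules above can be sketched as a small class. The sentence boundary used here (one or more of . ! ? optionally followed by closing quotes or brackets, then whitespace) is an assumption; the document does not define what counts as a complete sentence, and SentenceBuffer is a hypothetical name.

```javascript
// Assumption: a sentence ends with . ! or ? (plus optional closing quotes) and whitespace.
class SentenceBuffer {
  constructor() {
    this.buffer = '';
  }

  // Append a fragment of any size and return every sentence completed by it.
  push(fragment) {
    this.buffer += fragment;
    const sentences = [];
    const boundary = /[.!?]+["')\]]*\s+/g;
    let start = 0;
    let match;
    while ((match = boundary.exec(this.buffer)) !== null) {
      const end = match.index + match[0].length;
      const sentence = this.buffer.slice(start, end).trim();
      if (sentence) sentences.push(sentence);
      start = end;
    }
    // Whatever follows the last boundary stays buffered as a partial sentence.
    this.buffer = this.buffer.slice(start);
    return sentences;
  }

  // Hand out the remaining partial text, e.g. when the socket closes.
  flush() {
    const rest = this.buffer.trim();
    this.buffer = '';
    return rest ? [rest] : [];
  }
}
```

Requiring whitespace after the punctuation keeps a fragment like "3." from being cut off before "14" arrives, at the cost of holding the final sentence until more text or a flush.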
Synchronized Playback
- TTS Generation Queue: Complete sentences enter the TTS generation queue.
  - Generation begins immediately if no other sentence is being processed
  - Results are cached for immediate playback when needed
- Animation Timing: Animation speed is synchronized with audio duration.
  - The system calculates animation duration to match TTS audio length exactly
  - Both animation and audio start simultaneously
  - Animation completes at the same time as audio playback
- Playback Pipeline: Continuous processing of sentences.
  - As soon as one sentence completes playback, the next begins
  - Next sentence generation starts during current sentence playback
  - This creates a seamless reading experience
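A possible shape for the timing calculation and the prefetching pipeline is sketched below. The synthesize and play callbacks, the durationMs field on the clip, and the per-character timing model are assumptions made for the example.

```javascript
// Assumption: the clip exposes its length as `durationMs` once synthesis finishes.
function perCharacterDelay(text, audioDurationMs) {
  const chars = Math.max(1, text.length);
  return audioDurationMs / chars; // animation then ends together with the audio
}

// `synthesize(sentence)` and `play(sentence, clip, delayMs)` are hypothetical
// async callbacks supplied by the caller.
async function playSentences(sentences, synthesize, play) {
  if (sentences.length === 0) return;
  let pending = synthesize(sentences[0]);
  for (let i = 0; i < sentences.length; i += 1) {
    const clip = await pending;
    // Start generating the next sentence before the current one plays.
    pending = i + 1 < sentences.length ? synthesize(sentences[i + 1]) : null;
    const delayMs = perCharacterDelay(sentences[i], clip.durationMs);
    await play(sentences[i], clip, delayMs);
  }
}
```

Only one sentence is generated ahead of playback, which keeps the amount of speculative work small if the user fast-forwards.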
Fast-Forward & Control Flow
- Fast-Forward Behavior: User can skip current sentence.
  - Pressing the designated fast-forward key completes current animation immediately
  - TTS audio is faded out and stopped
  - System advances to next sentence
- Resource Management: TTS generation is resource-conscious.
  - Uses only CPU/GPU resources not needed for animation
  - Generation process can be cancelled by fast-forward
  - System prioritizes smooth animation over TTS preparation
- Loading States: Animation waits for TTS when necessary.
  - If next sentence TTS generation isn't ready when needed, animation pauses
  - Fast-forward key can skip incomplete generation
  - User is never blocked completely by TTS generation
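Cancelling generation on fast-forward could be built on the standard AbortController. In the sketch below, SentencePlayback and the generate callback are hypothetical; whether the real TTS call honours an AbortSignal is an assumption, so the code also races against the abort itself so the caller is released either way.

```javascript
// Hypothetical sketch: `generate(sentence, signal)` stands in for any TTS call.
class SentencePlayback {
  constructor(generate) {
    this.generate = generate;
    this.controller = null;
  }

  // Resolves with the clip, or with null when the sentence was skipped.
  async prepare(sentence) {
    this.controller = new AbortController();
    const { signal } = this.controller;
    const skipped = new Promise((resolve) => {
      signal.addEventListener('abort', () => resolve(null), { once: true });
    });
    return Promise.race([this.generate(sentence, signal), skipped]);
  }

  // Bound to the fast-forward key by the caller.
  fastForward() {
    if (this.controller) this.controller.abort();
  }
}
```

A null result tells the caller to complete the animation immediately and advance, so an unfinished generation never blocks the user.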
Persistent Configuration
- Options Storage: The persistence-manager stores TTS settings.
  - Speech on/off state is remembered
  - Speed settings are preserved between sessions
  - Voice preferences are stored
- Options UI: Add an options button and modal dialog.
  - Show additional options in a modal window
  - Include volume sliders:
    - Master volume control
    - TTS volume control
    - Music volume control
    - Sound effects volume control
  - Include manual TTS system selection
  - All settings are persisted via the persistence-manager
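The settings listed above could be stored as one JSON object merged over defaults. This sketch does not describe the actual persistence-manager API: SettingsStore, the storage key, the field names, and the default values are all assumptions, and the storage object is injected so that window.localStorage can be used in the browser.

```javascript
// Assumptions: key, field names, and defaults are illustrative only.
const DEFAULT_SETTINGS = Object.freeze({
  speechEnabled: true,
  speed: 1,
  voice: null,
  provider: 'auto',
  volume: { master: 1, tts: 1, music: 1, effects: 1 },
});

class SettingsStore {
  // `storage` needs getItem/setItem, e.g. window.localStorage in a browser.
  constructor(storage, key = 'tts-settings') {
    this.storage = storage;
    this.key = key;
  }

  load() {
    let saved = {};
    try {
      saved = JSON.parse(this.storage.getItem(this.key) ?? '{}') ?? {};
    } catch {
      saved = {}; // corrupted data falls back to defaults
    }
    return {
      ...DEFAULT_SETTINGS,
      ...saved,
      volume: { ...DEFAULT_SETTINGS.volume, ...saved.volume },
    };
  }

  // Merge a partial update and write the full object back.
  save(patch) {
    const current = this.load();
    const next = {
      ...current,
      ...patch,
      volume: { ...current.volume, ...patch.volume },
    };
    this.storage.setItem(this.key, JSON.stringify(next));
    return next;
  }
}

// In-memory stand-in for localStorage, useful outside the browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (k) => (data.has(k) ? data.get(k) : null),
    setItem: (k, v) => { data.set(k, String(v)); },
  };
}
```

Merging over defaults means a slider added later gets a sensible value even for users whose stored settings predate it.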
This synchronized approach ensures that text animations and speech work together seamlessly, creating a more immersive storytelling experience while maintaining smooth performance.