# Code Guidelines

**1. Asynchronous Programming Principles:**

* **Primary Mechanism:** Use `async`/`await` and Promises for handling asynchronous operations.
* **Non-Blocking:** Ensure the main thread remains responsive. Long-running operations (like Kokoro loading) should be handled in a way that doesn't block UI updates or animations (e.g., using `requestIdleCallback` if appropriate, or careful yielding).
* **Event-Driven Communication:** Use a dedicated event system (such as the `ModuleEvent` class) for communication between the loader and modules (e.g., for progress updates, state changes, messages) instead of injecting callbacks directly from the loader into module methods.

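As a minimal sketch of the event-driven principle above: the loader subscribes to a shared event target and modules dispatch progress events, so no callbacks are injected into module methods. The `ModuleEvent` shape, the `module:progress` event name, and the `moduleEvents` bus are assumptions for illustration; the project's actual class may look different.

```javascript
// Hypothetical sketch: the real ModuleEvent class may differ.
class ModuleEvent extends Event {
  constructor(type, detail) {
    super(type);
    this.detail = detail; // e.g. { moduleId, progress }
  }
}

// Shared bus (assumed name): the loader listens, modules dispatch.
const moduleEvents = new EventTarget();

// Loader side: subscribe once instead of passing callbacks into modules.
const progressLog = [];
moduleEvents.addEventListener('module:progress', (event) => {
  progressLog.push(`${event.detail.moduleId}: ${event.detail.progress}%`);
});

// Module side: report progress without knowing who is listening.
function reportProgress(moduleId, progress) {
  moduleEvents.dispatchEvent(
    new ModuleEvent('module:progress', { moduleId, progress })
  );
}

reportProgress('tts', 40);
reportProgress('tts', 100);
```
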
**2. Module System Standards & Dependency Management:**

* **Native ES Modules:** Utilize the browser's native ES Module system (`import`/`export`, `<script type="module">`) without relying on build tools.
* **Lean Loader:** The `loader.js` file should be focused *only* on:
  * Orchestrating the loading of module scripts.
  * Monitoring module initialization progress and state via the event system.
  * Displaying the loading status UI.
  * Hiding the overlay and potentially starting the main application loop *after* all modules are finished.
* **Module Responsibility:** All module-specific logic, configuration, resource loading (like CSS, images, or specific libraries like Kokoro), and detailed progress reporting should reside *within* the respective module file, not in `loader.js`.
* **Dependency Declaration:** Modules must declare their dependencies (e.g., `ui-controller` depends on `tts` and `animation-queue`).
* **Loader Enforces Order:** The loader is responsible for ensuring that a module's `init` phase only begins *after* all its declared dependencies have reached the `FINISHED` state.
* **Rely on Dependency Management:** Modules should *assume* their dependencies will be loaded and ready before their `init` function is called by the loader. There should be **no** conditional checks within a module like `if (dependencyModule)` with fallbacks for when the dependency isn't ready.

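The ordering rule above can be sketched as one promise per module that resolves when the module is finished; a module's `init` only runs after the promises of all its declared dependencies have resolved. The descriptor shape (`dependencies`, `init`) and the `initializeAll` helper are assumptions, and the sketch presumes an acyclic dependency graph.

```javascript
// Hypothetical loader helper: modules is a Map of id -> { dependencies, init }.
// Assumes the dependency graph has no cycles.
function initializeAll(modules) {
  const finished = new Map(); // id -> Promise that resolves once init() is done

  const start = (id) => {
    if (finished.has(id)) return finished.get(id);
    const mod = modules.get(id);
    if (!mod) return Promise.reject(new Error(`Unknown module "${id}"`));
    // init() is only called after every declared dependency has finished.
    const promise = Promise.all(mod.dependencies.map((dep) => start(dep)))
      .then(() => mod.init());
    finished.set(id, promise);
    return promise;
  };

  return Promise.all([...modules.keys()].map((id) => start(id)));
}

// Example using the dependency named in the guidelines.
const order = [];
const exampleModules = new Map([
  ['ui-controller', {
    dependencies: ['tts', 'animation-queue'],
    init: async () => { order.push('ui-controller'); },
  }],
  ['tts', { dependencies: [], init: async () => { order.push('tts'); } }],
  ['animation-queue', {
    dependencies: [],
    init: async () => { order.push('animation-queue'); },
  }],
]);
const allFinished = initializeAll(exampleModules);
```

Because ordering is enforced here, `ui-controller` can use `tts` and `animation-queue` inside its `init` without any existence checks.
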
**3. Module Interface & Code Sharing:**

* **Base Class:** Use a `BaseModule` class that all modules extend. This enforces a consistent interface (e.g., `initializeInterface`, `getState`) and provides shared functionality (e.g., `changeState`, `reportProgress`, event dispatching).
* **Module Registry:** Use a central `moduleRegistry` to register modules and facilitate dependency checking and management.
* **Preserve Functionality:** When adapting existing modules (like `ui-controller`) to the new `BaseModule` interface, all original functionality must be preserved and integrated correctly, not replaced with placeholders.

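A rough sketch of what such a base class might look like follows. The method names `getState`, `changeState`, and `reportProgress` come from the guideline above; the constructor signature, the `dispatch` helper, the event names, and `ExampleModule` are assumptions made for illustration only.

```javascript
// Hypothetical BaseModule sketch; the project's real class may differ.
class BaseModule {
  constructor(id, dependencies = [], events = new EventTarget()) {
    this.id = id;
    this.dependencies = dependencies;
    this.events = events;
    this.state = 'PENDING';
  }

  getState() {
    return this.state;
  }

  changeState(newState) {
    this.state = newState;
    this.dispatch('module:state', { state: newState });
  }

  reportProgress(progress, message = '') {
    this.dispatch('module:progress', { progress, message });
  }

  // Shared event dispatching so subclasses never talk to the loader directly.
  dispatch(type, detail) {
    const event = new Event(type);
    event.detail = { moduleId: this.id, ...detail };
    this.events.dispatchEvent(event);
  }

  async init() {
    throw new Error(`${this.id} must implement init()`);
  }
}

// Example subclass (assumed) that reports its own transitions.
class ExampleModule extends BaseModule {
  constructor(events) {
    super('example', [], events);
  }

  async init() {
    this.changeState('INITIALIZING');
    this.reportProgress(100, 'ready');
    this.changeState('FINISHED');
  }
}
```
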
**4. State Management:**

* **Defined States:** Modules must adhere to the defined states: `PENDING`, `LOADING` (script loading), `WAITING` (waiting for dependencies), `INITIALIZING` (running `init` logic), `FINISHED`, `ERROR`.
* **Accurate Reporting:** Modules must accurately report their state transitions via the event system. A module (like `tts`) should not report `FINISHED` until all its critical internal operations (including background loading like Kokoro) are complete. The loader's UI must display these states correctly.

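One way to keep state reporting accurate is to validate every transition against a table. The sketch below assumes that modules move through the states in the order listed above and that `ERROR` can be reached from any non-terminal state; the guidelines do not spell out the allowed transitions, so treat the table as an assumption.

```javascript
// Assumed transition table: linear progression, ERROR reachable from any
// non-terminal state. Adjust if the real lifecycle allows other paths.
const ALLOWED_TRANSITIONS = Object.freeze({
  PENDING: ['LOADING', 'ERROR'],
  LOADING: ['WAITING', 'ERROR'],
  WAITING: ['INITIALIZING', 'ERROR'],
  INITIALIZING: ['FINISHED', 'ERROR'],
  FINISHED: [],
  ERROR: [],
});

function canTransition(from, to) {
  const allowed = ALLOWED_TRANSITIONS[from];
  return Array.isArray(allowed) && allowed.includes(to);
}

// Throws instead of silently accepting an invalid state change.
function assertTransition(from, to) {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid state transition: ${from} -> ${to}`);
  }
}
```
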
**5. Handling `setTimeout` and Fallbacks:**

* **`setTimeout` for Flow Control/Synchronization:** **Strictly prohibited.** Using `setTimeout` to wait for asynchronous operations to complete, fix timing issues, or manage dependencies is considered a hack and indicates a flaw in the asynchronous architecture. Proper use of `async`/`await`, Promises, and the loader's dependency management should make this unnecessary.
* **`setTimeout` for Delays:** Acceptable *only* within well-encapsulated components for specific, justifiable reasons (like debouncing, throttling, or potentially *very* short delays *if absolutely unavoidable* after direct DOM manipulation, though this should also be minimized). It must **not** be used to paper over asynchronous race conditions or timing problems. The `AnimationQueue` is an acceptable place for internal scheduling timeouts, but application code calling it should rely on its event-driven nature.
* **Fallbacks for Missing Dependencies:** **Strictly prohibited.** Code within a module should not check if a dependency exists and provide a fallback path. The module loader's responsibility is to guarantee dependencies are met before initializing the module. Errors should be handled for *actual* failures during initialization, not for unmet dependencies (which indicates a loader bug).

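To illustrate the alternative to `setTimeout`-based polling, the following sketch wraps a one-time event in a Promise so callers can simply `await` it. The helper names and the event name passed by the caller are hypothetical.

```javascript
// Resolve when the event fires, instead of polling with setTimeout.
function waitForEvent(target, type) {
  return new Promise((resolve) => {
    target.addEventListener(type, resolve, { once: true });
  });
}

// Hypothetical caller: start something only after the event has fired.
async function startWhenReady(events, type, start) {
  await waitForEvent(events, type);
  start();
}
```
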
Adhering to these principles will lead to a cleaner, more robust, and maintainable asynchronous module loading system.

# Module Loader System Architecture

The module loader system is designed to manage the loading and initialization of modular components in a structured, dependency-aware manner with visual progress reporting.

## Overall Architecture

1. **Module Registry Pattern**: Uses a centralized registry to track and manage all modules and their states.

2. **Event-Driven Communication**: Modules communicate with the loader and each other through custom events.

3. **Progress Visualization**: Provides a visual loading overlay with per-module progress tracking.

4. **State Management**: Tracks each module through defined states (`PENDING`, `LOADING`, `WAITING`, `INITIALIZING`, `FINISHED`, `ERROR`).

5. **Dependency Resolution**: Handles module dependencies to ensure proper initialization order.

## Core Components

1. **ModuleRegistry**: Central repository for all modules
   - Tracks registration and availability of modules
   - Manages promises for module readiness
   - Provides dependency resolution through `waitForModule` and `waitForModules`

2. **BaseModule**: Abstract base class that all modules extend
   - Implements standard lifecycle methods
   - Handles progress reporting and state changes
   - Provides consistent interface for the loader

3. **ModuleLoader**: Main orchestrator of the loading process
   - Dynamically loads module scripts
   - Creates and manages the visual loading interface
   - Initializes modules in the correct order
   - Tracks and displays overall loading progress

4. **ModuleEvent**: Custom event system for inter-module communication

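The readiness promises managed by the registry could be sketched as follows. `register`, `waitForModule`, and `waitForModules` are named in this document; `markReady`, the internal `_deferred` helper, and the `id` property on modules are assumptions added to make the sketch self-contained.

```javascript
// Hypothetical registry sketch: one deferred promise per module id.
class ModuleRegistry {
  constructor() {
    this.modules = new Map();
    this.ready = new Map(); // id -> { promise, resolve }
  }

  _deferred(id) {
    if (!this.ready.has(id)) {
      let resolve;
      const promise = new Promise((res) => { resolve = res; });
      this.ready.set(id, { promise, resolve });
    }
    return this.ready.get(id);
  }

  register(module) {
    this.modules.set(module.id, module);
  }

  // Assumed hook, called once the module has reached FINISHED.
  markReady(id) {
    this._deferred(id).resolve(this.modules.get(id));
  }

  // Can be called before the module is registered; resolves once it is ready.
  waitForModule(id) {
    return this._deferred(id).promise;
  }

  waitForModules(ids) {
    return Promise.all(ids.map((id) => this.waitForModule(id)));
  }
}
```
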
## Loading Sequence

1. HTML page loads and includes the loader script as a module
2. `DOMContentLoaded` triggers the loader initialization
3. Loader creates the loading UI and registers event listeners
4. Module scripts are loaded dynamically in parallel
5. Each module registers itself with the registry
6. Modules are initialized with dependency checking
7. Progress is reported and visualized throughout
8. When all modules reach the `FINISHED` state, the loading overlay is hidden

## Module Lifecycle

1. **PENDING**: Initial state before loading begins
2. **LOADING**: Module script is being loaded
3. **WAITING**: Module is waiting for dependencies to be ready
4. **INITIALIZING**: Module's `init` logic is executing
5. **FINISHED**: Module is fully initialized and ready
6. **ERROR**: Module encountered an error during initialization

## Integration Pattern

Modules follow a consistent registration pattern:

```javascript
// Create the singleton instance
const ModuleName = new ModuleNameClass();

// Register with the module registry
moduleRegistry.register(ModuleName);

// Export the module
export { ModuleName };

// Keep a reference in window for loader system
window.ModuleName = ModuleName;
```

This design creates a flexible, maintainable system for loading complex applications with multiple interdependent components, prioritizing both user experience and performance.

# TTS System Structure & Kokoro Loading

This section summarizes the TTS system structure and how the Kokoro TTS engine is loaded.

## Overall TTS System Architecture

1. **Modular Design**: The TTS system uses a modular architecture with multiple handler classes, each implementing a different TTS approach.

2. **Three TTS Providers**:
   - `BrowserTTSHandler` - Uses the built-in Web Speech API
   - `KokoroHandler` - Uses Kokoro.js neural TTS for high-quality voices
   - `ApiTTSHandler` - Uses external TTS services like ElevenLabs

3. **Factory Pattern**: `TTSFactory` manages the handlers, provides a unified interface, and handles provider switching.

4. **Module System**: The `TTSPlayer` module is registered with the `moduleRegistry` as part of the modular loading system.

## Loading Sequence

1. The module loader first loads `tts-player.js`, which in turn loads `tts-factory.js`.

2. The factory initializes providers in the following order:
   - First loads the `BrowserTTSHandler` for immediate low-quality TTS
   - Then loads the `ApiTTSHandler` if configured
   - Finally attempts to load `KokoroHandler` in the background with low priority

3. The system uses the best available provider, with a preference for Kokoro when available.

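The "best available provider" rule could be sketched as a preference list that is walked until a ready handler is found. The `isReady()` method, the provider keys, and the relative ranking of the API and browser providers are assumptions; only the preference for Kokoro is stated above.

```javascript
// Hypothetical sketch of provider selection; not the actual TTSFactory.
class TTSFactory {
  constructor(preference = ['kokoro', 'api', 'browser']) {
    this.preference = preference; // highest quality first (assumed ranking)
    this.handlers = new Map();
  }

  register(name, handler) {
    this.handlers.set(name, handler);
  }

  // Returns the first registered handler that reports itself ready.
  getBestHandler() {
    for (const name of this.preference) {
      const handler = this.handlers.get(name);
      if (handler && handler.isReady()) return handler;
    }
    return null;
  }
}
```

With this shape, the switch to Kokoro happens naturally: once its handler starts reporting ready, the next call to `getBestHandler()` returns it.
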
## Kokoro TTS Loading Strategy

After consulting the documentation (https://www.npmjs.com/package/kokoro-js), we made these decisions:

1. **Low-Priority Loading**: Kokoro is loaded with `requestIdleCallback` to avoid impacting page performance.

2. **Kokoro npm package integration**: Load Kokoro directly from the local server.
   `/js/kokoro-js.js` contains the complete minified code of the `kokoro-js` npm package, copied from the `node_modules` folder to the public directory. Do not try to read or change it; it is too big.

3. **Pipeline Creation**: Per documentation, we use the pipeline pattern:
   ```javascript
   // Create the synthesis pipeline once and keep it on the handler.
   this.kokoro = await window.kokoroTTS.pipeline('text-to-speech', {
     quantized: true,
     progress_callback: this.progressCallback
   });
   ```

4. **Voice List**: We hardcoded the available voices rather than querying them dynamically.

5. **Audio Playback**: Synthesis returns an audio element which we play:
   ```javascript
   // Synthesize with the currently selected voice options, then play.
   const audio = await this.kokoro(processedText, {
     voice: this.voiceOptions.voice,
     speed: this.voiceOptions.speed
   });
   audio.play();
   ```

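The low-priority loading decision can be sketched as a small Promise wrapper around `requestIdleCallback`. The `runWhenIdle` helper and its injectable `scheduleIdle` parameter are assumptions; the parameter exists so the helper can be exercised outside a browser, and because `requestIdleCallback` is not available in every browser.

```javascript
// Hypothetical helper: run a task during idle time and expose it as a Promise.
// scheduleIdle defaults to the browser's requestIdleCallback.
function runWhenIdle(task, scheduleIdle = globalThis.requestIdleCallback) {
  return new Promise((resolve, reject) => {
    scheduleIdle(() => {
      // Works for both synchronous and async tasks; errors reject the promise.
      Promise.resolve().then(task).then(resolve, reject);
    });
  });
}

// Assumed usage inside a handler (loadKokoro is a hypothetical method):
// await runWhenIdle(() => this.loadKokoro());
```

A module using this would keep reporting `INITIALIZING` until the returned promise settles, in line with the state reporting rules.
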
## User Experience Flow

1. User sees page immediately with browser TTS enabled (fast startup)
2. Kokoro loads in background without blocking the interface
3. Once Kokoro is ready, TTS switches to higher quality neural TTS
4. User can manually switch between providers via the UI if desired

This design prioritizes performance and user experience, making the TTS system both flexible and resource-efficient.

# Important practices

- Always ignore the following error when debugging console output: `onpage-dialog.preload.js:121 Uncaught ReferenceError: browser is not defined`. It is produced by the installed ad blocker and has nothing to do with our project.

# Text-to-Speech Synchronization Architecture

The TTS system needs to be synchronized with text animations to create a cohesive user experience. This section outlines the requirements and implementation approach.

## Transition to Game

The overlay fades away as the first scheduled animation.

- This fade animation is handled by the animation scheduler module
- Only after successful fade-out does the game loop start
- Socket connection is established and begins receiving text

## Text Buffering & Sentence Processing

1. **Text Buffer Collection**: Incoming text from sockets is collected in a buffer.
   - System can receive fragments of any size (single letters, words, sentences, paragraphs)
   - All text is accumulated in the buffer regardless of fragment size
   - Buffer handles partial/incomplete text gracefully

2. **Sentence Detection**: The buffer identifies complete sentences.
   - When full sentences are detected, they are extracted from the buffer
   - If multiple sentences arrive simultaneously, they are split and processed individually
   - Remaining partial sentences stay in the buffer until completion

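A minimal sketch of the buffering and sentence detection described above is shown below. The `SentenceBuffer` class is hypothetical, and the detection rule is a deliberate simplification: a sentence counts as complete when `.`, `!`, or `?` is followed by whitespace, so abbreviations such as "Dr. Smith" would be split incorrectly.

```javascript
// Hypothetical sentence buffer with a simplified detection rule.
class SentenceBuffer {
  constructor() {
    this.buffer = '';
  }

  // Append a fragment of any size and return the sentences that are complete.
  push(fragment) {
    this.buffer += fragment;
    const sentences = [];
    let match;
    // Terminator must be followed by whitespace, so a trailing "." waits
    // for the next fragment before the sentence is released.
    while ((match = /^\s*(.+?[.!?]+)\s+/s.exec(this.buffer)) !== null) {
      sentences.push(match[1]);
      this.buffer = this.buffer.slice(match[0].length);
    }
    return sentences;
  }

  // Return whatever is left, e.g. when the socket closes.
  flush() {
    const rest = this.buffer.trim();
    this.buffer = '';
    return rest;
  }
}
```
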
## Synchronized Playback

1. **TTS Generation Queue**: Complete sentences enter the TTS generation queue.
   - Generation begins immediately if no other sentence is being processed
   - Results are cached for immediate playback when needed

2. **Animation Timing**: Animation speed is synchronized with audio duration.
   - The system calculates animation duration to match TTS audio length exactly
   - Both animation and audio start simultaneously
   - Animation completes at the same time as audio playback

3. **Playback Pipeline**: Continuous processing of sentences.
   - As soon as one sentence completes playback, the next begins
   - Next sentence generation starts during current sentence playback
   - This creates a seamless reading experience

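The pipeline above can be sketched as a loop that starts generating the next sentence before playing the current one, and that hands the audio duration to the animation so both finish together. The injected `generate`, `play`, and `animate` functions and the `durationMs` property are assumptions; error handling is omitted to keep the sketch short.

```javascript
// Hypothetical pipeline sketch: prefetch the next sentence during playback.
async function playSentences(sentences, { generate, play, animate }) {
  if (sentences.length === 0) return;
  let pending = generate(sentences[0]);

  for (let i = 0; i < sentences.length; i += 1) {
    const audio = await pending;

    // Start generating the next sentence while the current one plays.
    pending = i + 1 < sentences.length ? generate(sentences[i + 1]) : null;

    // Audio and animation start together; the animation receives the audio
    // duration so that both complete at the same time.
    await Promise.all([
      play(audio),
      animate(sentences[i], audio.durationMs),
    ]);
  }
}
```
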
## Fast-Forward & Control Flow

1. **Fast-Forward Behavior**: User can skip current sentence.
   - Pressing the designated fast-forward key completes current animation immediately
   - TTS audio is faded out and stopped
   - System advances to next sentence

2. **Resource Management**: TTS generation is resource-conscious.
   - Uses only CPU/GPU resources not needed for animation
   - Generation process can be cancelled by fast-forward
   - System prioritizes smooth animation over TTS preparation

3. **Loading States**: Animation waits for TTS when necessary.
   - If next sentence TTS generation isn't ready when needed, animation pauses
   - Fast-forward key can skip incomplete generation
   - User is never blocked completely by TTS generation

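Cancelling generation on fast-forward could be built on the standard `AbortController`. In the sketch below, `generateSpeech` and the injected `synthesize` function are hypothetical; the sketch only stops waiting for the result, and whether the underlying engine can actually halt its computation is a separate question.

```javascript
// Hypothetical cancellable generation step.
// Rejects with "skipped" as soon as the signal is aborted.
function generateSpeech(text, signal, synthesize) {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error('skipped'));
      return;
    }
    signal.addEventListener('abort', () => reject(new Error('skipped')), {
      once: true,
    });
    synthesize(text).then(resolve, reject);
  });
}

// Assumed wiring in the browser: the fast-forward key aborts the controller.
// const controller = new AbortController();
// document.addEventListener('keydown', (e) => {
//   if (e.key === ' ') controller.abort();
// });
```
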
## Persistent Configuration

1. **Options Storage**: The persistence-manager stores TTS settings.
   - Speech on/off state is remembered
   - Speed settings are preserved between sessions
   - Voice preferences are stored

2. **Options UI**: Add an options button and modal dialog.
   - Show additional options in a modal window
   - Include volume sliders:
     - Master volume control
     - TTS volume control
     - Music volume control
     - Sound effects volume control
   - Include manual TTS system selection
   - All settings are persisted via the persistence-manager

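A possible shape for the persisted settings is sketched below. The `SettingsStore` class, the storage key, the default values, and the rule that a channel's effective volume is the master volume multiplied by the channel volume are all assumptions; the storage object only needs `getItem` and `setItem`, which `localStorage` provides in the browser.

```javascript
// Assumed defaults; the real persistence-manager may store other fields.
const DEFAULT_SETTINGS = Object.freeze({
  speechEnabled: true,
  speed: 1,
  voice: null,
  provider: null,
  volumes: { master: 1, tts: 1, music: 1, effects: 1 },
});

// Hypothetical store around any object with getItem/setItem.
class SettingsStore {
  constructor(storage, key = 'tts-settings') {
    this.storage = storage;
    this.key = key;
  }

  load() {
    const raw = this.storage.getItem(this.key);
    const saved = raw ? JSON.parse(raw) : {};
    return {
      ...DEFAULT_SETTINGS,
      ...saved,
      volumes: { ...DEFAULT_SETTINGS.volumes, ...(saved.volumes || {}) },
    };
  }

  save(settings) {
    this.storage.setItem(this.key, JSON.stringify(settings));
  }
}

// Assumed rule: master volume scales every channel.
function effectiveVolume(volumes, channel) {
  return volumes.master * volumes[channel];
}
```
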
This synchronized approach ensures that text animations and speech work together seamlessly, creating a more immersive storytelling experience while maintaining smooth performance.