# Prototype Text Pipeline Analysis

## Overview

The prototype's text processing pipeline produces justified, hyphenated typography in five stages:

1. SmartyPants (typographic punctuation)
2. Hyphenopoly (hyphenation with pipe markers)
3. Knuth-Plass algorithm (optimal line breaking)
4. Precise character-by-character width measurement
5. Justification ratio application

## Complete Text Flow (Prototype)

```
User Text
    ↓
SmartyPants.smartypantsu(text, 1)
    ↓ [Converts quotes, dashes to typographic characters]
hyphenator_en(..., '.hyphenatePipe')
    ↓ [Inserts | at hyphenation points]
kap(..., measureText, measure.toReversed(), true)
    ↓ [Knuth-Plass line breaking]
typesetParagraph(paragraph_data, delay, measure)
    ↓ [DOM creation with justification]
Rendered paragraph with proper spacing
```

## Key Components

### 1. Text Preprocessing (game.js:698)

```javascript
var preview_data = kap(
    hyphenator_en(
        SmartyPants.smartypantsu(text, 1),
        '.hyphenatePipe'  // ← CRITICAL: Uses pipe character
    ),
    measureText,
    measure.toReversed(),
    true  // ← hyphenation enabled
);
```

**Purpose**: Creates nodes with accurate widths and hyphenation points

### 2. Character Width Measurement (game.js:380-406)

```javascript
function measureText(str) {
    // Special cases
    if(str.substr(0, 2) == '</') { ... }  // Closing tag
    if(str.substr(0, 1) == '<') { ... }   // Opening tag
    if(str === '|') return 0;             // ← Hyphen point has ZERO width
    if (str === ' ') str = '\u00A0';      // Non-breaking space

    // Actual measurement
    ruler = rstack[rstack.length-1];
    let textNode = document.createTextNode(str);
    ruler.appendChild(textNode);
    let width = ruler.getClientRects()[0].width;  // ← REAL CSS width
    ruler.removeChild(textNode);
    return width;
}
```

**Key Insight**: Uses a hidden `#ruler` element to measure actual rendered widths with the exact font/size from CSS.

### 3. Measure Array (game.js:656-696)

The measure array defines the available width of successive lines:

**Chapter beginning (with drop cap)**:

```javascript
measure.push(containerWidth);                      // Full width
measure.push(containerWidth - indentWidth);        // Indented
measure.push(containerWidth - indentWidth * 0.9);  // Slightly less indented (90% of indent)
```

**Regular paragraph (indented first line)**:

```javascript
measure.push(containerWidth);                      // Full width
measure.push(containerWidth - indentWidth * 0.5);  // Half indent
```

**Important**: The array is reversed before it is passed to `kap()`: `measure.toReversed()`

### 4. Knuth-Plass Algorithm (knuth-and-plass.js:1-60)

```javascript
function kap(text, measureText, measure, hyphenation) {
    // Strip pipes if hyphenation disabled
    if (!hyphenation) {
        text = text.replace(/\|/g, '');
    }

    // Split on: punctuation, spaces, pipes, tags
    text.split(/([.,:;!?] |\s|\||<.*?>)/u).forEach(function (fragment) {
        if (fragment === ' ') {
            // Create glue with stretch/shrink
            nodes.push(linebreak.glue(spaceWidth, stretch, shrink));
        } else if (fragment === '|') {
            // Create penalty node for hyphenation point
            nodes.push(linebreak.penalty(hyphenWidth * 0.25, 100, 1));
        } else {
            // Create box node for word
            nodes.push(linebreak.box(fragmentWidth, fragment));
        }
    });

    // Run Knuth-Plass algorithm
    let breaks = linebreak(nodes, measure, { tolerance: 3, demerits });
    return { nodes, breaks };
}
```

**Node Types**:

- **box**: Word or syllable with fixed width (cannot break)
- **glue**: Space with stretch/shrink (for justification)
- **penalty**: Potential break point (such as a hyphenation point) with a cost

### 5. Rendering with Justification (game.js:295-378)

```javascript
function typesetParagraph(paragraph_data, delay = 0, measure = []) {
    // Create paragraph container
    let p = document.createElement("p");
    p.style.position = 'relative';
    p.style.height = lineHeight * (paragraph_data.breaks.length - 1) + 'px';

    // Iterate through lines
    for(let i = 1; i < paragraph_data.breaks.length; i++) {
        let left = 0;
        let ratio = paragraph_data.breaks[i].ratio;  // ← JUSTIFICATION RATIO

        // Iterate through nodes on this line
        for(let j = paragraph_data.breaks[i-1].position;
            j <= paragraph_data.breaks[i].position; j++) {
            let node = paragraph_data.nodes[j];

            if(node.type === 'box') {
                // Handle hyphenated syllables (lines 316-320)
                if(j > paragraph_data.breaks[i-1].position + 1 &&
                   paragraph_data.nodes[j-1].type === 'penalty' && lastChild) {
                    // Combine with previous syllable
                    syllable += '\u200c' + node.value;  // Zero-width non-joiner
                    lastChild.innerHTML = syllable;
                    left += node.width;
                } else {
                    // Create new word span
                    let word = document.createElement("span");
                    word.style.position = 'absolute';
                    word.style.top = lineHeight * (i - 1) * 100 / paragraph_height + '%';
                    word.style.left = left * 100 / line_width + '%';
                    word.innerHTML = node.value;
                    p.appendChild(word);
                    left += node.width;
                }
            } else if(node.type === 'glue') {
                // ← CRITICAL: Apply justification ratio to glue
                if(ratio > 0) {
                    left += node.width + ratio * node.stretch;
                } else {
                    left += node.width + ratio * node.shrink;
                }
            } else if(node.type === 'penalty' && node.penalty === 100 &&
                      j === paragraph_data.breaks[i].position) {
                // Add hyphen at line break
                let word = document.createElement("span");
                word.innerHTML = "-";
                p.appendChild(word);
            }
        }
    }
    return [p, delay];
}
```

**Key Points**:

1. **Justification**: Glue widths are adjusted by `ratio * stretch` (positive ratio) or `ratio * shrink` (negative ratio)
2. **Hyphenation**: Syllables after penalty nodes are combined with the previous word using a zero-width non-joiner
3. **Positioning**: All words use `position: absolute` with percentage-based coordinates

## Current Implementation Issues

### Issue 1: Text Processing Pipeline

**Current** (sentence-queue-module.js:266):

```javascript
const processedText = textProcessor
    ? await textProcessor.process(text)
    : text;
```

**Problem**:

- `textProcessor.process()` may not pass the correct selector to Hyphenopoly
- Hyphenopoly needs the `.hyphenatePipe` selector to mark hyphenation points with pipe characters

**Fix Needed**:

```javascript
const processedText = textProcessor
    ? await textProcessor.hyphenate(
          textProcessor.smartyPants(text),
          '.hyphenatePipe'
      )
    : text;
```

### Issue 2: Hyphenation with Pipe Character

**Current** (text-processor-module.js:275-286):

```javascript
hyphenate(text) {
    if (!this.isHyphenationAvailable()) return text;
    try {
        return this.hyphenator(text);  // ← No selector parameter
    } catch (error) {
        console.error("Error hyphenating text:", error);
        return text;
    }
}
```

**Fix Needed**: Add selector parameter

```javascript
hyphenate(text, selector = null) {
    if (!this.isHyphenationAvailable()) return text;
    try {
        return selector
            ? this.hyphenator(text, selector)
            : this.hyphenator(text);
    } catch (error) {
        console.error("Error hyphenating text:", error);
        return text;
    }
}
```

### Issue 3: Knuth-Plass Not Using Pipe Characters

**Current** (public/js/knuth-and-plass.js):

- May not properly split on pipe characters
- May not create penalty nodes

**Fix Needed**: Ensure knuth-and-plass.js matches the prototype implementation

### Issue 4: Syllable Combination in Rendering

**Current** (layout-renderer-module.js):

- Does NOT combine hyphenated syllables
- Missing logic for `if(nodes[j-1].type === 'penalty' && lastChild)`

**Fix Needed**: Add syllable combination logic when rendering box nodes after penalty nodes

### Issue 5: Missing #ruler Element

**Current**: No `#ruler` element for text measurement

**Fix Needed**:

1. Add a `#ruler` element to the HTML
2. Use the ruler for character width measurement in paragraph-layout-module.js

## Implementation Plan

### Phase 1: Fix Text Processing Pipeline

1. **Update text-processor-module.js**:
   - Add `selector` parameter to `hyphenate()` method
   - Update `process()` to pass `.hyphenatePipe` selector
2. **Update sentence-queue-module.js**:
   - Pass `.hyphenatePipe` selector when calling the text processor
   - Ensure `processedText` includes pipe characters

### Phase 2: Fix Knuth-Plass Integration

1. **Verify knuth-and-plass.js**:
   - Ensure it splits on the `\|` character
   - Creates `penalty` nodes with cost 100
   - Handles HTML tags properly
2. **Update paragraph-layout-module.js**:
   - Ensure `measureText()` returns 0 for the `|` character
   - Use `#ruler` element for measurement
   - Handle HTML tag stack properly

### Phase 3: Fix Rendering with Justification

1. **Update layout-renderer-module.js**:
   - Add syllable combination logic for hyphenated words
   - Apply justification ratios to glue widths correctly
   - Add hyphens at line breaks when a penalty node is at the break position
2. **Fix spacing issues**:
   - Create space spans with adjusted widths
   - Use zero-width non-joiner for syllable combination

### Phase 4: Testing & Refinement

1. **Test with simple text**: "This is a test."
2. **Test with hyphenation**: Long words that span lines
3. **Test with justification**: Full paragraphs
4. **Test with special characters**: Quotes, dashes, etc.

## Success Criteria

- ✅ SmartyPants converts quotes correctly
- ✅ Hyphenopoly inserts pipe characters
- ✅ Knuth-Plass creates proper breaks with hyphenation
- ✅ Words don't overlap
- ✅ Words have proper spacing (not smushed)
- ✅ Justification works (even spacing across line width)
- ✅ Hyphens appear at line breaks
- ✅ Drop caps and indentation work
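
The overlap, spacing, and justification criteria can be checked without a DOM. The following is a minimal sketch, not the prototype's code: `layoutLine` and `justificationRatio` are hypothetical helpers, and the node objects assume only the `type`, `width`, `stretch`, and `shrink` fields that `typesetParagraph` reads. The widths are made-up numbers, not values measured with `#ruler`.

```javascript
// Hypothetical helper: left/right offsets of each box on one line, applying
// the justification ratio to glue the same way typesetParagraph does.
function layoutLine(nodes, ratio) {
    const boxes = [];
    let left = 0;
    for (const node of nodes) {
        if (node.type === 'box') {
            boxes.push({ value: node.value, left: left, right: left + node.width });
            left += node.width;
        } else if (node.type === 'glue') {
            // Positive ratio stretches glue, negative ratio shrinks it.
            left += node.width + ratio * (ratio > 0 ? node.stretch : node.shrink);
        }
    }
    return { boxes: boxes, width: left };
}

// Hypothetical helper: the ratio at which the line exactly fills `measure`.
function justificationRatio(nodes, measure) {
    let natural = 0, stretch = 0, shrink = 0;
    for (const node of nodes) {
        if (node.type === 'box' || node.type === 'glue') natural += node.width;
        if (node.type === 'glue') {
            stretch += node.stretch;
            shrink += node.shrink;
        }
    }
    if (natural < measure) return stretch > 0 ? (measure - natural) / stretch : Infinity;
    if (natural > measure) return shrink > 0 ? (measure - natural) / shrink : -Infinity;
    return 0;
}

// Illustrative line for "This is a test." with assumed widths.
const box = (value, width) => ({ type: 'box', value: value, width: width });
const glue = () => ({ type: 'glue', width: 10, stretch: 4, shrink: 2 });
const line = [box('This', 40), glue(), box('is', 20), glue(),
              box('a', 10), glue(), box('test.', 40)];

const ratio = justificationRatio(line, 146);  // natural width is 140
const laid = layoutLine(line, ratio);

// No box may start before the previous one ends.
const overlaps = laid.boxes.some((b, k) => k > 0 && b.left < laid.boxes[k - 1].right);

console.log(ratio);       // 0.5
console.log(laid.width);  // 146
console.log(overlaps);    // false
```

Because the helper mirrors the glue arithmetic in `typesetParagraph`, the same check could in principle be run against real `kap()` output, comparing each line's computed width with the corresponding entry of the measure array.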