Prototype Text Pipeline Analysis
Overview
The prototype's text processing pipeline produces justified, hyphenated paragraphs with typographic punctuation through:
- SmartyPants (typographic punctuation)
- Hyphenopoly (hyphenation with pipe markers)
- Knuth-Plass algorithm (optimal line breaking)
- Precise character-by-character width measurement
- Justification ratio application
Complete Text Flow (Prototype)
User Text
↓
SmartyPants.smartypantsu(text, 1)
↓ [Converts quotes, dashes to typographic characters]
hyphenator_en(..., '.hyphenatePipe')
↓ [Inserts | at hyphenation points]
kap(..., measureText, measure.toReversed(), true)
↓ [Knuth-Plass line breaking]
typesetParagraph(paragraph_data, delay, measure)
↓ [DOM creation with justification]
Rendered paragraph with proper spacing
Key Components
1. Text Preprocessing (game.js:698)
var preview_data = kap(
hyphenator_en(
SmartyPants.smartypantsu(text, 1),
'.hyphenatePipe' // ← CRITICAL: Uses pipe character
),
measureText,
measure.toReversed(),
true // ← hyphenation enabled
);
Purpose: Creates nodes with accurate widths and hyphenation points
2. Character Width Measurement (game.js:380-406)
function measureText(str) {
// Special cases
if(str.substr(0, 2) == '</') { ... } // Closing tag
if(str.substr(0, 1) == '<') { ... } // Opening tag
if(str === '|') return 0; // ← Hyphen point has ZERO width
if (str === ' ') str = '\u00A0'; // Non-breaking space
// Actual measurement
ruler = rstack[rstack.length-1];
let textNode = document.createTextNode(str);
ruler.appendChild(textNode);
let width = ruler.getClientRects()[0].width; // ← REAL CSS width
ruler.removeChild(textNode);
return width;
}
Key Insight: Uses a hidden #ruler element to measure actual rendered widths with the exact font/size from CSS.
3. Measure Array (game.js:656-696)
The measure array defines different line widths:
Chapter beginning (with drop cap):
measure.push(containerWidth); // Full width
measure.push(containerWidth - indentWidth); // Indented
measure.push(containerWidth - indentWidth * 0.9); // Indented by 90% of indentWidth
Regular paragraph (indented first line):
measure.push(containerWidth); // Full width
measure.push(containerWidth - indentWidth * 0.5); // Half indent
Important: The array is reversed before it is passed to kap() (measure.toReversed()), so the last width pushed applies to the first line.
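To make the reversal concrete, the sketch below builds the regular-paragraph measure array and looks up the width available to a given line. The numeric values and the lineWidthFor helper are hypothetical, and reusing the last entry for every later line is an assumption about how the line breaker consumes the array, not something confirmed from the prototype.

```javascript
// Hypothetical values for illustration only.
const containerWidth = 600;
const indentWidth = 40;

// Same push order as the regular-paragraph case above.
const measure = [];
measure.push(containerWidth);                     // body lines
measure.push(containerWidth - indentWidth * 0.5); // indented first line

// Copy-and-reverse, equivalent to measure.toReversed():
// the last width pushed now comes first.
const lineWidths = measure.slice().reverse();

// Hypothetical helper: width available to line `index` (0-based).
// Assumption: lines past the end of the array reuse the last entry.
function lineWidthFor(widths, index) {
  return widths[Math.min(index, widths.length - 1)];
}
```

With these values, line 0 gets the indented width and every later line gets the full container width.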
4. Knuth-Plass Algorithm (knuth-and-plass.js:1-60)
function kap(text, measureText, measure, hyphenation) {
// Strip pipes if hyphenation disabled
if (!hyphenation) {
text = text.replace(/\|/g, '');
}
// Split on: punctuation, spaces, pipes, tags
text.split(/([.,:;!?] |\s|\||<.*?>)/u).forEach(function (fragment) {
if (fragment === ' ') {
// Create glue with stretch/shrink
nodes.push(linebreak.glue(spaceWidth, stretch, shrink));
} else if (fragment === '|') {
// Create penalty node for hyphenation point
nodes.push(linebreak.penalty(hyphenWidth * 0.25, 100, 1));
} else {
// Create box node for word
nodes.push(linebreak.box(fragmentWidth, fragment));
}
});
// Run Knuth-Plass algorithm
let breaks = linebreak(nodes, measure, { tolerance: 3, demerits });
return { nodes, breaks };
}
Node Types:
- box: Word with fixed width (cannot break)
- glue: Space with stretch/shrink (for justification)
- penalty: Potential break point (like hyphen) with cost
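The three node types can be illustrated without the DOM or the linebreak library. The following is a minimal sketch, not the prototype's code: fakeMeasure is an assumed stand-in for measureText, toNodes is a hypothetical helper, plain objects replace the linebreak.box/glue/penalty factories, and the stretch, shrink, and flagged values are assumptions.

```javascript
// Assumed stand-in for the DOM-based measureText:
// 10 units per character, zero width for the pipe marker.
function fakeMeasure(str) {
  return str === '|' ? 0 : str.length * 10;
}

// Hypothetical helper: turn pipe-marked text into box/glue/penalty nodes.
function toNodes(text, measureText) {
  const spaceWidth = measureText(' ');
  const hyphenWidth = measureText('-');
  const nodes = [];
  for (const fragment of text.split(/(\s|\|)/u)) {
    if (fragment === '') continue; // split() can yield empty strings
    if (/^\s$/u.test(fragment)) {
      // Space: flexible width used for justification (assumed stretch/shrink).
      nodes.push({ type: 'glue', width: spaceWidth, stretch: spaceWidth / 2, shrink: spaceWidth / 3 });
    } else if (fragment === '|') {
      // Hyphenation point: optional break with a cost of 100, as in the excerpt.
      nodes.push({ type: 'penalty', width: hyphenWidth * 0.25, penalty: 100, flagged: 1 });
    } else {
      // Word or syllable: fixed width, never broken.
      nodes.push({ type: 'box', width: measureText(fragment), value: fragment });
    }
  }
  return nodes;
}

const nodes = toNodes('hy|phen ate', fakeMeasure);
```

Here the word "hyphen" becomes two boxes separated by a penalty, and the space becomes a single glue node.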
5. Rendering with Justification (game.js:295-378)
function typesetParagraph(paragraph_data, delay = 0, measure = []) {
// Create paragraph container
let p = document.createElement("p");
p.style.position = 'relative';
p.style.height = lineHeight * (paragraph_data.breaks.length - 1) + 'px';
// Iterate through lines
for(let i = 1; i < paragraph_data.breaks.length; i++) {
let left = 0;
let ratio = paragraph_data.breaks[i].ratio; // ← JUSTIFICATION RATIO
// Iterate through nodes on this line
for(let j = paragraph_data.breaks[i-1].position; j <= paragraph_data.breaks[i].position; j++) {
let node = paragraph_data.nodes[j];
if(node.type === 'box') {
// Handle hyphenated syllables (lines 316-320)
if(j > paragraph_data.breaks[i-1].position + 1 &&
paragraph_data.nodes[j-1].type === 'penalty' && lastChild) {
// Combine with previous syllable
syllable += '\u200c' + node.value; // Zero-width non-joiner
lastChild.innerHTML = syllable;
left += node.width;
} else {
// Create new word span
let word = document.createElement("span");
word.style.position = 'absolute';
word.style.top = lineHeight * (i - 1) * 100 / paragraph_height + '%';
word.style.left = left * 100 / line_width + '%';
word.innerHTML = node.value;
p.appendChild(word);
left += node.width;
}
}
else if(node.type === 'glue') {
// ← CRITICAL: Apply justification ratio to glue
if(ratio > 0) {
left += node.width + ratio * node.stretch;
} else {
left += node.width + ratio * node.shrink;
}
}
else if(node.type === 'penalty' && node.penalty === 100 && j === paragraph_data.breaks[i].position) {
// Add hyphen at line break
let word = document.createElement("span");
word.innerHTML = "-";
p.appendChild(word);
}
}
}
return [p, delay];
}
Key Points:
- Justification: Glue widths are adjusted by ratio * stretch or ratio * shrink
- Hyphenation: Syllables after penalty nodes are combined with previous word using zero-width non-joiner
- Positioning: All words use position: absolute with percentage-based coordinates
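The effect of the justification ratio on word positions can be worked through with made-up numbers. This is a minimal sketch under stated assumptions: boxOffsets is a hypothetical helper that mirrors only the glue arithmetic from typesetParagraph, and the widths, stretch, and shrink values are invented for illustration.

```javascript
// Hypothetical helper: left offset of each box on one line,
// applying the justification ratio to glue as described above.
function boxOffsets(lineNodes, ratio) {
  const offsets = [];
  let left = 0;
  for (const node of lineNodes) {
    if (node.type === 'box') {
      offsets.push(left);
      left += node.width;
    } else if (node.type === 'glue') {
      // A positive ratio stretches the space, a negative ratio shrinks it.
      left += node.width + ratio * (ratio > 0 ? node.stretch : node.shrink);
    }
  }
  return { offsets, lineEnd: left };
}

// Made-up line: three 40-unit words separated by 10-unit spaces.
const line = [
  { type: 'box', width: 40 },
  { type: 'glue', width: 10, stretch: 6, shrink: 3 },
  { type: 'box', width: 40 },
  { type: 'glue', width: 10, stretch: 6, shrink: 3 },
  { type: 'box', width: 40 },
];

const natural = boxOffsets(line, 0);   // offsets [0, 50, 100], lineEnd 140
const stretched = boxOffsets(line, 1); // offsets [0, 56, 112], lineEnd 152
const shrunk = boxOffsets(line, -1);   // offsets [0, 47, 94],  lineEnd 134
```

If the glue branch is skipped or the ratio is ignored, every line keeps its natural width, which would produce ragged rather than justified lines.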
Current Implementation Issues
Issue 1: Text Processing Pipeline
Current (sentence-queue-module.js:266):
const processedText = textProcessor ? await textProcessor.process(text) : text;
Problem:
- textProcessor.process() may not pass the correct selector to Hyphenopoly
- Hyphenopoly needs the .hyphenatePipe selector to use pipe characters
Fix Needed:
const processedText = textProcessor ?
await textProcessor.hyphenate(
textProcessor.smartyPants(text),
'.hyphenatePipe'
) : text;
Issue 2: Hyphenation with Pipe Character
Current (text-processor-module.js:275-286):
hyphenate(text) {
if (!this.isHyphenationAvailable()) return text;
try {
return this.hyphenator(text); // ← No selector parameter
} catch (error) {
console.error("Error hyphenating text:", error);
return text;
}
}
Fix Needed: Add selector parameter
hyphenate(text, selector = null) {
if (!this.isHyphenationAvailable()) return text;
try {
return selector ?
this.hyphenator(text, selector) :
this.hyphenator(text);
} catch (error) {
console.error("Error hyphenating text:", error);
return text;
}
}
Issue 3: Knuth-Plass Not Using Pipe Characters
Current (public/js/knuth-and-plass.js):
- May not properly split on pipe characters
- May not create penalty nodes
Fix Needed: Ensure knuth-and-plass.js matches prototype implementation
Issue 4: Syllable Combination in Rendering
Current (layout-renderer-module.js):
- Does NOT combine hyphenated syllables
- Missing logic for if(nodes[j-1].type === 'penalty' && lastChild)
Fix Needed: Add syllable combination logic when rendering box nodes after penalty nodes
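The combination rule can be sketched as a pure function that is easy to test before wiring it into the DOM code. wordsForLine is a hypothetical helper, not part of layout-renderer-module.js: it returns the strings that would become spans on one line, given the node array and the inclusive start and end positions taken from two consecutive breaks.

```javascript
// Hypothetical helper: group the nodes of one line into rendered words.
// A box that directly follows a penalty is appended to the previous word
// with a zero-width non-joiner instead of starting a new span.
function wordsForLine(nodes, start, end) {
  const words = [];
  for (let j = start; j <= end; j++) {
    const node = nodes[j];
    if (node.type === 'box') {
      const afterPenalty = j > start + 1 && nodes[j - 1].type === 'penalty';
      if (afterPenalty && words.length > 0) {
        words[words.length - 1] += '\u200c' + node.value;
      } else {
        words.push(node.value);
      }
    } else if (node.type === 'penalty' && node.penalty === 100 && j === end) {
      // The line breaks at this hyphenation point, so a hyphen is shown.
      words.push('-');
    }
  }
  return words;
}

// Made-up nodes for "ty|pog|ra|phy mat|ters", broken after "mat".
const box = (value) => ({ type: 'box', value });
const pen = { type: 'penalty', penalty: 100 };
const glue = { type: 'glue' };
const sample = [box('ty'), pen, box('pog'), pen, box('ra'), pen, box('phy'),
                glue, box('mat'), pen, box('ters')];

const firstLine = wordsForLine(sample, 0, 9);   // break at the penalty, index 9
const secondLine = wordsForLine(sample, 9, 10); // continues after the break
```

On the first line the four syllables of "typography" collapse into one word, "mat" starts a new word, and the break penalty contributes the visible hyphen; the second line starts a fresh word with "ters".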
Issue 5: Missing #ruler Element
Current: No #ruler element for text measurement
Fix Needed:
- Add <div id="ruler"></div> to HTML
- Use ruler for character width measurement in paragraph-layout-module.js
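One way to provide the ruler is to create it from script rather than editing the HTML. The sketch below is an assumption about how that could look, not the prototype's setup: createRuler is a hypothetical helper and the inline styles are a guess at sensible defaults. The element is hidden with visibility rather than display: none, because getClientRects() returns no rectangles for an element that is not rendered. The ruler should also inherit the same font and size as the paragraph text, otherwise the measured widths will not match.

```javascript
// Hypothetical helper: create a hidden measuring element.
// `doc` is the document to attach to (pass `document` in the browser).
function createRuler(doc) {
  const ruler = doc.createElement('div');
  ruler.id = 'ruler';
  ruler.style.position = 'absolute';  // shrink-to-fit, out of normal flow
  ruler.style.visibility = 'hidden';  // invisible but still laid out
  ruler.style.whiteSpace = 'pre';     // keep spaces, never wrap
  doc.body.appendChild(ruler);
  return ruler;
}
```

In the browser this would be called once as createRuler(document) before the first call to measureText.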
Implementation Plan
Phase 1: Fix Text Processing Pipeline
- Update text-processor-module.js:
  - Add selector parameter to hyphenate() method
  - Update process() to pass .hyphenatePipe selector
- Update sentence-queue-module.js:
  - Pass .hyphenatePipe selector when calling text processor
  - Ensure processedText includes pipe characters
Phase 2: Fix Knuth-Plass Integration
- Verify knuth-and-plass.js:
  - Ensure it splits on \| character
  - Creates penalty nodes with cost 100
  - Handles HTML tags properly
- Update paragraph-layout-module.js:
  - Ensure measureText() returns 0 for | character
  - Use #ruler element for measurement
  - Handle HTML tag stack properly
Phase 3: Fix Rendering with Justification
- Update layout-renderer-module.js:
  - Add syllable combination logic for hyphenated words
  - Apply justification ratios to glue widths correctly
  - Add hyphens at line breaks when penalty node is at break position
- Fix spacing issues:
  - Create space spans with adjusted widths
  - Use zero-width non-joiner for syllable combination
Phase 4: Testing & Refinement
- Test with simple text: "This is a test."
- Test with hyphenation: Long words that span lines
- Test with justification: Full paragraphs
- Test with special characters: Quotes, dashes, etc.
Success Criteria
✅ SmartyPants converts quotes correctly
✅ Hyphenopoly inserts pipe characters
✅ Knuth-Plass creates proper breaks with hyphenation
✅ Words don't overlap
✅ Words have proper spacing (not smushed)
✅ Justification works (even spacing across line width)
✅ Hyphens appear at line breaks
✅ Drop caps and indentation work
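The "words don't overlap" criterion can be checked mechanically rather than by eye. The following is a minimal sketch: findOverlaps is a hypothetical helper, and it assumes each word on a line has already been reduced to a { left, width } pair in the same unit (for example, pixels read from the rendered spans).

```javascript
// Hypothetical check for the "words don't overlap" criterion.
// `spans` holds { left, width } for the words of a single line, in any order.
// Returns the neighbouring pairs that overlap; an empty array means no overlap.
function findOverlaps(spans, epsilon = 0.01) {
  const sorted = [...spans].sort((a, b) => a.left - b.left);
  const overlaps = [];
  for (let i = 1; i < sorted.length; i++) {
    const prev = sorted[i - 1];
    // Allow a small tolerance for sub-pixel rounding.
    if (prev.left + prev.width > sorted[i].left + epsilon) {
      overlaps.push([prev, sorted[i]]);
    }
  }
  return overlaps;
}

const cleanLine = findOverlaps([{ left: 0, width: 40 }, { left: 50, width: 40 }]);
const brokenLine = findOverlaps([{ left: 50, width: 40 }, { left: 0, width: 60 }]);
```

Only neighbours in sorted order are compared, which is enough to detect whether any overlap exists on the line.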