Architecture
Layers at a glance
| Layer | Files | Role |
|---|---|---|
| Content scripts | content/, shared/ (chunker, i18n) | Extract, chunk, play, highlight — runs in the host page |
| TTS fetch | background/ (extension) or shared/tts-client.js (widget) | Talks to the TTS API; prefetches upcoming chunks |
| Embed server | embed/ | Serves the widget bundle at webreader.abair.ie |
| i18n | locales/resources.json + shared/i18n.js | Translation strings — source of truth in resources.json, managed via the Translations Editor |
| Manifests | manifest.json / manifest-firefox.json | Chrome MV3 + Firefox MV2 (extension only) |
The extension and embed widget share the entire content-script layer. Only the TTS fetch path and the host glue differ.
Reading pipeline
When Read this page fires:
- Pick a container: first match wins among article → main → [role="main"] → .post-content → … → body. Anything outside is silent.
- Collect candidates: one querySelectorAll('p, h1–h6, li, td, th, blockquote, button, a, figcaption') (plus img[alt] if Announce media is on).
- Filter: drop hidden, in-panel, in-Shadow-DOM, and elements with textContent.trim().length <= 5.
- Chunk each candidate by sentence (default) or paragraph; the chunker handles abbreviations and inline <a> annotations.
- Pause between chunks: headings 600 ms, block prose 250 ms, other 0.
- Prefetch + play: TTS the next two while the current plays; highlighter syncs words to DOM via XPath.
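
The container, pause, and prefetch steps above can be sketched as plain functions. This is only an illustration under stated assumptions, not the code in content/content.js: the names pickContainer, pauseAfter, and prefetchWindow are hypothetical, and the set of tags treated as "block prose" is a guess, since the pipeline description does not list them.

```javascript
// Hypothetical helpers for illustration; none of these names come from the codebase.

const HEADING = /^h[1-6]$/;
// Assumed membership: the pipeline description does not say which tags count as block prose.
const BLOCK_PROSE = new Set(["p", "li", "blockquote", "figcaption"]);

// "First match wins": walk the priority list and return the first element found.
// `doc` is anything with a querySelector method (normally `document`).
function pickContainer(doc, selectors) {
  for (const selector of selectors) {
    const match = doc.querySelector(selector);
    if (match) return match;
  }
  return null;
}

// Pause in milliseconds after a chunk, keyed on the tag of its source element.
function pauseAfter(tagName) {
  const tag = tagName.toLowerCase();
  if (HEADING.test(tag)) return 600;
  if (BLOCK_PROSE.has(tag)) return 250;
  return 0;
}

// Indices of the chunks to prefetch while chunk `current` is playing.
function prefetchWindow(current, total, lookahead = 2) {
  const indices = [];
  for (let i = current + 1; i <= current + lookahead && i < total; i++) {
    indices.push(i);
  }
  return indices;
}
```

In a page, pickContainer would receive document and the full selector priority list. The lookahead default of 2 matches "the next two" in the pipeline description, and the window shrinks naturally near the end of the chunk list.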
Things integrators must know
- Reading order = DOM order. CSS reordering (flex order, grid placement, absolute positioning) is invisible.
- Nested <a> is deduplicated. Card patterns (<a> wrapping <h2> + <p>) collapse to one chunk. Inline links in prose inject "Link: " into the parent's audio instead of duplicating.
- Still duplicated: <button> in <p>/<li>, nested <li>, and bare-link containers (<h2><a>X</a></h2>).
- Form fields are silent. <input>, <select>, <textarea>, and <label> aren't in the selector. Describe forms in a sibling <p> if listeners need them.
- aria-hidden="true" is ignored. Only display and visibility filter elements.
- 5-character cutoff. Elements whose trimmed text is 5 characters or fewer, such as single letters and short labels ("OK", "Go"), are silently dropped.
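
As a rough pre-flight check, an integrator could model two of the rules above (selector membership and the length cutoff) to see which elements would be skipped. The helper name silentReason and the { tag, text } input shape are hypothetical. The sketch checks each element in isolation and does not model visibility, panel or Shadow DOM filtering, or link deduplication.

```javascript
// Tags named in the documented candidate selector. img[alt] is left out
// because it only applies when Announce media is on.
const READ_TAGS = new Set([
  "p", "h1", "h2", "h3", "h4", "h5", "h6",
  "li", "td", "th", "blockquote", "button", "a", "figcaption",
]);

// Hypothetical helper: returns why an element would be silent, or null if it
// passes the two checks modelled here.
function silentReason({ tag, text }) {
  if (!READ_TAGS.has(tag.toLowerCase())) return "tag not in candidate selector";
  if (text.trim().length <= 5) return "text is 5 characters or fewer";
  return null;
}

// Example input; in a browser these could be built from element.tagName
// and element.textContent.
const report = [
  { tag: "LABEL", text: "Email address" },
  { tag: "BUTTON", text: "Go" },
  { tag: "P", text: "Enter your email address below." },
].map((el) => ({ ...el, reason: silentReason(el) }));

console.log(report);
```

Here the label and the short button are flagged, while the sibling paragraph passes, which is the workaround suggested for forms.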
Drilling in
| Want to understand… | Read |
|---|---|
| Smart link handling | shared/chunker.js → buildInlineAnnotatedText |
| Pipeline orchestration | content/content.js → getTextNodes, playCurrentChunk |
| Word-level highlight sync | content/highlighter.js → startWordSync |
| Extension boot vs widget boot | background/service-worker.js vs embed/src/boot.js |
| Message API between content scripts and background | synthesize, prefetch, cancelPrefetch, audioData, audioError |