Fotheidil System Architecture
A high-level map of how the Fotheidil subtitling/transcription product fits together — the frontend, the media-processing API, the recognition pipeline, the database, and how a request flows end to end.
Components
| Component | Repo | Runs on | Port | Role |
|---|---|---|---|---|
| Frontend | fotheidil | services VM 10.0.0.2 (host-mode container) | 3003 | Next.js 14 (App Router) UI. Uploads media, renders progress, and hosts the transcript editor. |
| API | fotheidil-api | fotheidil VM 10.0.0.3 (container) | 4040 | Express. Receives uploads, runs ffmpeg (extract audio / compress video), calls recognition, writes state to Supabase. |
| Recognition frontend | fotheidil-transcribe | recognition VM 10.0.0.8 (systemd, bare metal) | 6060 | FastAPI. Preprocessing + entry point; tunnels to Banba for the GPU pipeline. |
| Recognition GPU pipeline | — | Banba phoneticsrv3.lcs.tcd.ie (134.226.98.116) | 8000 | NeMo ASR + Pyannote diarization + MarianMT capitalisation/punctuation. |
| Database & Auth | (managed) | Supabase Cloud pdntukcptgktuzpynlsv.supabase.co | — | Postgres (fot_video_uploads), Auth, Realtime, Storage. |
| Central Auth (SSO) | auth (auth-system) | services VM 10.0.0.2 (static SPA via NGINX) | — | auth.abair.ie. ABAIR's shared Vite/React SSO. Logs the user in against Supabase Auth and redirects back with the access/refresh tokens the frontend forwards to the API. |
Network flow
Request lifecycle (upload → transcript)
- Upload: the browser POSTs to fotapi.abair.ie/upload (multipart: file, fileName, userId, plus the Supabase accessToken/refreshToken from auth.abair.ie). multer writes the raw bytes to uploads/ under a random hash name.
- Persist: the API sets the Supabase session, rejects duplicates, inserts a fot_video_uploads row, then renames the upload to uploads/{id}-{name}. It responds 200 immediately and does the rest asynchronously.
- Audio extraction: ffmpeg reads uploads/{id}-{name} and writes processed-wav/{id}-{name}.wav (16 kHz mono; plus a .webm in processed-webm/ for audio-only uploads). A live progress log in tmp/ is polled every ~3 s to update audio_extraction_progress.
- Recognition: the API sends processed-wav/{id}-{name}.wav to http://10.0.0.8:6060/generate_transcripts/ (tunnelled to Banba's NeMo/Pyannote/MarianMT pipeline, which holds its own short-lived temp copy). The diarized, punctuated transcript is stored in ASR_output; the .wav stays on disk.
- Video compression (video only): ffmpeg compresses uploads/{id}-{name} into processed-webm/{id}-{name}.webm for playback; its tmp/ progress log feeds video_compression_progress and is deleted on completion.
- Result delivery: the browser never re-polls; a Supabase Realtime subscription pushes each progress field and the final transcript live. The compressed .webm is streamed on demand from fotapi.abair.ie/videos/{id}-{name}.webm.
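
The upload step can be illustrated with a small helper that assembles the multipart body. This is a minimal sketch, not the frontend's actual code: the function name buildUploadForm is hypothetical, the field names are taken from the upload step above, and the https:// scheme in the usage comment is an assumption.

```javascript
// Hypothetical helper: assemble the multipart body for POST /upload.
// Field names follow the upload step; the media is passed as a Blob or File.
function buildUploadForm({ file, fileName, userId, accessToken, refreshToken }) {
  const form = new FormData();
  form.append("file", file, fileName); // third argument sets the part's filename
  form.append("fileName", fileName);
  form.append("userId", userId);
  form.append("accessToken", accessToken);
  form.append("refreshToken", refreshToken);
  return form;
}

// Usage in a browser (not executed here; the https scheme is assumed):
//   const form = buildUploadForm({ file, fileName: file.name, userId, accessToken, refreshToken });
//   await fetch("https://fotapi.abair.ie/upload", { method: "POST", body: form });
```

Because the API answers 200 before processing finishes, a caller built this way would treat the response only as an acknowledgement and rely on the Realtime subscription for everything after it.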
Files in all four directories persist after processing until POST /upload/delete removes them
(and kills any still-running ffmpeg PIDs).
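
The progress polling used by the audio extraction and video compression steps can be sketched as a pure parsing function. This is a sketch under stated assumptions: it assumes the tmp/ log holds ffmpeg's -progress key=value output (the source does not say which log format is used) and that the media duration in seconds is known, for example from media_length. The name progressPercent is hypothetical.

```javascript
// Hypothetical helper: turn the text of an ffmpeg progress log into a 0-100 value.
// Assumption: the log contains `-progress` style lines such as
//   out_time_us=5000000
//   progress=continue
function progressPercent(logText, mediaLengthSeconds) {
  if (!(mediaLengthSeconds > 0)) return 0; // unknown duration: report nothing yet
  let latestMicros = 0;
  let finished = false;
  for (const line of logText.split("\n")) {
    const [key, value] = line.trim().split("=");
    if (key === "out_time_us" && /^\d+$/.test(value)) {
      latestMicros = Number(value); // later blocks overwrite earlier ones
    } else if (key === "progress" && value === "end") {
      finished = true;
    }
  }
  if (finished) return 100;
  const percent = (latestMicros / 1e6 / mediaLengthSeconds) * 100;
  // Hold at 99 until ffmpeg reports the end, so rounding never shows 100 early.
  return Math.min(99, Math.max(0, Math.round(percent)));
}
```

A poller running every few seconds could read the log, call this function, and write the result to audio_extraction_progress or video_compression_progress.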
Data store
The single source of truth is the fot_video_uploads table in Supabase Cloud. Key columns:
- Identity: id, user_id, name, original_filetype
- Progress / lifecycle: upload_state, audio_extraction_progress, video_compression_progress, recognition_progress, *_start/*_end timestamps, media_length
- Process control: ffmpeg_extract_audio_process_id, ffmpeg_compress_video_process_id (used by cancellation)
- Output: ASR_output (raw transcript), edited_ASR_output (editable copy, seeded from ASR_output), transcript_percentages, permission_given (consent to reuse data for ABAIR ASR/TTS).
Row-Level Security scopes rows to the authenticated user_id, which is why every API operation
re-establishes the Supabase session from the forwarded tokens.
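
To show how the progress and output columns might be consumed on the client, here is a minimal sketch of a reducer that merges one Realtime change into the view state for a single upload. The function name applyRowChange and the viewState shape are hypothetical. The only thing assumed about the event is that the updated row arrives under payload.new, as in Supabase Realtime Postgres change events.

```javascript
// Columns the UI tracks; names are taken from the fot_video_uploads column list.
const TRACKED_COLUMNS = [
  "upload_state",
  "audio_extraction_progress",
  "video_compression_progress",
  "recognition_progress",
  "ASR_output",
];

// Hypothetical reducer: returns a new view state with any tracked columns
// present in the changed row copied over. Events for other rows are ignored.
function applyRowChange(viewState, payload) {
  const row = payload && payload.new;
  if (!row || row.id !== viewState.id) return viewState; // not this upload
  const next = { ...viewState };
  for (const column of TRACKED_COLUMNS) {
    // Skip columns that are absent or still null (e.g. ASR_output before recognition ends).
    if (row[column] !== undefined && row[column] !== null) {
      next[column] = row[column];
    }
  }
  return next;
}
```

Keeping the merge pure makes it easy to call from whatever subscription callback the frontend uses, without tying the logic to a specific client library version.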
Media file storage
Supabase holds only metadata and transcripts; the media files themselves live on disk on the
fotheidil VM (10.0.0.3). A dedicated ext4 disk (/dev/sdb1, ~503 GB) is mounted at
/mnt/fotheidil-data and bind-mounted into the API container at /app/src/data (the paths in
src/config/paths.js). It contains four working directories:
| Directory | Contents | Maps to (paths.js) |
|---|---|---|
| uploads/ | Raw uploaded files as received by multer (random hash names, then renamed to {id}-{name}). | uploadMediaPath |
| tmp/ | ffmpeg progress logs ({id}-{name}.audio_extraction.log, .video_compression.log) polled to update progress. | progressPath |
| processed-wav/ | Extracted 16 kHz mono WAVs ({id}-{name}.wav); the file sent to recognition. | processedWavPath |
| processed-webm/ | Compressed WebM ({id}-{name}.webm); served back for in-browser playback. | processedWebmPath |
Because storage is a host bind mount, these files persist across container restarts and
redeploys. Cleanup is explicit: POST /upload/delete removes a row's files from all four
directories (see the cancellation note above).
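
The naming scheme in the table can be summarised in a few lines of code. This is an illustrative sketch only: the function filesForUpload and its return shape are hypothetical and do not reproduce src/config/paths.js, whose contents are not shown here. The root is the container-side mount point named above.

```javascript
// Container-side mount point of the data disk, as described above.
const DATA_ROOT = "/app/src/data";

// Hypothetical helper: list every file one upload can leave behind,
// following the {id}-{name} naming scheme from the table.
function filesForUpload(id, name, root = DATA_ROOT) {
  const stem = `${id}-${name}`;
  return {
    upload: `${root}/uploads/${stem}`,
    audioLog: `${root}/tmp/${stem}.audio_extraction.log`,
    videoLog: `${root}/tmp/${stem}.video_compression.log`,
    wav: `${root}/processed-wav/${stem}.wav`,
    webm: `${root}/processed-webm/${stem}.webm`,
  };
}
```

A cleanup routine in the spirit of POST /upload/delete could iterate over Object.values(filesForUpload(id, name)) and remove whichever of those files exist.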