
Dictation API

Overview

This project provides an automatic speech recognition (ASR) system specifically designed for the Irish language. Built on the Vosk speech recognition toolkit, it enables real-time transcription of spoken Irish through a web interface. The system utilizes WebRTC technology to capture audio from users' browsers and process it on a server running a Kaldi-based speech recognition model trained for Irish.

Deployment

Location

As of 23/07/2025, the running Vosk dictation server is located in /home/mrozo/dictation.

deploy_dictation_server.sh

A script in the project folder automates updating the production server. It does the following:

  1. Pulls the latest changes from Git
  2. Builds a new Docker image
  3. Deploys the new Docker image if the build succeeds
  4. Tails the container's logs so you can verify that the server is running

In the future, this should be replaced with the ABAIR CI/CD pipeline.

#!/bin/bash

set -e # exit on error

echo "Pulling latest changes from Git..."
if ! git pull; then
    echo "Git pull failed, aborting."
    exit 1
fi
echo "Git pull successful."

echo "Building Docker image: dictation-asr-server"
sudo docker buildx build -t dictation-asr-server .

echo "Starting containers with docker compose"
sudo docker compose up -d

echo "Finding container ID for dictation-asr-server"
# The "ancestor" filter matches on image name; adjust it if your image name differs
CONTAINER_ID=$(sudo docker ps -qf "ancestor=dictation-asr-server")

if [ -z "$CONTAINER_ID" ]; then
    echo "Could not find a running container for dictation-asr-server"
    exit 1
fi

echo "Tailing logs for container: $CONTAINER_ID"
sudo docker logs -f "$CONTAINER_ID"
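
Step 4 of the script relies on reading the log output by eye. As a complement, a scripted check can confirm that the server port accepts TCP connections. The following is a minimal sketch, assuming the server is reachable on localhost:2700 as published in the compose file; `wait_for_port` is a hypothetical helper and not part of the project.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    This only shows that something is listening on the port; it does not
    prove that the speech model has loaded or that transcription works.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the connection is refused or times out
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 2700)` could be called after `docker compose up -d`, with a non-zero exit code returned when the result is `False`.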

Dockerfile

The project contains the following Dockerfile, which is used to build the image:

# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /usr/src/app

# Install libatomic1 and other dependencies
RUN apt-get update && apt-get install -y \
    libatomic1 \
    && rm -rf /var/lib/apt/lists/*

# Copy the current directory contents into the container at /usr/src/app
COPY . .

# Install any needed packages specified in requirements.txt
# Note: Make sure you have a requirements.txt file with all the dependencies.
RUN pip install --no-cache-dir -r requirements.txt

# Make port 2700 available to the world outside this container
EXPOSE 2700

# Define environment variable
ENV VOSK_MODEL_PATH=/usr/src/app/model

# Copy the static directory
COPY static /usr/src/app/static

# Run asr_server.py when the container launches
CMD ["python", "./asr_server.py"]
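
The image sets VOSK_MODEL_PATH to /usr/src/app/model, and `COPY . .` only populates that path if a model directory is present in the build context. A minimal sketch of a pre-flight check is shown here; `check_model_dir` is a hypothetical helper, and it only tests that the directory exists and is not empty, not that it holds a valid model.

```python
import os
from typing import List


def check_model_dir(path: str) -> List[str]:
    """Return a list of problems found with the model directory (empty if none)."""
    problems: List[str] = []
    if not os.path.isdir(path):
        problems.append(f"{path} does not exist or is not a directory")
    elif not os.listdir(path):
        problems.append(f"{path} is empty")
    return problems
```

It might be called as `check_model_dir(os.environ.get("VOSK_MODEL_PATH", "/usr/src/app/model"))` before the server starts, aborting if the returned list is non-empty.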

docker-compose.yaml

At the time of writing, the project runs under Docker Compose on the server. When the deploy script runs docker compose up -d, the container is recreated from the latest image.

version: '3.8'

services:
  asr-server:
    # Build the image from the current directory (requires a Dockerfile here)
    build: .
    container_name: asr-server
    hostname: asr-server
    # Expose the ASR server port (adjust if you use a different port)
    ports:
      - "2700:2700"
    environment:
      # Bind to all interfaces inside the container
      VOSK_SERVER_INTERFACE: "0.0.0.0"
      # The port your ASR server listens on (default 2700)
      VOSK_SERVER_PORT: "2700"
      # Path to your Vosk model directory inside the container
      VOSK_MODEL_PATH: "/usr/src/app/model"
      # Optional: where to dump raw audio (omit or set to empty to disable)
      VOSK_DUMP_FILE: "/app/dump/audio.raw"
    volumes:
      # Mount your local model directory (read-only)
      # Note: VOSK_MODEL_PATH above points to /usr/src/app/model, not to this mount point
      - ./model:/app/model:ro
      # Optional: mount a host directory for audio dumps
      - ./dump:/app/dump:rw
    restart: unless-stopped
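
The environment block above is how the container is configured at start-up. As an illustration, a server process could read these variables as in the following minimal sketch. This is not taken from asr_server.py: `ServerConfig` and `load_config` are hypothetical names, and the fallback values other than port 2700 are assumptions.

```python
import os
from typing import Mapping, NamedTuple, Optional


class ServerConfig(NamedTuple):
    interface: str
    port: int
    model_path: str
    dump_file: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Build a ServerConfig from VOSK_* environment variables."""
    return ServerConfig(
        interface=env.get("VOSK_SERVER_INTERFACE", "0.0.0.0"),  # assumed default
        port=int(env.get("VOSK_SERVER_PORT", "2700")),
        model_path=env.get("VOSK_MODEL_PATH", "model"),  # assumed default
        # An unset or empty VOSK_DUMP_FILE disables audio dumping
        dump_file=env.get("VOSK_DUMP_FILE") or None,
    )
```

Passing the environment as a mapping keeps the function easy to test, for example `load_config({"VOSK_SERVER_PORT": "2701"})`.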

Notes