In Active Development

Every Book.
Every Voice.
Free for Everyone.

My father is visually impaired. Getting him access to books, meaning any book he wants and not just the titles that happen to exist as audiobooks, has always been frustrating, expensive, and technically broken. So I decided to build the tool I wish existed.

The result is a fully accessible, free, open-source web tool that converts any book, whether an EPUB, a PDF, or even a photo of a printed page, into high-quality audio. It is built from the ground up around the needs of screen reader users.

The Problem

The visually impaired community faces barriers that most of us never think about.

Audiobooks are expensive

Commercial audiobooks typically cost $15–$30 each, and subscriptions like Audible add up fast. For a retired person on a fixed income, that is often out of reach.

Most books don't have audio versions

Publishers mostly produce audio for bestsellers. Millions of educational, religious, technical, and cultural books exist only as text, which leaves them invisible to blind readers.

Existing tools are inaccessible

The irony: many free book converters are themselves unusable by blind users, with cluttered interfaces, no keyboard navigation, and no screen reader support.

Arabic content is underserved

Arabic-language audiobooks are rare even in commercial markets. For Arabic-speaking visually impaired users, there are almost no options.

How It Works

Four stages from any book format to a ready-to-listen audio file.

Upload

Drag and drop, or use a voice command, to upload an EPUB, PDF, TXT, or even a photo of a printed page. The interface is designed for keyboard-only and screen reader users first.

EPUB, PDF, TXT, Image / Photo

Extract Text

The backend picks an extractor based on the file format: PDFs are parsed with PyMuPDF, EPUBs with EbookLib, and photos of physical book pages go through Tesseract OCR.

EbookLib, PyMuPDF, Tesseract OCR
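The format-based extraction described in this step could look roughly like the sketch below. It is an illustration under assumptions, not the project's actual code: the function names, the extension table, and the use of the pytesseract and Pillow wrappers around Tesseract are all hypothetical. Third-party libraries are imported lazily so the dispatcher itself has no hard dependencies.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document (EPUB chapters are XHTML)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip tags and collapse whitespace. Naive: script/style text is kept."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join("".join(collector.parts).split())


def extract_pdf(path: Path) -> str:
    import fitz  # PyMuPDF

    with fitz.open(str(path)) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_epub(path: Path) -> str:
    import ebooklib
    from ebooklib import epub

    book = epub.read_epub(str(path))
    chapters = []
    for item in book.get_items_of_type(ebooklib.ITEM_DOCUMENT):
        html = item.get_content().decode("utf-8", errors="ignore")
        chapters.append(html_to_text(html))
    return "\n\n".join(chapters)


def extract_image(path: Path) -> str:
    # Assumption: Tesseract is reached through pytesseract + Pillow.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(path))


def extract_txt(path: Path) -> str:
    return path.read_text(encoding="utf-8", errors="replace")


# Hypothetical extension table; the real project may accept other types.
EXTRACTORS = {
    ".pdf": extract_pdf,
    ".epub": extract_epub,
    ".txt": extract_txt,
    ".png": extract_image,
    ".jpg": extract_image,
    ".jpeg": extract_image,
}


def extract_text(path) -> str:
    """Pick an extractor from the file extension and return plain text."""
    path = Path(path)
    extractor = EXTRACTORS.get(path.suffix.lower())
    if extractor is None:
        raise ValueError(f"Unsupported format: {path.suffix or path.name}")
    return extractor(path)
```

Dispatching on the extension keeps each format's dependency isolated, so a deployment without Tesseract installed can still handle EPUB, PDF, and TXT.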

Text-to-Speech

Clean text is sent through a TTS engine. The default is Edge TTS, which provides Microsoft's free, lifelike AI voices, with fallback support for Google Cloud TTS (free tier) and offline Coqui TTS.

Edge TTS, Google TTS, Coqui TTS
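One way to wire up the primary engine and its fallbacks is a simple ordered chain, sketched below. This is a minimal illustration, assuming the community `edge-tts` Python package for the primary engine; `synthesize_with_fallback`, the engine signature, and the example voice name are hypothetical, and the Google Cloud and Coqui engines are left as placeholders to be written in the same shape.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, Tuple

# An engine takes (text, output_path) and writes an audio file.
Engine = Callable[[str, str], Awaitable[None]]


async def edge_engine(text: str, out_path: str) -> None:
    """Primary engine, assuming the third-party `edge-tts` package."""
    import edge_tts  # imported lazily; an ImportError triggers the fallback

    # The voice name is an example, not a project setting.
    await edge_tts.Communicate(text, "en-US-AriaNeural").save(out_path)


async def synthesize_with_fallback(
    text: str,
    out_path: str,
    engines: Sequence[Tuple[str, Engine]],
) -> str:
    """Try each engine in order and return the name of the one that worked."""
    errors = []
    for name, engine in engines:
        try:
            await engine(text, out_path)
            return name
        except Exception as exc:  # network failure, quota, missing package
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("All TTS engines failed: " + "; ".join(errors))


if __name__ == "__main__":
    # Only Edge TTS is wired in here; Google Cloud TTS and Coqui TTS
    # engines would be appended to this list in fallback order.
    used = asyncio.run(
        synthesize_with_fallback("Hello, reader.", "hello.mp3", [("edge", edge_engine)])
    )
    print(f"Audio written by: {used}")
```

Because every engine shares one signature, the order of the list is the whole fallback policy, and an offline engine can sit last as the option that never depends on the network.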

Listen or Download

The audio is streamed in-browser or downloaded as MP3 chapters. An aria-live announcement tells screen readers the moment the audio is ready — no manual refresh needed.

MP3 Output, Chapter Split, aria-live
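Splitting the extracted text into chapters before synthesis might be done along these lines. The heading pattern, the function names, and the `chapter_NN.mp3` naming scheme are assumptions for illustration; real books need more robust detection (EPUBs, for instance, already carry their own chapter structure).

```python
import re
from typing import List, Tuple

# Assumed heading style: a line that starts with "Chapter <something>".
# A body line that happens to begin with "Chapter ..." would also match.
CHAPTER_HEADING = re.compile(r"^(chapter[ \t]+\w+.*)$", re.IGNORECASE | re.MULTILINE)


def split_chapters(text: str) -> List[Tuple[str, str]]:
    """Return (title, body) pairs; text before the first heading is kept."""
    matches = list(CHAPTER_HEADING.finditer(text))
    if not matches:
        return [("Full text", text.strip())]

    chapters = []
    front_matter = text[: matches[0].start()].strip()
    if front_matter:
        chapters.append(("Front matter", front_matter))

    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters.append((match.group(1).strip(), text[match.end():end].strip()))
    return chapters


def chapter_filename(index: int) -> str:
    """Hypothetical naming scheme for the per-chapter MP3 files."""
    return f"chapter_{index:02d}.mp3"


if __name__ == "__main__":
    sample = "Chapter 1: Dawn\nIt began.\nChapter 2\nIt ended.\n"
    for number, (title, body) in enumerate(split_chapters(sample), start=1):
        print(chapter_filename(number), "|", title, "|", len(body), "characters")
```

Producing one file per chapter also gives the frontend a natural moment to update its live region, since each finished file is a discrete event to announce.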

Built for Accessibility First

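Not an afterthought. Every design decision starts with the question: can my father use this on his own?

That question drives six commitments: semantic HTML, WAI-ARIA, full keyboard navigation, voice commands, a minimalist interface, and Arabic and RTL support.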

Technical Stack

Entirely open-source, entirely free to run.

Frontend

Semantic HTML5: ARIA landmarks, live regions, focus management

CSS3: High-contrast mode, visible focus states, responsive

Vanilla JavaScript: Web Speech API, drag-and-drop, no heavy frameworks

Backend

Python + FastAPI: REST API for file upload, extraction, TTS conversion

PyMuPDF + EbookLib: Text extraction from PDF and EPUB formats

Tesseract OCR: Optical character recognition for physical book photos
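As a rough idea of what the upload side of the REST API could look like, here is a minimal FastAPI sketch. The route path, the size limit, the allowed extensions, and the response shape are all assumptions rather than the project's actual API. FastAPI file uploads also require the `python-multipart` package to be installed.

```python
from pathlib import Path

# Assumed limits for illustration only.
ALLOWED_SUFFIXES = {".epub", ".pdf", ".txt", ".png", ".jpg", ".jpeg"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # 50 MiB


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return the normalized extension, or raise ValueError if rejected."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {suffix or 'none'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("File is larger than the upload limit")
    return suffix


def create_app():
    """Build the app lazily so the validation logic has no FastAPI dependency."""
    from fastapi import FastAPI, File, HTTPException, UploadFile

    app = FastAPI(title="Book-to-audio API (sketch)")

    @app.post("/api/upload")  # hypothetical route
    async def upload(file: UploadFile = File(...)):
        data = await file.read()
        try:
            suffix = validate_upload(file.filename or "", len(data))
        except ValueError as exc:
            raise HTTPException(status_code=400, detail=str(exc))
        # A real handler would pass `data` on to extraction and TTS here.
        return {"filename": file.filename, "format": suffix, "bytes": len(data)}

    return app
```

With the code saved as, say, `main.py` and `app = create_app()` added at module level, it can be served with `uvicorn main:app`. Keeping validation in a plain function makes the rejection rules easy to test without starting a server.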

Text-to-Speech

Edge TTS: Free, lifelike AI voices via Microsoft Edge, the primary engine

Google Cloud TTS: Free tier fallback, up to 4M characters/month

Coqui TTS: Offline, self-hosted model that runs free forever, no API needed

My father loves reading. He always has. Watching that ability shrink as his vision did — and knowing that the barrier wasn't the words themselves but just the format they came in — made me want to fix it.

This isn't a side project to me. It's a tool I'm building for a real person, with a real need, right now. That changes how I approach every design decision — because I know exactly who I'm designing for, and he'll tell me directly if I got it wrong.

Roadmap

Where the project is now, and where it's going.

Architecture & Tech Research

Evaluated the candidate TTS engines, chose Edge TTS as the default, and mapped the full processing pipeline.

Accessibility Design Spec

Defined the ARIA structure, keyboard navigation map, and screen reader announcement strategy.

Backend API — In Progress

Building the Python + FastAPI server: file upload endpoint, EPUB/PDF extraction, Edge TTS integration.

Accessible Frontend

Semantic HTML interface with full keyboard navigation, ARIA live regions, and voice command support.

OCR for Physical Books

Tesseract integration so users can photograph printed pages and convert them to audio.

Arabic Voice & RTL Support

Full Arabic TTS voices and right-to-left text handling — the original motivation for the project.
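A first step toward right-to-left handling could be detecting the dominant text direction, which can then drive both the page's `dir` attribute and the choice of voice. The sketch below uses only the standard library. The function names and the voice table are hypothetical, the voice names are example Microsoft neural voices rather than project settings, and treating all right-to-left text as Arabic is a simplification (Hebrew is also right-to-left).

```python
import unicodedata


def dominant_direction(text: str) -> str:
    """Return "rtl" if right-to-left letters outnumber left-to-right ones."""
    rtl = ltr = 0
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):  # "AL" is the class of Arabic letters
            rtl += 1
        elif bidi == "L":
            ltr += 1
    return "rtl" if rtl > ltr else "ltr"


# Example voices only; a real deployment would let the user choose.
DEFAULT_VOICES = {
    "rtl": "ar-EG-SalmaNeural",
    "ltr": "en-US-AriaNeural",
}


def pick_voice(text: str) -> str:
    """Choose a default voice from the text's dominant direction."""
    return DEFAULT_VOICES[dominant_direction(text)]


if __name__ == "__main__":
    for sample in ("Hello world", "مرحبا بالعالم"):
        print(dominant_direction(sample), pick_voice(sample))
```

Counting letters rather than checking only the first character keeps a mostly Arabic page from being misread because of a Latin title or file name at the top.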

Public Launch & Community Feedback

Open beta with visually impaired users, and an accessibility audit with NVDA / JAWS / VoiceOver testers.

Try It Now — Free for Everyone

No account. No sign-up. No server. Upload any EPUB, PDF, or TXT and listen instantly in your browser. Or reach out to collaborate.
