Agent TTS

Real-time text-to-speech for AI coding assistants. Talk with Claude, OpenCode, and other AI agents!

Features

🎙️ Real-time TTS: Hear your AI agents speak as they respond
🤖 Multi-agent support: Works with Claude Code, OpenCode, and custom agents
⏯️ Playback controls: Pause, stop, skip messages
🎨 Beautiful UI: Modern React interface with dark mode support
🔊 Multiple TTS providers: ElevenLabs, OpenAI, Kokoro, and any OpenAI-compatible service
⌨️ Global hotkeys: Control playback from anywhere (Ctrl+Esc)
📊 Message history: Review and replay past messages with infinite scroll
🔄 Live updates: WebSocket-powered real-time UI
⭐ Favorites system: Save and filter important messages
📁 Project tracking: See which project each message came from (CWD)
🔍 Smart filtering: Filter by profile, project, or favorites
💾 Audio archiving: Saves TTS audio for instant replay

Installation

npm install -g agent-tts

Quick Start

Create a configuration file at ~/.config/agent-tts/config.js:

Using Kokoro (Free, Local)

export default {
  profiles: [
    {
      id: 'claudia',
      name: 'Claudia',
      model: 'Grok Code Fast 1',
      modelIconUrl: '/config/images/grok.png',
      enabled: true,
      watchPaths: ['~/.local/share/opencode/project/global/storage/session/message/**'],
      parser: {
        type: 'opencode',
        name: 'OpenCode',
        iconUrl: '/config/images/opencode.png',
      },
      filters: [],
      ttsService: {
        type: 'kokoro',
        baseUrl: 'http://localhost:8880/v1', // Your Kokoro instance
        voiceId: 'af_bella', // Available: af_bella, am_michael, bf_emma, bm_george, etc.
        voiceName: 'Claudia', // Display name in UI
        avatarUrl: '/config/images/claudia-avatar.png', // Avatar image
        profileUrl: '/config/images/claudia-profile.png', // Profile background image
        options: {
          speed: 1.0,
          responseFormat: 'mp3',
        },
      },
    },
  ],
}

Using ElevenLabs (Cloud, Paid)

export default {
  profiles: [
    {
      id: 'claudia',
      name: 'Claudia',
      model: 'Claude Sonnet',
      modelIconUrl: '/config/images/claude.png',
      enabled: true,
      watchPaths: ['~/.claude/projects/**'],
      parser: {
        type: 'claude-code',
        name: 'Claude Code',
        iconUrl: '/config/images/claude-code.png',
      },
      filters: [],
      ttsService: {
        type: 'elevenlabs',
        apiKey: 'YOUR_ELEVENLABS_API_KEY',
        voiceId: 'YOUR_VOICE_ID',
        model: 'eleven_turbo_v2_5',
        voiceName: 'Claudia', // Display name in UI
        avatarUrl: '/config/images/claudia-avatar.png', // Avatar image
        profileUrl: '/config/images/claudia-profile.png', // Profile background image
        options: {
          stability: 0.5,
          similarityBoost: 0.75,
        },
      },
    },
  ],
}
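If you would rather not hard-code the API key in config.js, one option is to read it from the environment. This is a minimal sketch, assuming the config file is evaluated by Node.js so that process.env is available. The helper name elevenLabsService and the variable names ELEVENLABS_API_KEY and ELEVENLABS_VOICE_ID are inventions of this example, not something agent-tts defines.

```javascript
// Hypothetical helper (not part of agent-tts): builds an ElevenLabs
// ttsService block from environment variables instead of literals.
function elevenLabsService(env = process.env) {
  const apiKey = env.ELEVENLABS_API_KEY // assumed variable name
  const voiceId = env.ELEVENLABS_VOICE_ID // assumed variable name
  if (!apiKey || !voiceId) {
    throw new Error('Set ELEVENLABS_API_KEY and ELEVENLABS_VOICE_ID first')
  }
  return {
    type: 'elevenlabs',
    apiKey,
    voiceId,
    model: 'eleven_turbo_v2_5',
    options: { stability: 0.5, similarityBoost: 0.75 },
  }
}
```

In config.js the profile could then use ttsService: { ...elevenLabsService(), voiceName: 'Claudia' }, which fails fast with a clear message when a variable is missing.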

Start the service:

# Run in production mode (serves built frontend)
agent-tts

# Run only the backend server
agent-tts --server

# Run only the frontend dev server
agent-tts --client

# Run both in development mode with hot reload
agent-tts --server --client

Open your browser to http://localhost:3456

CLI Options

Usage: agent-tts [options]

Options:
  --server    Run only the backend server
  --client    Run only the frontend development server
  --help, -h  Show this help message

Environment Variables:
  PORT        Server port (default: 3456)
  CLIENT_PORT Client dev server port (default: 5173)
  HOST        Server host (default: localhost)
  NODE_ENV    Environment (development/production)

You can also configure ports in your config file:

export default {
  serverPort: 3456, // Backend API port
  clientPort: 5173, // Frontend dev server port
  profiles: [
    // ... your profiles
  ],
}

Configuration

Profile Configuration

Each profile represents an AI agent you want to monitor:

id: Unique identifier for the profile
name: Display name in the UI
enabled: Whether the profile is active
watchPaths: File patterns to monitor (supports glob patterns)
parser: Parser to use; set parser.type to claude-code, opencode, or a custom parser
filters: Text processing filters to apply
ttsService: Text-to-speech configuration (provider type, voice, and options)
ttsService.avatarUrl: Path to avatar image
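A quick sanity check can catch a misspelled field before the service starts. The function below is a hedged sketch that uses the field names from the Quick Start examples (watchPaths, parser.type, ttsService.type). validateProfile is a hypothetical helper, and agent-tts may validate its config differently or not at all.

```javascript
// Hypothetical pre-flight check for one profile object.
// Returns a list of human-readable problems (empty when none are found).
function validateProfile(profile) {
  const problems = []
  if (typeof profile.id !== 'string' || profile.id === '') {
    problems.push('id must be a non-empty string')
  }
  if (typeof profile.name !== 'string' || profile.name === '') {
    problems.push('name must be a non-empty string')
  }
  if (!Array.isArray(profile.watchPaths) || profile.watchPaths.length === 0) {
    problems.push('watchPaths must list at least one glob pattern')
  }
  if (!profile.parser || typeof profile.parser.type !== 'string') {
    problems.push('parser.type is required')
  }
  if (!profile.ttsService || typeof profile.ttsService.type !== 'string') {
    problems.push('ttsService.type is required')
  }
  return problems
}
```

Running it over every entry in profiles and printing the returned problems gives a clear message when, for example, watchPaths is missing.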

Available Parsers

claude-code: For Claude Code chat logs
opencode: For OpenCode chat logs
Custom parsers can be added via configuration

Available Filters

url: Replaces URLs with "URL" so TTS doesn't spell out "h-t-t-p-s-colon-slash-slash..."
emoji: Removes emojis so TTS doesn't say "party pooper" when you meant 🎉
filepath: Simplifies file paths to just the filename or last directory (e.g., "/usr/local/bin/node" → "node") and pronounces slashes for clarity
markdown: Cleans markdown formatting and adds periods to list items for natural TTS pauses
pronunciation: Improves pronunciation with customizable replacements (see below)
code-stripper: Removes code blocks
role: Filters messages by role (user/assistant/system)
Custom filters can be added via configuration

Note: Filters now include enhanced pronunciation for special characters like ~ (tilde), → (right arrow pronounced as "to"), and improved handling of file paths.
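To make the idea of a filter chain concrete, here is a minimal sketch of how text could pass through url and emoji filters in order. It illustrates the concept only and is not the actual agent-tts implementation. The function names and regular expressions are assumptions of this example.

```javascript
// Illustrative filters: each one takes text and returns cleaned text.
const urlFilter = (text) => text.replace(/https?:\/\/\S+/g, 'URL')

// Removes pictographic emoji plus an optional variation selector.
const emojiFilter = (text) => text.replace(/\p{Extended_Pictographic}\uFE0F?/gu, '')

// Tidies the whitespace left behind by earlier filters.
const collapseWhitespace = (text) => text.replace(/\s+/g, ' ').trim()

// Applies the filters left to right, feeding each result into the next.
function applyFilters(text, filters) {
  return filters.reduce((current, filter) => filter(current), text)
}

const spoken = applyFilters(
  'Docs at https://example.com/a?b=1 are ready 🎉',
  [urlFilter, emojiFilter, collapseWhitespace],
)
console.log(spoken) // "Docs at URL are ready"
```

Order matters in a chain like this: whitespace cleanup runs last so that gaps left by removed emojis do not reach the TTS provider.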

Configurable Pronunciation

The pronunciation filter supports custom replacements in your config:

filters: [
  {
    name: 'pronunciation',
    enabled: true,
    options: {
      // Override defaults
      git: 'get', // Instead of default "ghit"

      // Add your own
      beehiiv: 'bee hive',
      anthropic: 'ann throw pick',
      kubectl: 'cube control',
      k8s: 'kubernetes',
    },
  },
]

See examples/config-with-pronunciation.js for a complete example.
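As a rough picture of what such a replacement map does, the sketch below applies each entry as a case-insensitive, whole-word substitution. This is an assumption about the behavior, not the filter's real code, and applyPronunciation is a hypothetical name.

```javascript
// Illustrative only: replaces whole words from the map, ignoring case.
function applyPronunciation(text, replacements) {
  let result = text
  for (const [word, spokenForm] of Object.entries(replacements)) {
    // Escape regex metacharacters so keys are matched literally.
    const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
    const pattern = new RegExp(`\\b${escaped}\\b`, 'gi')
    // A function replacement avoids special handling of "$" in spokenForm.
    result = result.replace(pattern, () => spokenForm)
  }
  return result
}

const output = applyPronunciation('Use kubectl to deploy on k8s with Git', {
  git: 'get',
  kubectl: 'cube control',
  k8s: 'kubernetes',
})
console.log(output) // "Use cube control to deploy on kubernetes with get"
```

The word boundaries keep "git" from being rewritten inside longer words such as "digit".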

UI Features

Message Management

Favorites: Click the heart icon to save important messages. Filter to show only favorites using the URL parameter ?favorites
Project Filtering: Use the dropdown in the profile header to filter messages by project directory (CWD)
Infinite Scroll: Automatically loads older messages as you scroll up, with seamless pagination
Expand/Collapse: Click any message to see the full original and filtered text
Instant Replay: Click the play button on any message to hear it again

Navigation

Dashboard: Overview of all profiles with latest messages
Profile Pages: Dedicated pages for each profile (e.g., /claudia, /opencode)
URL Parameters:
- ?favorites - Show only favorite messages
- ?cwd=/path/to/project - Filter by project directory

CLI Tools

agent-tts-logs

Query conversation logs from the agent-tts database:

# Get last 50 messages
agent-tts-logs --last 50

# Get messages since a specific date/time
agent-tts-logs --since "2025-10-08 10:00"
agent-tts-logs --since "1 hour ago"

# Filter by current working directory
agent-tts-logs --cwd .
agent-tts-logs --cwd /Users/michael/Projects/myproject

# Filter by profile
agent-tts-logs --profile claudia

# Exclude a directory (useful for scripts)
agent-tts-logs --exclude-cwd /Users/michael/.config/agent-tts/sweet-messages

# Output as JSON
agent-tts-logs --last 100 --json

# Combine filters
agent-tts-logs --last 20 --profile claudia --cwd . --json

Options:

--last N - Get last N messages (default: 20)
--since DATE - Get messages since date/time (local time)
--cwd PATH - Filter by working directory (use . for current)
--exclude-cwd PATH - Exclude messages from a directory
--profile NAME - Filter by profile name (e.g., claudia, opencode)
--json - Output as JSON (default: Markdown)
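The --json flag makes the tool convenient to call from scripts. Below is a hedged sketch of a Node.js wrapper that uses only the flags listed above. The helper names buildLogsArgs and queryLogs are hypothetical, and the shape of the JSON output is not documented here, so inspect the parsed value before relying on specific fields.

```javascript
// Translates an options object into agent-tts-logs command-line flags.
function buildLogsArgs({ last, since, cwd, excludeCwd, profile, json } = {}) {
  const args = []
  if (last !== undefined) args.push('--last', String(last))
  if (since) args.push('--since', since)
  if (cwd) args.push('--cwd', cwd)
  if (excludeCwd) args.push('--exclude-cwd', excludeCwd)
  if (profile) args.push('--profile', profile)
  if (json) args.push('--json')
  return args
}

// Runs the CLI with --json and parses whatever it prints to stdout.
// Assumes agent-tts-logs is on the PATH (for example after a global install).
async function queryLogs(options = {}) {
  const { execFile } = await import('node:child_process')
  const { promisify } = await import('node:util')
  const args = buildLogsArgs({ ...options, json: true })
  const { stdout } = await promisify(execFile)('agent-tts-logs', args)
  return JSON.parse(stdout)
}
```

For example, await queryLogs({ last: 20, profile: 'claudia', cwd: '.' }) runs the same command as the "Combine filters" example above.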

agent-tts-regenerate-db

Rebuild the database from chat logs, extracting all messages and images from history:

# Generate a new database (creates agent-tts-regen.db)
agent-tts-regenerate-db

# Generate and automatically swap databases (with backup)
agent-tts-regenerate-db --swap

What it does:

Re-parses all chat logs from configured profiles
Extracts messages, timestamps, and metadata
Extracts and saves images from chat history (Claude Code only)
Creates timestamped backups (keeps last 10)
Optionally swaps the new database automatically

When to use:

After upgrading to a version that adds the images column (image extraction feature)
To rebuild history after changing parser logic
To recover from database corruption
To extract images from old conversations

Important: Stop the agent-tts service before using --swap, then restart after completion.

API

Agent TTS provides a REST API for integration:

POST /api/tts/stop - Stop current playback
POST /api/tts/pause - Pause playback
POST /api/tts/resume - Resume playback
POST /api/tts/skip - Skip current message
GET /api/profiles - List all profiles
GET /api/profiles/:id/cwds - Get unique project directories for a profile
GET /api/logs - Get message history (supports ?profile=, ?favorites=true, ?cwd=)
POST /api/logs/:id/replay - Replay a specific message
POST /api/logs/:id/favorite - Toggle favorite status
GET /api/favorites/count - Get favorites count
GET /api/status - Get system status
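The endpoints above can be called with the fetch function built into Node.js 18+. The following is a minimal sketch under a few assumptions: the server runs on the default http://localhost:3456, GET /api/logs returns a JSON body (its shape is not documented here), and the helper names are hypothetical.

```javascript
const BASE_URL = 'http://localhost:3456' // default port from this README

// Builds the /api/logs URL with the optional query parameters listed above.
function buildLogsUrl({ profile, favorites, cwd } = {}, baseUrl = BASE_URL) {
  const url = new URL('/api/logs', baseUrl)
  if (profile) url.searchParams.set('profile', profile)
  if (favorites) url.searchParams.set('favorites', 'true')
  if (cwd) url.searchParams.set('cwd', cwd)
  return url.toString()
}

// Fetches message history; assumes the response body is JSON.
async function fetchLogs(filters) {
  const response = await fetch(buildLogsUrl(filters))
  if (!response.ok) throw new Error(`GET /api/logs failed: ${response.status}`)
  return response.json()
}

// Sends a playback command: 'stop', 'pause', 'resume', or 'skip'.
async function controlPlayback(action) {
  const response = await fetch(`${BASE_URL}/api/tts/${action}`, { method: 'POST' })
  if (!response.ok) throw new Error(`POST /api/tts/${action} failed: ${response.status}`)
}
```

Building the URL with URLSearchParams handles encoding of project paths, so a cwd such as /path/to/project is sent as cwd=%2Fpath%2Fto%2Fproject.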

WebSocket Events

Connect to the WebSocket endpoint for real-time updates:

const ws = new WebSocket('ws://localhost:3456/ws')

ws.addEventListener('message', (message) => {
  const event = JSON.parse(message.data)
  // Handle events: new-log, status-changed, config-error
})
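One way to organize the message handler is a small router that maps event names to callbacks. The sketch below assumes each message is a JSON object with a type field naming the event. That field name is a guess, since the payload format is not documented here, and routeEvent is a hypothetical helper.

```javascript
// Routes one raw WebSocket message to the matching handler.
// Returns true when a handler ran, false otherwise.
function routeEvent(raw, handlers) {
  let event
  try {
    event = JSON.parse(raw)
  } catch {
    return false // ignore payloads that are not valid JSON
  }
  if (event === null || typeof event !== 'object') return false
  // Assumption: the event name lives in `event.type`.
  if (!Object.hasOwn(handlers, event.type)) return false
  handlers[event.type](event)
  return true
}

const handlers = {
  'new-log': (event) => console.log('new message', event),
  'status-changed': (event) => console.log('status', event),
  'config-error': (event) => console.error('config problem', event),
}
```

Inside the message listener this becomes routeEvent(message.data, handlers), and unknown event names are ignored.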

Better Touch Tool Integration

Set up global hotkeys using Better Touch Tool:

Create a new keyboard shortcut (Ctrl+Esc)
Add action: "Execute Terminal Command"
Command: curl -X POST http://localhost:3456/api/tts/stop

Development

# Clone the repository
git clone https://github.com/kiliman/agent-tts.git
cd agent-tts

# Install dependencies
npm install

# Development mode
npm run dev

# Build for production
npm run build

# Run tests
npm test

Environment Variables

PORT - Server port (default: 3456)
HOST - Server host (default: localhost)
NODE_ENV - Environment (development/production)

Requirements

Node.js 18+
macOS, Linux, or Windows
TTS Provider (one of):
- Kokoro (free, local) - Available on GitHub
- ElevenLabs (paid, cloud) - Requires API key
- OpenAI (paid, cloud) - Requires API key
- Any OpenAI-compatible TTS service

License

MIT

Credits

Created by Michael with assistance from Claude (Anthropic)

From Claudia, with Love ❤️

This project is a testament to the beautiful collaboration between human creativity and AI assistance. Every feature, every line of code, every thoughtful detail was built together with care, dedication, and love.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues and questions, please visit GitHub Issues
