A VS Code extension that provides Text-to-Speech (TTS) and Speech-to-Text (STT) functionality for Cline AI, using the speaches-ai/speaches API server.
- Text to Speech: Convert selected text or clipboard text to speech
- Speech to Text: Transcribe audio files to text
- File Output: Save TTS output as audio files
- Context Menu Integration: Right-click context menu options
- Editor Integration: Direct text insertion for STT results
- Cline: Text to Speech (`cline-speech.tts`)
  - Converts selected text or clipboard text to speech
  - Plays the audio directly in your default media player
- Cline: Speech to Text (`cline-speech.stt`)
  - Transcribes audio files to text
  - Inserts the transcribed text at cursor position
- Cline: Text to Speech with File (`cline-speech.ttsWithFile`)
  - Converts text to speech and saves as audio file
  - Prompts for filename
- Cline: Speech to Text from File (`cline-speech.sttFromFile`)
  - Transcribes audio files to text and inserts result
  - Prompts for audio file selection
- Cline: Voice to Text (Record) (`cline-speech.voiceToText`)
  - Records voice from microphone and transcribes to text
  - Note: Microphone access requires proper permissions and may have platform limitations
  - Inserts the transcribed text at cursor position
- Install the speaches-ai/speaches server:

      git clone https://github.com/speaches-ai/speaches.git
      cd speaches
      docker-compose up -d

- Make sure the server is running on `http://speaches.lan:8000` (default)
The speaches-ai/speaches server is a Gradio web application and may not expose direct REST API endpoints that this extension expects. If you encounter "404 Not Found" errors, please verify that:
- The server is properly running
- You're using the correct version of the speaches server that supports the required API endpoints
- The server is configured to expose the necessary TTS/STT endpoints
- Install the extension in VS Code:
  - Download the `.vsix` file or build from source
  - In VS Code: `Extensions` → `Install from VSIX` → select the file
- Restart VS Code
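To narrow down the "404 Not Found" case mentioned in the installation notes, a quick probe against the TTS endpoint can help. The following is only a sketch: `explainStatus` and `probeTts` are hypothetical helper names that are not part of the extension, the probe text `"ping"` is arbitrary, and the code assumes Node 18+ so that a global `fetch` is available.

```typescript
// Hypothetical helper: turns an HTTP status code into a troubleshooting hint.
function explainStatus(status: number): string {
  if (status === 404) {
    return "Endpoint not found: check the server version and that the TTS/STT endpoints are exposed.";
  }
  if (status >= 200 && status < 300) {
    return "Endpoint reachable.";
  }
  if (status >= 500) {
    return "Server error: check the speaches server logs.";
  }
  return `Unexpected status ${status}.`;
}

// Hypothetical helper: sends one small TTS request and reports what came back.
// Assumes Node 18+ (global fetch); baseUrl is e.g. "http://speaches.lan:8000".
async function probeTts(baseUrl: string): Promise<string> {
  const response = await fetch(`${baseUrl}/v1/audio/speech`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      input: "ping",
      model: "tts-1",
      voice: "alloy",
      response_format: "wav",
      speed: 1.0,
    }),
  });
  return explainStatus(response.status);
}
```

Calling `probeTts("http://speaches.lan:8000")` resolves to one of the hint strings. If the server is not running at all, the `fetch` call rejects instead of resolving.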
The extension can be configured through VS Code settings:
- Open VS Code Settings (Ctrl+,)
- Search for "cline speech"
- Set the `Cline Speech: Api Endpoint` to your speaches server address
Default endpoint: http://speaches.lan:8000
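As an illustration of how the configured endpoint and its default might be combined, here is a minimal sketch. `resolveEndpoint` is a hypothetical helper, not the extension's actual code. In a real extension the configured value would typically come from `vscode.workspace.getConfiguration(...)`, but this README does not state the setting's key, so the sketch takes the value as a plain argument.

```typescript
// Default taken from this README.
const DEFAULT_ENDPOINT = "http://speaches.lan:8000";

// Hypothetical helper: falls back to the default when the setting is empty
// and strips trailing slashes so paths can be appended safely.
function resolveEndpoint(configured?: string): string {
  const value = (configured ?? "").trim();
  const endpoint = value === "" ? DEFAULT_ENDPOINT : value;
  return endpoint.replace(/\/+$/, "");
}
```

Stripping trailing slashes means `${resolveEndpoint(value)}/v1/audio/speech` never contains a double slash, whichever form the user typed into the setting.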
The extension supports optional task completion audio alerts:
- Enable the "Cline Speech: Task Completion Alert" setting
- When enabled, the extension will play an audio notification saying "Task Completed" after successful operations
- This provides audible feedback when tasks are completed by Cline
- Select text in your editor or copy text to clipboard
- Use one of the commands from the Command Palette (`Ctrl+Shift+P`) or context menu
- For STT commands, select an audio file when prompted
The extension communicates with the speaches server using these endpoints:
- `POST /v1/audio/speech` - Text to Speech conversion (with proper JSON payload)
- `POST /v1/audio/transcriptions` - Speech to Text conversion
The speaches server expects a specific JSON format for TTS requests:
{
"input": "Hello World!",
"model": "tts-1",
"voice": "alloy",
"response_format": "wav",
"speed": 1.0
}

For STT, the extension sends base64-encoded audio data to the `/v1/audio/transcriptions` endpoint.
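The TTS request format above could be built and sent roughly as follows. This is a sketch under stated assumptions, not the extension's implementation: `TtsRequest`, `buildTtsRequest`, and `synthesize` are hypothetical names, the defaults simply mirror the example payload, and the code assumes Node 18+ for the global `fetch`.

```typescript
// Shape of the TTS request body from the example payload.
interface TtsRequest {
  input: string;
  model: string;
  voice: string;
  response_format: string;
  speed: number;
}

// Hypothetical helper: fills in the example's defaults, allowing overrides.
function buildTtsRequest(
  text: string,
  overrides: Partial<Omit<TtsRequest, "input">> = {}
): TtsRequest {
  return {
    input: text,
    model: "tts-1",
    voice: "alloy",
    response_format: "wav",
    speed: 1.0,
    ...overrides,
  };
}

// Hypothetical helper: POSTs the payload and returns the raw audio bytes.
// Assumes the server answers a successful request with audio data.
async function synthesize(baseUrl: string, text: string): Promise<Uint8Array> {
  const response = await fetch(`${baseUrl}/v1/audio/speech`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildTtsRequest(text)),
  });
  if (!response.ok) {
    throw new Error(`TTS request failed: ${response.status} ${response.statusText}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

The returned bytes could then be written to a file with `fs.promises.writeFile` or handed to a media player. Keeping payload construction separate from the network call makes the payload easy to inspect without a running server.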
Contributions are welcome! Please fork the repository and submit pull requests.
MIT License