Whisper
- Developer: OpenAI
- URL: Whisper
- Description: Whisper is a general-purpose speech recognition model from OpenAI, trained with large-scale weak supervision. It provides high-quality automatic speech recognition (ASR) transcription across many languages and domains.
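As a rough illustration of how Whisper is typically called from Python, the sketch below loads a model, transcribes a file, and prints timestamped lines. It assumes the `openai-whisper` package and `ffmpeg` are installed. The helper names `format_segments` and `transcribe_file` are hypothetical and not part of Whisper. The model name `"base"` is only one of the available sizes.

```python
from typing import Iterable, List, Mapping


def format_segments(segments: Iterable[Mapping]) -> List[str]:
    """Render Whisper-style segments as '[start -> end] text' lines.

    Each segment is expected to carry 'start', 'end' (seconds) and 'text',
    which is how Whisper reports segments in its transcription result.
    """
    lines = []
    for seg in segments:
        text = seg["text"].strip()
        lines.append(f"[{seg['start']:.2f} -> {seg['end']:.2f}] {text}")
    return lines


def transcribe_file(path: str, model_name: str = "base") -> List[str]:
    """Transcribe one audio file and return timestamped lines (hypothetical helper)."""
    # Imported lazily so format_segments stays usable without the package.
    import whisper

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(path)         # dict with 'text', 'segments', 'language'
    return format_segments(result["segments"])
```

Calling `transcribe_file("audio.mp3")` would return one line per recognized segment. Larger models generally trade speed for accuracy, so the right `model_name` depends on the hardware available.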
NeMo by NVIDIA
- Developer: NVIDIA
- URL: NeMo
- Description: NeMo is a toolkit for creating AI models, including text-to-speech (TTS), automatic speech recognition (ASR), and natural language processing (NLP). It offers pre-trained models and supports multi-GPU training.
Coqui TTS
- Developer: Coqui AI
- URL: Coqui TTS
- Description: An open-source, deep learning toolkit for text-to-speech synthesis, supporting various TTS models and vocoders for creating natural-sounding speech.
Piper
- Developer: Rhasspy
- URL: Piper
- Description: A fast, local neural text-to-speech system designed to be lightweight and efficient for edge devices.
Lobe Chat
- Developer: LobeHub
- URL: Lobe Chat
- Description: Lobe Chat is an open-source, high-performance AI chat framework that supports TTS and STT, enabling the deployment of private ChatGPT-like applications.
Willow Inference Server
- Developer: Tovera Inc.
- URL: Willow Inference Server
- Description: An optimized language inference server supporting ASR/STT, TTS, and LLM across various protocols, focused on privacy and local deployment.
Eden AI
- Developer: Eden AI
- URL: Eden AI
- Description: Provides access to multiple Text-to-Speech APIs through a single platform, integrating services like AWS, Google Cloud, and IBM Watson for voice generation.
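The idea of reaching several TTS providers through one entry point can be sketched in a few lines. This is not Eden AI's API: the function `synthesize_with_fallback` and the two stand-in providers are hypothetical, and a real provider would make a network call and return encoded audio rather than echoing the text.

```python
from typing import Callable, List, Tuple

# Hypothetical provider signature: takes text, returns audio bytes.
Synthesizer = Callable[[str], bytes]


def synthesize_with_fallback(
    text: str, providers: List[Tuple[str, Synthesizer]]
) -> Tuple[str, bytes]:
    """Try each provider in order and return (provider_name, audio) from the first success."""
    errors = []
    for name, synth in providers:
        try:
            return name, synth(text)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def failing_provider(text: str) -> bytes:
    """Stand-in for a provider that is currently unreachable."""
    raise ConnectionError("service unavailable")


def fake_provider(text: str) -> bytes:
    """Stand-in that returns the UTF-8 bytes of the text instead of real audio."""
    return text.encode("utf-8")


name, audio = synthesize_with_fallback(
    "hello", [("primary", failing_provider), ("backup", fake_provider)]
)
print(name, len(audio))
```

Keeping every provider behind the same callable signature means the calling code does not change when a provider is added, removed, or reordered.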
ElevenLabs
- Developer: ElevenLabs
- URL: ElevenLabs
- Description: Offers text-to-speech and voice cloning technology for generating realistic voiceovers in multiple languages.