Whisper

  • Developer: OpenAI
  • URL: Whisper
  • Description: Whisper is a general-purpose speech recognition model from OpenAI, trained with large-scale weak supervision. It is designed to transcribe speech accurately across many languages and audio domains for automatic speech recognition (ASR).
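
As a rough illustration of how Whisper is typically driven from Python, the sketch below assumes the `openai-whisper` package and `ffmpeg` are installed. `whisper.load_model` and `model.transcribe` come from that package; the helper names `transcribe` and `format_segments`, the `"base"` model choice, and the timestamp layout are illustrative assumptions, not part of Whisper itself.

```python
from __future__ import annotations

from typing import Any


def transcribe(path: str, model_name: str = "base") -> dict[str, Any]:
    """Transcribe an audio file (hypothetical helper around openai-whisper)."""
    # Imported lazily so the rest of this module works without the package.
    # Assumes `pip install openai-whisper` and ffmpeg on the PATH.
    import whisper

    model = whisper.load_model(model_name)
    # The result is a dict with "text", "segments", and "language" keys.
    return model.transcribe(path)


def format_segments(result: dict[str, Any]) -> list[str]:
    """Render the result's "segments" list as timestamped lines."""
    return [
        f"[{seg['start']:7.2f} -> {seg['end']:7.2f}] {seg['text'].strip()}"
        for seg in result.get("segments", [])
    ]
```

Calling `transcribe` with the path to a local audio file and passing the result to `format_segments` would give one line per segment; larger model names trade speed for accuracy.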

NeMo by NVIDIA

  • Developer: NVIDIA
  • URL: NeMo
  • Description: NeMo is a toolkit for building and training AI models for text-to-speech (TTS), automatic speech recognition (ASR), and natural language processing (NLP). It offers pre-trained models and supports multi-GPU training.

Coqui TTS

  • Developer: Coqui AI
  • URL: Coqui TTS
  • Description: An open-source deep learning toolkit for text-to-speech synthesis. It supports a range of TTS models and vocoders for generating natural-sounding speech.

Piper

  • Developer: Rhasspy
  • URL: Piper
  • Description: A fast, local neural text-to-speech system designed to be lightweight and efficient for edge devices.

Lobe Chat

  • Developer: LobeHub
  • URL: Lobe Chat
  • Description: Lobe Chat is an open-source, high-performance AI chat framework that supports TTS and speech-to-text (STT) and can be self-hosted to deploy private ChatGPT-like applications.

Willow Inference Server

  • Developer: ToveraInc
  • URL: Willow Inference Server
  • Description: An optimized language inference server supporting ASR/STT, TTS, and large language models (LLMs) over WebRTC, REST, and WebSocket, with a focus on privacy and local, self-hosted deployment.

Eden AI

  • Developer: Eden AI
  • URL: Eden AI
  • Description: Provides access to multiple text-to-speech APIs through a single platform, integrating providers such as AWS, Google Cloud, and IBM Watson for voice generation.

ElevenLabs

  • Developer: ElevenLabs
  • URL: ElevenLabs
  • Description: Offers text-to-speech and voice cloning services for generating realistic voiceovers in multiple languages.