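The field of AI continues to evolve rapidly, with a number of notable large language models (LLMs) and related AI projects drawing attention in February 2024. Not every entry was first released that month; FLAN-T5 and YaLM 100B, for example, date from 2022. Here’s a look at the most notable projects: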

OLMo

  • Developer: AI2
  • URL: OLMo at Allen AI
  • Description: The first truly open-source LLM by AI2, including training code, data, and toolkits, licensed under Apache 2.0.

GPT-SoVITS

  • URL: GPT-SoVITS on GitHub
  • Description: A few-shot voice conversion and text-to-speech project that targets voice cloning from small amounts of reference audio.

Surya

  • URL: Surya on GitHub
  • Description: A multilingual OCR toolkit enhancing document digitization with support for a wide range of languages.

CroissantLLM

  • URL: CroissantLLM on Arxiv
  • Description: A bilingual French-English LLM facilitating translation and content creation for dual-language applications.

BlackMamba

  • URL: BlackMamba on Arxiv
  • Description: Combines mixture-of-experts (MoE) layers with Mamba state-space models to improve scalability and efficiency.

Jua

  • URL: Jua at ML News
  • Description: A foundational AI model development project aiming to bridge the gap between AI capabilities and practical solutions.

DeepSeek-Coder

  • URL: DeepSeek-Coder
  • Description: A family of code generation LLMs by DeepSeek, trained on a mix of source code and natural-language text to support developer productivity.

Code Llama 70B

  • Developer: Meta
  • URL: Code Llama 70B at Meta AI
  • Description: An advanced model for AI-assisted code generation, supporting various programming languages.

Qwen-7B

  • URL: Qwen-7B on Hugging Face
  • Description: A 7-billion-parameter model in Alibaba’s Tongyi Qianwen (Qwen) series, aimed at helping businesses adopt AI.

FLAN-T5

  • URL: FLAN-T5 on Arxiv
  • Description: An instruction-tuned T5 model; its compact 780M-parameter Large variant has been reported to perform well at summarizing meetings.

YaLM 100B

  • Developer: Yandex
  • URL: YaLM 100B on GitHub
  • Description: A 100-billion-parameter, GPT-like LLM by Yandex for generating and processing text.

LeoLM

  • Developer: LAION and Hessian.AI
  • URL: LeoLM Blog Post
  • Description: A “German Foundation Language Model” based on Meta’s Llama 2, optimized for German and English.

IGEL

  • URL: IGEL on Hugging Face
  • Description: An instruction-tuned German LLM built on a pre-trained BLOOM model adapted to German and fine-tuned on German instruction datasets.

Guanaco-65B

  • URL: Guanaco-65B on Hugging Face
  • Description: A fine-tuned chatbot model based on LLaMA, utilizing 4-bit QLoRA tuning for efficient memory usage.
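To make the "efficient memory usage" point concrete, here is a rough back-of-the-envelope sketch of how much memory the weights of a 65B-parameter model need at 16-bit versus 4-bit precision. The helper name `weight_memory_gb` and the round 65e9 parameter count are illustrative assumptions, not part of Guanaco or QLoRA, and the estimate deliberately ignores quantization constants, LoRA adapter weights, activations, and optimizer state.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: a nominal 65 billion parameters; the exact count differs slightly.
params_65b = 65e9

fp16_gb = weight_memory_gb(params_65b, 16)  # 16-bit weights
int4_gb = weight_memory_gb(params_65b, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f" 4-bit weights: {int4_gb:.1f} GB")
```

Under these simplifying assumptions, the 4-bit weights take about a quarter of the memory of the 16-bit weights, which is the main reason 4-bit tuning makes a model of this size more practical to fine-tune.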

Emu2-Chat

  • Developer: BAAI (Beijing Academy of Artificial Intelligence)
  • URL: Search for Emu2-Chat on Hugging Face
  • Description: A cutting-edge multimodal chatbot based on the Emu2 model, capable of generating natural responses from text and visual inputs.

Nous-Hermes 2 Vision Alpha

Fuyu-8B

  • Developer: Adept
  • URL: Fuyu-8B
  • Description: Notable for its efficiency and speed, excelling in image understanding and supporting a wide range of image-related tasks.

IDEFICS 9B and 80B

  • Developer: Hugging Face
  • URL: IDEFICS in the Transformers documentation
  • Description: Leverages publicly accessible data for multimodal understanding and language generation.

Lynx

  • Developer: ByteDance
  • URL: Lynx
  • Description: A local multimodal language model with over 20 variants, designed for sophisticated comprehension and generation tasks.

Cheetor

  • Developer: DCDM LLM
  • URL: Cheetor
  • Description: Processes complex vision-language instructions, excelling in reasoning across intricate scenarios.

LLaVA 13B & 7B

  • Developer: LLaVA-VL
  • URL: LLaVA
  • Description: Vision-language assistants in 13B and 7B sizes, aimed at visual reasoning tasks in both research and applications.

This roundup of the latest LLM releases highlights the diverse and innovative approaches developers are taking to advance AI technology, offering a wide range of capabilities and specializations.