EzAudio: Open Source Hyperrealistic Text-to-Audio Model

19 September 2024

EzAudio is a new transformer-based text-to-audio (T2A) diffusion model developed by researchers from Tencent AI Lab and Johns Hopkins University. EzAudio addresses key challenges in T2A generation, including generation quality, computational…

xLAM and xGen-Sales: Salesforce’s Open Source AI Models for Sales Automation

9 September 2024

Salesforce has taken a significant leap in AI development with the release of its xLAM family, introducing Large Action Models (LAMs) to enable more efficient and autonomous workflows. Unlike Large…

Mini-Omni: Open-Source Model for Real-Time Speech Interaction

2 September 2024

Current academic language models still rely on external Text-to-Speech (TTS) systems, causing undesirable latency in speech synthesis. To address this, the Mini-Omni model introduces an audio-based, end-to-end conversational capability that…

LongWriter: Open-Source Framework for Generating Texts Beyond 10,000 Words

19 August 2024

LongWriter is a framework and a set of large language models (LLMs) designed specifically to enable ultra-long text generation, often exceeding 10,000 words while maintaining coherence, quality, and relevance. It…

Mistral Large 2: Leading the Way in Open Source AI Code Generation

25 July 2024

Mistral AI has announced Mistral Large 2, the latest iteration of its flagship model, setting a new state of the art (SOTA) in open-source code generation models. This new model…

LLaMA 3.1 Models Released: Open Source and Comparable to GPT-4

24 July 2024

LLaMA 3.1 models have been officially released, including the massive 405 billion-parameter LLaMA 3.1 405B model. Expanded context length to 128K, support for eight languages, and the introduction of LLaMA…

MindsDB: Transforming Databases with Open Source AI

12 July 2024

MindsDB is transforming how AI integrates with databases by embedding machine learning models directly within them. This approach allows users to leverage AI capabilities without altering their existing database infrastructure.…

Unique3D Model Generates 3D Mesh from a Single Image in 30 Seconds

27 June 2024

Unique3D is a state-of-the-art model for 3D mesh generation from single images, noted for its efficiency and high fidelity. Unique3D's code and weights are available as open source. This approach…

Zyda: 1.3T Dataset for Open Language Modeling

12 June 2024

Zyda is a 1.3 trillion-token open-source dataset designed for open language modeling. Zyda integrates a range of high-quality open datasets, including RefinedWeb, Starcoder, C4, and Pile, enhancing them through comprehensive filtering…

Hugging Face and Pollen Robotics Introduce Reachy2 – an Open-Source Robot for Household Tasks

10 June 2024

Hugging Face and Pollen Robotics unveiled the anthropomorphic robot Reachy2, whose training dataset and model are open-source. Reachy2 performs household tasks and interacts safely with people and pets. Pollen Robotics…

Qwen2: A Leap Forward in Open Source Language Models

7 June 2024

Qwen2 has arrived as the successor to Qwen1.5, marking a significant step for open-source language models. Qwen2 outperforms Llama 3…

Microsoft ViSNet: Predicting Molecule Activity

3 March 2024

Microsoft has unveiled ViSNet – a graph neural network modeling the geometry of complex molecules to predict their activity. ViSNet has the potential to significantly expedite the search for and…

Apple MGIE: Multimodal Models for Image Editing

12 February 2024

Apple, in collaboration with the University of California, has developed the open-source MGIE model for image editing based on text input. This model tackles various editing tasks, including Photoshop-style image…

DeepMind Trains AlphaGeometry Model to Solve Olympiad Geometry Problems

21 January 2024

DeepMind has unveiled AlphaGeometry – a model capable of solving geometric problems at the level of International Mathematical Olympiad winners. AlphaGeometry solved 25 out of 30 Olympiad problems, while on…

Stable Video Diffusion: Stability AI’s Image-Based Video Generator

26 November 2023

Stability AI has announced the release of Stable Video Diffusion, a duo of models that generate up to 4-second videos from an input image. Both models are available publicly. Importantly,…

LCM-LoRA: Real-Time Image Generation Neural Network

19 November 2023

Researchers at Tsinghua University have developed LCM-LoRA, an algorithm for real-time image generation from text descriptions or sketches. Popular text-to-image…

Microsoft AutoGen: A Framework for Configuring LLM Agents

8 October 2023

Microsoft has unveiled AutoGen, an open-source library designed for creating and configuring LLM agents. These agents are individual sessions of large language models that can collaborate for collective problem-solving. LLM…

“Compact Giant” Mistral 7B Outperforms Llama 2 13B and Llama 34B

1 October 2023

The Mistral AI team has unveiled Mistral 7B, an open-source language model with 7.3 billion parameters, surpassing the significantly larger Llama 2 13B model in…

FLM-101B: Training 101 Billion Parameter Language Model with a $100K Budget

24 September 2023

Researchers from Beijing University present FLM-101B, an open-source large language model (LLM) with 101 billion parameters trained from scratch with a budget of only $100K. Training LLMs at large scales…

Würstchen: An Open-Source Text-to-Image Model Consuming 16 Times Less GPU than Stable Diffusion 1.4

14 September 2023

Würstchen is an open text-to-image model that generates images faster than diffusion models like Stable Diffusion and consumes significantly less memory, while achieving comparable results. The approach is based on a…

Persimmon-8B: An Open Model with a 16k Token Context, Running on a Single GPU

11 September 2023

Researchers from Adept have introduced the open-source language model Persimmon-8B with a 16k token context, which is four times larger than the most compact Llama 2 and text-davinci-002 used in…