Mistral Large 2: Leading the Way in Open Source AI Code Generation

25 July 2024
Performance accuracy on code generation benchmarks (all models were benchmarked through the same evaluation pipeline)

Mistral AI has announced Mistral Large 2, the latest iteration of its flagship model, setting a new state of the art (SOTA) in open-source code generation models. This new model…

LLaMA 3.1 Models Released: Open Source and Comparable to GPT-4

24 July 2024

The LLaMA 3.1 models have been officially released, including the massive 405-billion-parameter LLaMA 3.1 405B model. Expanded context length to 128K, support for eight languages, and the introduction of LLaMA…

Unique3D Model Generates 3D Mesh from a Single Image in 30 Seconds

27 June 2024

Unique3D is a state-of-the-art model for 3D mesh generation from single images, noted for its efficiency and high fidelity. Unique3D's code and weights are available as open source. This approach…

DenseAV Algorithm Learns Language from Videos

23 June 2024

The algorithm DenseAV, developed at MIT, learns to understand the meaning of words and sentences by watching videos of people conversing. DenseAV outperformed other algorithms in tasks involving identifying objects…

Claude 3.5 Sonnet: State-of-the-Art LLM by Anthropic Overtakes GPT-4o in Major Benchmarks

21 June 2024

Anthropic has introduced the new large language model Claude 3.5 Sonnet. It is now available on the Claude.ai chatbot, Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Claude 3.5…

Qwen2: A Leap Forward in Open Source Language Models

7 June 2024

The long-anticipated transition from Qwen1.5 to Qwen2 has arrived, marking a significant milestone for open-source language models. Qwen2 outperforms Llama 3…

Google Veo: A Model for Video Generation and Editing

19 May 2024

Google DeepMind has introduced the generative model Veo, capable of creating videos exceeding 60 seconds in Full HD resolution. Besides textual queries, the model can take input from images and…

Microsoft VASA-1: Platform for Animating Static Images

21 April 2024

Microsoft has introduced the VASA-1 platform, which transforms a person’s image and audio recording with speech into a video with synchronized lip and head movements. The algorithm operates in real-time…

Apple MGIE: Multimodal Models for Image Editing

12 February 2024

Apple, in collaboration with the University of California, has developed the open-source MGIE model for image editing based on text input. This model tackles various editing tasks, including Photoshop-style image…

DeepMind Trains AlphaGeometry Model to Solve Olympiad Geometry Problems

21 January 2024

DeepMind has unveiled AlphaGeometry – a model capable of solving geometric problems at the level of International Mathematical Olympiad winners. AlphaGeometry solved 25 out of 30 Olympiad problems, while on…

Microsoft DragNUWA: Video Generation via Object Trajectories

15 January 2024

Microsoft has released the DragNUWA weights – a cross-domain video generation model that offers more precise control over the resulting output compared to similar models. Control is achieved by simultaneously…

VideoPoet: Google’s Language Model for Video Generation and Editing

23 December 2023

Google has unveiled VideoPoet, a language model for multimodal video content processing capable of turning text and images into clips, styling pre-existing videos, and generating soundtracks for them without any…

Google Introduces Gemini, a Cutting-Edge Language Model Set

7 December 2023

Google has announced the creation of Gemini, a set of three language models surpassing competitors in 30 out of 32 benchmarks. The top-tier model, Gemini Ultra, is available through an…

DeepMind GNoME Discovered 2 Million New Materials

3 December 2023

DeepMind has developed GNoME, a graph neural network that predicts material stability. GNoME has identified 2.2 million new materials, with 380 thousand deemed stable for application in developing computer chips, batteries,…

Stable Video Diffusion: Stability AI’s Image-Based Video Generator

26 November 2023

Stability AI has announced the release of Stable Video Diffusion, a duo of models that generate up to 4-second videos from an input image. Both models are available publicly. Importantly,…

LCM-LoRA: Real-Time Image Generation Neural Network

19 November 2023

Researchers at Tsinghua University have developed the LCM-LoRA algorithm for real-time image generation from text descriptions or sketches. Popular text-to-image…

OpenAI DevDay 2023: GPTs, GPT-4 Turbo, and Other Updates from OpenAI

12 November 2023

OpenAI introduced over ten products and features for developers at DevDay 2023. Here’s a rundown of the new models and API updates: The GPT-4 Turbo model, trained on data up…

“Compact Giant” Mistral 7B Outperforms Llama 2 13B and Llama 1 34B

1 October 2023

The Mistral AI team has unveiled Mistral 7B, an open-source language model with 7.3 billion parameters, surpassing the significantly larger Llama 2 13B model in…

FLM-101B: Training 101 Billion Parameter Language Model with a $100K Budget

24 September 2023

Researchers from Beijing University present FLM-101B, an open-source large language model (LLM) with 101 billion parameters trained from scratch with a budget of only $100K. Training LLMs at large scales…

Würstchen: An Open-Source Text-to-Image Model Consuming 16 Times Less GPU than Stable Diffusion 1.4

14 September 2023

Würstchen is an open text-to-image model that generates images faster than diffusion models like Stable Diffusion while consuming significantly less memory, achieving comparable results. The approach is based on a…

Persimmon-8B: An Open Model with a 16k Token Context, Running on a Single GPU

11 September 2023

Researchers from Adept have introduced the open-source language model Persimmon-8B with a 16k token context, which is four times larger than the most compact Llama 2 and text-davinci-002 used in…