Mistral Large 2: Leading the Way in Open Source AI Code Generation

25 July 2024
Performance accuracy on code generation benchmarks (all models were benchmarked through the same evaluation pipeline)

Mistral AI has announced Mistral Large 2, the latest iteration of its flagship model, setting a new state of the art (SOTA) in open-source code generation models. This new model…

Claude 3.5 Sonnet: State-of-the-Art LLM by Anthropic Overtakes GPT-4o in Major Benchmarks

21 June 2024

Anthropic has introduced its new large language model, Claude 3.5 Sonnet. It is now available on Claude.ai, the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Claude 3.5…

Zyda: 1.3T Dataset for Open Language Modeling

12 June 2024

Zyda is a 1.3 trillion-token open-source dataset designed for open language modeling. Zyda integrates a range of high-quality open datasets, including RefinedWeb, StarCoder, C4, and the Pile, enhancing them through comprehensive filtering…

NVIDIA DrEureka Model Accelerates Robot Training Faster Than Humans

12 May 2024

NVIDIA has demonstrated that large language models can expedite robot training. Quadruped robots trained using the DrEureka model outperform standard learning systems by 34% in real-world movement speed…

Google RecurrentGemma: Next-Gen Local Language Model

14 April 2024

Google has introduced the RecurrentGemma language model, designed to operate locally on devices with limited resources such as smartphones, personal computers, and smart speakers. The new architecture from Google significantly…

Gretel: The Largest Open Text-to-SQL Dataset

7 April 2024

Gretel, a startup specializing in generating high-quality synthetic data, has announced the creation of the largest open text-to-SQL dataset aimed at accelerating the development of no-code analytics tools. The dataset…

Microsoft ViSNet: Predicting Molecule Activity

3 March 2024

Microsoft has unveiled ViSNet – a graph neural network modeling the geometry of complex molecules to predict their activity. ViSNet has the potential to significantly expedite the search for and…

DeepMind Trains AlphaGeometry Model to Solve Olympiad Geometry Problems

21 January 2024

DeepMind has unveiled AlphaGeometry – a model capable of solving geometric problems at the level of International Mathematical Olympiad winners. AlphaGeometry solved 25 out of 30 Olympiad problems, while on…

FractalGPT Launches Question-Answer Agent for Handling Uploaded Documents

14 December 2023

Developers at FractalGPT have rolled out FractalGPT, a QA agent for working with documents that lets users hold dialogues over uploaded PDF, TXT, and DOCX files. Key Features…

OpenAI DevDay 2023: GPTs, GPT-4 Turbo, and Other Updates from OpenAI

12 November 2023

OpenAI introduced over ten products and features for developers at DevDay 2023. Here’s a rundown of the new models and API updates: The GPT-4 Turbo model, trained on data up…

Microsoft LeMa: Boosting Language Model Accuracy in Math

4 November 2023

Microsoft researchers have introduced LeMa (Learning from Mistakes), an open-source algorithm designed to enhance the ability of large language models to solve mathematical problems. LeMa encourages models to learn from…

“Compact Giant” Mistral 7B Outperforms Llama 2 13B and Llama 34B

1 October 2023

The Mistral AI team has unveiled Mistral 7B, an open-source language model with 7.3 billion parameters that surpasses the significantly larger Llama 2 13B model in…

MIT Releases Free Lecture Course on TinyML & Efficient DL Computing on YouTube

29 September 2023

In recent years, large language and diffusion models have showcased impressive results. However, their demands on computational resources and memory consumption pose significant challenges for researchers and developers. The TinyML…

Google Apps Integration Elevates Bard Chatbot’s Capabilities

19 September 2023

Google has rolled out an update for the Bard chatbot, introducing seamless integration with various Google apps, such as Gmail, Docs, Sheets, Maps, and YouTube. This integration propels Bard ahead…

Persimmon-8B: An Open Model with a 16k Token Context, Running on a Single GPU

11 September 2023

Researchers from Adept have introduced the open-source language model Persimmon-8B with a 16k-token context, four times the context length of the most compact Llama 2 and of text-davinci-002 used in…

Arthur Bench: Framework for Evaluating Language Models

20 August 2023

American startup Arthur has released an open-source framework called Bench for evaluating and comparing the performance of large language models. This tool enables users to select the most suitable language…

ReLoRA: Method for Enhancing Performance in Training Large Language Models

16 August 2023

ReLoRA is a technique for training large transformer-based language models using low-rank matrices, aimed at boosting training efficiency. The effectiveness of this method increases with the scale of the models.…

Wix AI: Building Websites Using Chatbot Technology

23 July 2023

Wix, the website creation service, has announced the launch of its chatbot, Wix AI, which allows users to create and modify websites using natural language queries. Additionally, the tool will…

Llama 2 and Llama-2-Chat: A New Generation of Open Source Language Models

19 July 2023

The new generation of Llama models comprises three large language models, namely Llama 2 with 7, 13, and 70 billion parameters, along with the fine-tuned conversational models Llama-2-Chat 7B, 34B,…

Google Bard Update: Image Processing and New Language Support

16 July 2023

Google Bard has been updated and is now available in 46 languages and more than 200 countries, including Brazil and countries across Europe. The latest features include image processing, dialog…

“Deepdub Go” Empowers Content Creators with AI for Video Dubbing

9 July 2023

Israeli startup Deepdub has unveiled Deepdub Go, a service that uses AI to automatically dub videos into 65 languages. The platform targets game development studios, advertising…