Mistral Agents API: AI Agent Framework with Web Search, Code Generation, and Image Generation Capabilities

28 May 2025

French startup Mistral AI introduced the Agents API, a framework for creating autonomous AI agents with built-in connectors, persistent memory, and orchestration capabilities. Developers can create unlimited agents and build…

Visual-ARFT: Multimodal AI Agents Outperform GPT-4o by 18.6% in Complex Visual Tasks

22 May 2025
Visual-ARFT training process diagram

A research team from Shanghai Jiao Tong University and Shanghai Artificial Intelligence Laboratory has introduced Visual Agentic Reinforcement Fine-Tuning (Visual-ARFT), a new approach to training large multimodal models with…

NVIDIA Isaac 5.0: Enhanced Sensor Physics and Expanded Synthetic Data Generation

19 May 2025
NVIDIA Isaac robotics platform showing a humanoid robot interacting with objects

NVIDIA continues to push the boundaries of AI-driven robotics with significant updates to its Isaac ecosystem, announced at COMPUTEX 2025. These innovations address key challenges in robotics development by enhancing…

ZEROSEARCH: A Framework That Cuts LLM Search Training Costs by 88%

9 May 2025

Alibaba’s NLP research team has officially open-sourced ZEROSEARCH, a complete framework for training LLMs to search without using real search engines. ZEROSEARCH builds on a key insight: LLMs have already…

Phi-4-reasoning: Microsoft’s Breakthrough in AI Thinking

4 May 2025

Microsoft recently unveiled Phi-4-reasoning, a 14-billion-parameter model that achieves exceptional performance on complex reasoning tasks, outperforming models 5 to 47 times larger while requiring significantly fewer computational resources, with developers able…

DeepMath-103K: Advancing AI Reasoning Through Challenge

21 April 2025

Mathematical reasoning stands as a crucial benchmark for artificial intelligence systems, requiring logical deduction, symbolic manipulation, and multi-step problem-solving. Recent breakthroughs in AI reasoning have been significantly driven by reinforcement…

MedSAM2: Open Source SOTA 3D Medical Image and Video Segmentation Model

13 April 2025

Medical image segmentation plays a critical role in precision medicine, enabling more accurate diagnosis, treatment planning, and quantitative analysis. While significant progress has been made in developing both specialized and…

Claude for Education: Revolutionizing Higher Education with AI-Powered Learning

3 April 2025

Anthropic has released Claude for Education, specifically designed for implementation in universities and other higher education institutions. While the classic chatbot provides direct answers to questions, Claude for Education uses…

Llama Nemotron: NVIDIA Launches Family of Open Reasoning AI Models Overtaking DeepSeek R1

19 March 2025

NVIDIA has announced the open Llama Nemotron family of models with reasoning capabilities, designed to provide a business-ready foundation for creating advanced AI agents. These models can work independently or…

Chain-of-Experts: Novel Approach Improving MoE Efficiency with up to 42% Memory Reduction

11 March 2025

Chain-of-Experts (CoE) is a novel approach that fundamentally changes how sparse language models process information, delivering better performance with significantly less memory. This breakthrough addresses key limitations in current Mixture-of-Experts (MoE)…

R1-Onevision: Open Source 7B-Parameter Model Outperforming GPT-4o in Maths and Reasoning Tasks

27 February 2025

Researchers from Zhejiang University have released R1-Onevision, a 7B-parameter multimodal reasoning model that processes and analyzes visual inputs with unprecedented logical precision, capable of understanding complex mathematical, scientific, and…

Step-Video-T2V: Text-to-Video Open-Source Model Achieves 16x Video Compression Breakthrough

20 February 2025

Researchers from Stepfun AI have developed Step-Video-T2V, a 30-billion-parameter text-to-video model that generates videos up to 204 frames long at 544×992 resolution and understands both Chinese and English prompts.…

Adobe Drops Industry-First Commercially Safe AI Video Model

13 February 2025

Adobe has launched its Firefly video model in public, marking the first generative AI tool built specifically for commercial use, addressing enterprise concerns about IP rights and legal safety in…

EPFL Study: Language Models Don’t Translate Into English – They Operate Through Concepts

30 January 2025

New research from EPFL sheds light on the internal mechanisms of multilingual data processing in LLMs, which is critical for understanding how modern language models work and how to optimize…

ByteDance Unveils TA-TiTok Tokenizer Achieving SOTA in Text-to-Image Generation Using Only Public Data

19 January 2025

Researchers from ByteDance and POSTECH introduced TA-TiTok (Text-Aware Transformer-based 1-Dimensional Tokenizer), a novel approach to making text-to-image AI models more accessible and efficient. Their work demonstrates through MaskGen models how…

MiniMax-01: 4M Context Length Benchmark Leader Powered by Lightning Attention

15 January 2025

MiniMax has open-sourced its latest MiniMax-01 series, introducing two models that push the boundaries of context length and attention mechanisms: MiniMax-Text-01 for language processing and MiniMax-VL-01 for visual-language tasks. The…

Nvidia Drops Monster AI Update at CES 2025: Local Foundation Models Meet New RTX 50 Series

7 January 2025

Nvidia just announced a major shift in consumer AI computing at CES 2025, combining new GPUs with a platform for running foundation models locally. The announcement includes next-gen RTX 50…

ArtAug: Open Source Framework for Image Generation Models Enhancement

18 December 2024
Enhancing Text-to-Image Generation

East China Normal University and Alibaba Group researchers have introduced ArtAug, a framework that enhances text-to-image generation through synthesis-understanding interaction. This approach significantly improves image quality without requiring extensive manual…

Sora Turbo: OpenAI’s Enhanced Video Generation Model Goes Public

10 December 2024

OpenAI has announced the public release of Sora Turbo, a significantly enhanced version of their hyperrealistic text-to-video generation model. Announced during the company’s “12 Days of OpenAI” holiday series by…

Building an AI-Powered Game: A Deep Dive into DeepLearning.AI’s Latest Free Course Making AI Game Development Accessible

2 December 2024

DeepLearning.AI’s newly released course, “Building an AI-Powered Game,” represents a significant step forward in making AI game development accessible to developers and enthusiasts alike. This comprehensive analysis explores how the…

X-MeshGraphNet: NVIDIA’s Scalable Solution for Physics Simulation Using Graph Neural Networks

27 November 2024
Illustration of the partitioning scheme with Halo on a Koenigsegg car.

NVIDIA researchers have introduced X-MeshGraphNet, a novel extension of MeshGraphNet that addresses key scalability and practical limitations in physics simulation. The open-source framework, now available through NVIDIA Modulus, enables efficient…