Computer Vision / Neural Networks and Deep Learning

AI Models Are 13% Worse Than Humans at Detecting Generated ASMR Videos

18 December 2025

Researchers from CUHK, NUS, University of Oxford, and Video Rebirth introduced Video Reality Test — the first benchmark that tests whether modern AI models can create videos indistinguishable from real…

3D-R1: Open Source Reasoning Model for 3D Scenes Outperforms State-of-the-Art Methods by 10% on 3D Benchmarks

6 August 2025

Researchers from Shanghai University of Engineering Science and Peking University presented 3D-R1 — a new foundation model that significantly improves reasoning capabilities in three-dimensional vision-language models (VLM). The model demonstrates an average performance…

NVIDIA Isaac 5.0: Enhanced Sensor Physics and Expanded Synthetic Data Generation

19 May 2025

NVIDIA continues to push the boundaries of AI-driven robotics with significant updates to its Isaac ecosystem, announced at COMPUTEX 2025. These innovations address key challenges in robotics development by enhancing…

X-MeshGraphNet: NVIDIA’s Scalable Solution for Physics Simulation Using Graph Neural Networks

27 November 2024

NVIDIA researchers have introduced X-MeshGraphNet, a novel extension of MeshGraphNet that addresses key scalability and practical limitations in physics simulation. The open-source framework, now available through NVIDIA Modulus, enables efficient…

Molmo: Open Source Multimodal Vision-Language Models Outperform Gemini 1.5 and Claude 3.5

26 September 2024

Molmo is a new series of multimodal vision-language models (VLMs) created by researchers at the Allen Institute for AI and the University of Washington. The Molmo family outperforms many state-of-the-art…

How AI Helped King Studio Develop 13,755 Levels for Candy Crush Saga

4 July 2024

King, the developer behind the popular mobile game Candy Crush Saga, actively integrates artificial intelligence into the game’s development and optimization process. In a recent interview, Sahar Asadi from AI…

NVIDIA DrEureka Model Accelerates Robot Training Faster Than Humans

12 May 2024

NVIDIA has demonstrated that large language models can expedite robot training. Robots with four limbs trained using the DrEureka model outperform standard learning systems by 34% in real-world movement speed…

VideoPoet: Google’s Language Model for Video Generation and Editing

23 December 2023

Google has unveiled VideoPoet, a language model for multimodal video processing that can turn text and images into clips, restyle existing videos, and generate soundtracks for them without any…

LCM-LoRA: Real-Time Image Generation Neural Network

19 November 2023

Researchers at Tsinghua University have developed LCM-LoRA, an algorithm for real-time image generation from text descriptions or sketches. Popular text-to-image…

NVIDIA Eureka: Agent for Autonomous Robot Learning

22 October 2023

NVIDIA has unveiled Eureka, an open-source agent based on GPT-4, designed to teach robots complex skills such as performing tricks and handling scissors. This breakthrough leverages the power of large…

Google Introduces Image Generation in Search

15 October 2023

Google has announced that search results can now include images generated from text descriptions, along with several other AI features. The tool is built on the Imagen model and allows…

MIT Researchers Utilize Neural Network for Remote Diagnosis of Neurological Disorders

17 September 2023

Scientists at MIT have developed a neural network that analyzes video recordings of patients with motor or neurological disorders and assesses their clinical condition in real time. This tool operates…

Prithvi: NASA Model and Dataset for Ecological Phenomena Analysis

6 August 2023

NASA and IBM have unveiled the open-source model Prithvi, designed to empower scientists in tracking the consequences of climate change, monitoring deforestation, predicting agricultural crop yields, and analyzing greenhouse gas…

PIGINet: Generating Optimal Sequence of Robot Actions

30 July 2023

MIT researchers have introduced PIGINet, a neural network designed to teach robots how to navigate through various tasks. PIGINet evaluates potential action sequences based on task descriptions, scene images, and…

PACGen: Personalized and Controllable Text-to-Image Generation

7 July 2023

Researchers from the University of Wisconsin-Madison have introduced a text-to-image diffusion model called PACGen (Personalized and Controllable Text-to-Image Generation) for transferring objects from one image to a new scene generated…

Google Try-on: Try Clothes Virtually with Realistic Models

18 June 2023

Google has introduced Try-on, a diffusion model that allows users of the “Shopping” service to try on clothes on models with different body types and skin tones. With just one…

Stability AI Publishes PickScore Dataset and Evaluation Function for Generative Models

6 June 2023

Stability AI, in collaboration with Tel Aviv University, has released the PickScore dataset, a collection of over 500,000 images accompanied by user ratings. They have also introduced the PickScore evaluation…

John Deere Unveils the World’s First Self-Driving Tractor for Precision Farming

7 January 2022

John Deere, the world’s largest manufacturer of agricultural machinery, has started mass production of the first self-driving tractor. It can independently perform simple tasks and transmit data to the control…

Artificial Intelligence System Predicts Patient Deterioration in the Emergency Room

14 May 2021

A group of researchers has developed a neural-network approach that automatically predicts deterioration in patients’ condition. The system was trained on chest X-rays and non-visual clinical…

15 April 2020

Deep Neural Network Learns to “See” Through Obstructions

A group of researchers led by Yu-Lun Liu has proposed a deep learning-based method that removes unwanted obstructions such as raindrops, fences or window reflections from short videos. Very often,…