Videos / Neural Networks and Deep Learning

Ditto: Open Framework for Text-Instruction-Based Video Style and Object Editing with 99% Frame Consistency

24 October 2025

Researchers from HKUST, Ant Group, Zhejiang University, and Northeastern University introduced Ditto — a comprehensive open framework addressing the training data scarcity problem in text-instruction-based video editing. The developers created…

What’s Hiding in Your Video Metadata, and How Fast Can You Find It with FFprobe for Smarter Media Analysis

- Spotlight

Video files carry more than just images and sound. Hidden inside are details like codec type, resolution, frame rate, duration, and even tags that reveal how the file was created.…

Hailuo AI Expands Video Creation Capabilities with Image-to-Video Feature

9 October 2024

MiniMax’s Hailuo AI has launched its new Image-to-Video feature, empowering creators to transform static images into dynamic video content. This update enhances the platform, which initially supported only text-to-video generation…

VFusion3D: Advanced 3D Generative Modeling via Scalable Video Diffusion

10 August 2024

VFusion3D introduces a novel method in 3D generative modeling by utilizing video diffusion models to overcome the scarcity of 3D data. The model generates meshes in .obj format, which can…

DenseAV Algorithm Learns Language from Videos

23 June 2024

The algorithm DenseAV, developed at MIT, learns to understand the meaning of words and sentences by watching videos of people conversing. DenseAV outperformed other algorithms in tasks involving identifying objects…

Dream Machine by Luma: New AI Video Generation Platform

14 June 2024

Dream Machine by Luma AI is an advanced AI model designed to create high-quality, realistic, and fantastical videos from text instructions and images. Built on a scalable, efficient, and multimodal…

Google Veo: A Model for Video Generation and Editing

19 May 2024

Google DeepMind has introduced the generative model Veo, capable of creating videos exceeding 60 seconds in Full HD resolution. Besides textual queries, the model can take input from images and…

Sora: OpenAI’s Groundbreaking Text-to-Video Diffusion Model

18 February 2024

OpenAI has unveiled Sora, a diffusion-based text-to-video model capable of generating 60-second videos. Compared to competitors like Runway, Pika, Stability AI, and Google, OpenAI’s model boasts high-resolution (Full HD) output,…

Microsoft DragNUWA: Video Generation via Object Trajectories

15 January 2024

Microsoft has released the weights of DragNUWA, a cross-domain video generation model that offers more precise control over the resulting output compared to similar models. Control is achieved by simultaneously…

Pika 1.0: A Web Platform for Video Generation

7 January 2024

The startup Pika Labs has launched Pika 1.0 – a free web platform for generating and editing videos using text-based queries. The service creates both realistic videos and 3D animation in…

VideoPoet: Google’s Language Model for Video Generation and Editing

23 December 2023

Google has unveiled VideoPoet, a language model for multimodal video content processing capable of turning text and images into clips, styling pre-existing videos, and generating soundtracks for them without any…

Stable Video Diffusion: Stability AI’s Image-Based Video Generator

26 November 2023

Stability AI has announced the release of Stable Video Diffusion, a duo of models that generate up to 4-second videos from an input image. Both models are available publicly. Importantly,…

“Deepdub Go” Empowers Content Creators with AI for Video Dubbing

9 July 2023

Israeli startup Deepdub has unveiled its groundbreaking service, Deepdub Go, which uses AI to automatically dub videos in 65 languages. This innovative platform targets game development studios, advertising…

MAGVIT: Open Source Generative Video Transformer 10-in-1

29 June 2023

Researchers from Carnegie Mellon University, Google Research, and the University of Georgia have introduced MAGVIT (Masked Generative Video Transformer), an open-source video generation model. MAGVIT is a unified model that…

New Datasets for Object Tracking

8 November 2018

Object tracking in the wild is far from being solved. Existing object trackers do quite a good job on the established datasets (e.g., VOT, OTB), but these datasets are relatively…

3D Hair Reconstruction Out of In-the-Wild Videos

22 October 2018

3D hair reconstruction is a problem with numerous applications in different areas such as Virtual Reality, Augmented Reality, video games, medical software, etc. As a non-trivial problem, researchers have proposed…

Vid2Vid – Conditional GANs for Video-to-Video Synthesis

3 September 2018

Researchers from NVIDIA and MIT’s Computer Science and Artificial Intelligence Lab have proposed a novel method for video-to-video synthesis, showing impressive results. The proposed method – Vid2Vid – can synthesize…

Everybody Dance Now: a New Approach to “Do As I Do” Motion Transfer

30 August 2018

Not very good at dancing? Not a problem anymore! Now you can easily impress your friends with a stunning video, where you dance like a superstar. Researchers from UC Berkeley…

ReCoNet: Fast and Accurate Real-time Video Style Transfer

25 July 2018

A group of researchers from the University of Hong Kong has proposed ReCoNet, a real-time coherent video style transfer network, as a state-of-the-art approach to video style…