Dissecting GANs for Better Understanding and Visualization

5 December 2018

GANs can be taught to create (or generate) worlds similar to our own in any domain: images, music, speech, etc. Since 2014, a large number of improvements to GANs have been proposed, and GANs have achieved impressive results. Researchers from the MIT-IBM Watson AI Lab have presented GAN Paint, an interactive tool based on GAN Dissection – a method for validating whether an explicit representation of an object is present in a feature map of a hidden layer:

The GAN Paint interactive tool

State-of-the-art Idea

However, a concern raised very often in ML is the lack of understanding of the methods being developed and applied. Despite the success of GANs, their visualization and understanding remain little-explored areas of research.

A group of researchers led by David Bau have done the first systematic study for understanding the internal representations of GANs. In their paper, they present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level.

Their work resulted in a general method for visualizing and understanding GANs at different levels of abstraction, several practical applications enabled by their analytic framework, and open-source interpretation tools for better understanding Generative Adversarial Network models.

Inserting a door by setting 20 causal units to a fixed high value at one pixel in the representation.

Method

From what we have seen so far, especially in the image domain, Generative Adversarial Networks can generate highly realistic images across different domains. From this perspective, one might say that GANs have learned facts at a higher level of abstraction – objects, for example. However, there are also cases where GANs fail terribly and produce very unrealistic images. So, is there a way to explain at least these two cases? David Bau and his team tried to answer this question, among others, in their paper. They studied the internal representations of GANs and tried to understand how a GAN represents structures and relationships between objects (from the point of view of a human observer).

As the researchers mention in their paper, there has been previous work on visualizing and understanding deep neural networks but mostly for image classification tasks. Much less work has been done in visualization and understanding of generative models.

The main goal of the systematic analysis is to understand how objects such as trees are encoded by the internal representations of a GAN generator network. To do this, the researchers study the structure of a hidden representation given as a feature map. Their study is divided into two phases, which they call dissection and intervention.

Characterizing units by Dissection

The goal of the first phase, dissection, is to determine whether an explicit representation of an object is present in a feature map of a hidden layer, and to identify which classes from a dictionary of concepts have such an explicit representation.

To search for explicit representations of objects, they quantify the spatial agreement between a unit's thresholded feature map and a concept's segmentation mask using the intersection-over-union (IoU) measure. The result is called the agreement, and it allows individual units to be characterized: the concepts related to each unit can be ranked, and each unit can be labeled with the concept that matches it best.
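To make the computation concrete, here is a minimal NumPy sketch of that agreement score for a single unit and a single concept. It assumes the unit's activations have already been upsampled to the segmentation resolution, and it uses a fixed top-quantile threshold, which is a simplification of the thresholding rule used in the paper.

```python
import numpy as np

def unit_concept_iou(unit_acts, concept_masks, quantile=0.99):
    """unit_acts: (N, H, W) activations of one unit, upsampled to mask resolution.
    concept_masks: (N, H, W) binary segmentation masks for one concept."""
    # Threshold the unit at a fixed top quantile of its own activation distribution
    t = np.quantile(unit_acts, quantile)
    unit_on = unit_acts > t

    inter = np.logical_and(unit_on, concept_masks).sum()
    union = np.logical_or(unit_on, concept_masks).sum()
    return inter / max(union, 1)  # IoU ("agreement") between the unit and the concept
```

Ranking concepts by this score and keeping the best match per unit is what characterizes the units in the dissection phase.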

Dissection algorithm
Phase 1: Dissection.

Measuring causal relationships using Intervention

The second important question mentioned before is causality. Intervention, the second phase, seeks to estimate the causal effect of a set of units on a particular concept.

To measure this effect, the intervention phase measures the impact of forcing units on (unit insertion) and off (unit ablation), again using segmentation masks. More precisely, a set of units in the feature map is forced on and off, the two resulting images are segmented, and the two segmentation masks are compared to measure the causal effect.
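Below is an illustrative PyTorch-style sketch of a single intervention. The helper methods forward_to / forward_from and the segment function are assumptions used to split the generator around the chosen layer; they are not the authors' API.

```python
import torch

def causal_effect(generator, segment, z, units, layer, concept_id, on_value=10.0):
    """Ablate / insert a set of units at one layer and compare how much of the
    target concept the segmenter finds in the generated image."""
    units = torch.tensor(units)

    def run(value):
        with torch.no_grad():
            feats = generator.forward_to(layer, z)      # hidden feature map (1, C, H, W)
            feats.index_fill_(1, units, value)          # force the chosen channels to `value`
            img = generator.forward_from(layer, feats)  # finish the forward pass
            seg = segment(img)                          # per-pixel class predictions
        return (seg == concept_id).float().mean()       # fraction of pixels showing the concept

    ablated = run(0.0)          # units forced off
    inserted = run(on_value)    # units forced to a high value
    return inserted - ablated   # estimated causal effect of the unit set on the concept
```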

Intervention algorithm
Phase 2: Intervention.

Results

For the whole study, the researchers use three variants of Progressive GANs (Karras et al., 2018) trained on LSUN scene datasets. For the segmentation task, they use a recent image segmentation model (Xiao et al., 2018) trained on the ADE20K scene dataset.

An extensive analysis was done using the proposed framework for understanding and visualizing GANs. The first part, dissection, was used for analyzing and comparing units across datasets, layers, and models, and for locating artifact units.

Comparing representations learned by progressive GANs trained on different scene types. The units that emerge match objects that commonly appear in the scene type: seats in conference rooms and stoves in kitchens.
Removing successively larger sets of tree-causal units from a GAN.

The second part of the framework, intervention, was used together with a set of dominant object classes to locate causal units that can remove and insert objects in different images. The results are presented in the paper and the supplementary material, and a video demonstrating the interactive tool was released. Some of the results are shown in the figures below.

Visualizing the activations of individual units in two GANs.

Conclusion

This is one of the first extensive studies that targets the understanding and visualization of generative models. Focusing on the most popular generative model, the Generative Adversarial Network, this work reveals significant insights. One of the main findings is that a large part of GAN representations can be interpreted. The study shows that a GAN's internal representation encodes variables that have a causal effect on the generation of objects and realistic images.

Many researchers will potentially benefit from the insights that came out of this work, and the proposed framework provides a basis for analyzing, debugging, and understanding Generative Adversarial Network models.

Pix2Pix – Image-to-Image Translation Neural Network

27 November 2018

The pix2pix architecture was presented in 2016 by researchers from Berkeley in their work “Image-to-Image Translation with Conditional Adversarial Networks.” Many problems in image processing and computer vision can be posed as “translating” an input image into a corresponding output image. For example, a scene may be rendered as an RGB image, a gradient field, an edge map, a semantic label map, etc. In analogy to automatic language translation, automatic image-to-image translation can be defined as the task of translating one possible representation of a scene into another, given a large amount of training data. With the rise of deep learning, CNNs have become the common workhorse behind a wide variety of image prediction problems.

The paper: https://arxiv.org/pdf/1611.07004.pdf

Datasets

  • Cityscapes: a large-scale dataset containing diverse stereo video sequences recorded in street scenes from 50 different cities, with high-quality pixel-level annotations of 5,000 frames (an order of magnitude larger than similar previous attempts). 2,975 training images from the Cityscapes training set were used, trained for 200 epochs with random jitter and mirroring; the validation set was used for testing.
  • Maps ↔ aerial photographs: 1,096 training images scraped from Google Maps, trained for 200 epochs with batch size 1. Data were split into train and test sets (with a buffer region added to ensure that no training pixel appeared in the test set).
  • BW → color: 1.2 million training images (the ImageNet training set), trained for 6 epochs with batch size 4, with only mirroring and no random jitter. Tested on a subset of the ImageNet validation set.
  • Edges → shoes: 50k training images from the UT Zappos50K dataset, trained for 15 epochs with batch size 4. Data were split into train and test sets randomly.

Pix2pix architecture

Pix2pix uses a conditional generative adversarial network (cGAN) to learn a mapping from an input image to an output image. The network is made up of two main pieces: the Generator and the Discriminator. The Generator transforms the input image to produce the output image. The Discriminator looks at the input image together with an unknown image (either a target image from the dataset or an output image from the Generator) and tries to guess whether the unknown image was produced by the Generator.

Generator

The Generator has the job of taking an input image and performing the transform to produce the target image. The encoder-decoder architecture consists of:

encoder: C64-C128-C256-C512-C512-C512-C512-C512

decoder: CD512-CD512-CD512-C512-C256-C128-C64

An example input could be a black-and-white image, with the output being a colorized version of it. The structure of the generator is called an “encoder-decoder,” and in pix2pix the encoder-decoder looks more or less like this:

pix2pix decoder
A convolution is applied after the last layer in the decoder to map to the number of output channels (3 in general, except in colorization, where it is 2), followed by a Tanh function. Batch normalization is not applied to the first C64 layer in the encoder. All ReLUs in the encoder are leaky, with slope 0.2, while ReLUs in the decoder are not leaky. The U-Net architecture is identical except for skip connections between each layer i in the encoder and layer n−i in the decoder, where n is the total number of layers. The skip connections concatenate the activations from layer i onto layer n−i.
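For readers who prefer code, here is a minimal PyTorch sketch of the Ck / CDk building blocks referred to above (encoder blocks downsample by a factor of 2, decoder blocks upsample by 2). The full U-Net generator stacks these blocks and concatenates the skip connections, which is omitted here for brevity.

```python
import torch.nn as nn

def C(in_ch, out_ch, bn=True):
    """Ck encoder block: Convolution-BatchNorm-LeakyReLU with stride 2."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)]
    if bn:
        layers.append(nn.BatchNorm2d(out_ch))
    layers.append(nn.LeakyReLU(0.2))
    return nn.Sequential(*layers)

def CD(in_ch, out_ch):
    """CDk decoder block: transposed Convolution-BatchNorm-Dropout-ReLU with stride 2."""
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.Dropout(0.5),
        nn.ReLU(),
    )
```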

Discriminator

The Discriminator has the job of taking two images, an input image and an unknown image (either a target or an output image from the generator), and deciding whether the second image was produced by the generator or not. The 70×70 discriminator architecture is:

C64-C128-C256-C512

pix2pix neural network

A convolution is applied after the last layer to map to a 1-dimensional output, followed by a Sigmoid function. BatchNorm is not applied to the first C64 layer. All ReLUs are leaky, with slope 0.2. All other discriminators follow the same basic architecture, with depth varied to modify the receptive field size:

  • 1×1 discriminator: C64-C128 (note that in this special case, all convolutions are 1×1 spatial filters)
  • 16×16 discriminator: C64-C128
  • 286×286 discriminator: C64-C128-C256-C512-C512-C512

In the pix2pix implementation, each pixel from this 30×30 image corresponds to the believability of a 70×70 patch of the input image (the patches overlap a lot since the input images are 256×256). The architecture is called a “PatchGAN”.
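A sketch of such a 70×70 PatchGAN in PyTorch is shown below. It follows the C64-C128-C256-C512 description above; minor padding and stride details of the reference implementation may differ, but for a 256×256 input it produces the 30×30 patch map discussed in the text.

```python
import torch.nn as nn

def patchgan_70(in_channels=6):  # input: source and target/output images concatenated
    def block(i, o, stride=2, bn=True):
        layers = [nn.Conv2d(i, o, kernel_size=4, stride=stride, padding=1)]
        if bn:
            layers.append(nn.BatchNorm2d(o))
        layers.append(nn.LeakyReLU(0.2))
        return layers

    return nn.Sequential(
        *block(in_channels, 64, bn=False),          # C64 (no BatchNorm on the first layer)
        *block(64, 128),                            # C128
        *block(128, 256),                           # C256
        *block(256, 512, stride=1),                 # C512
        nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),  # 1-channel 30x30 patch map
        nn.Sigmoid(),
    )
```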

Training

For training the network, two steps have to be taken: training the discriminator and training the generator. All networks were trained from scratch. Weights were initialized from a Gaussian distribution with mean 0 and standard deviation of 0.02.

First, the generator produces an output image. The discriminator looks at the input/target pair and the input/output pair and produces its guess about how realistic they look. The discriminator's weights are then adjusted based on the classification error of the input/output pair and the input/target pair.

The generator’s weights are then adjusted based on the output of the discriminator as well as the difference between the output and the target image.

The thing to remember here is that when the generator is trained on the output of the discriminator, the gradient is computed through the discriminator, which means that as the discriminator improves, the generator is trained to beat an ever-stronger discriminator. If the discriminator is good at its job and the generator is capable of learning the correct mapping function through gradient descent, you should get generated outputs that could fool a human.
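The alternating update can be summarized in a short training-step sketch (an illustration, not the reference implementation); here G and D are the generator and discriminator described above, x is the input image, y the target image, and the L1 weight of 100 follows the paper's default.

```python
import torch
import torch.nn.functional as F

def train_step(G, D, x, y, opt_g, opt_d, l1_weight=100.0):
    # --- discriminator: real input/target pair vs. fake input/output pair ---
    fake = G(x)
    d_real = D(torch.cat([x, y], dim=1))
    d_fake = D(torch.cat([x, fake.detach()], dim=1))
    d_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # --- generator: fool the discriminator and stay close to the target (L1) ---
    d_fake = D(torch.cat([x, fake], dim=1))
    g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake)) + \
             l1_weight * F.l1_loss(fake, y)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```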

The objective function we want to optimize during training is that of a conditional GAN, which can be expressed as:
pix2pix GAN
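In LaTeX form, the conditional GAN objective combined with the L1 term from the paper reads:

\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x,z}[\log(1 - D(x, G(x, z)))]

G^{*} = \arg\min_{G}\max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G), \qquad \mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}[\lVert y - G(x, z) \rVert_{1}]

with λ = 100 in the reported experiments.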

Use Cases and Implementation

The nice thing about pix2pix is that it is not specific to a particular class of images, i.e., it is generic. It does not require defining any explicit relationship between the two types of images. Instead, it learns the objective during training by comparing the defined inputs and outputs and inferring the mapping. This makes pix2pix highly adaptable to a wide variety of situations, including ones where it is not easy to verbally or explicitly define the task we want to model.

[Tensorflow][Pytorch][Keras]

Result

The pix2pix results suggest that conditional adversarial networks are a promising approach for many image-to-image translation tasks, especially those involving highly structured graphical outputs. These networks learn a loss adapted to the task and data at hand, which makes them applicable to a wide variety of settings. The results achieved by pix2pix are state of the art.

pix2pix
Example results on Cityscapes labels → photo, compared to ground truth

Example results on automatically detected edges → shoes, compared to ground truth

AlphaGAN: Natural Image Matting

11 September 2018

Many image-editing and film post-production applications rely on natural image matting as one of the processing steps. The task of the matting algorithm is to estimate the opacity of a foreground object in an image or video sequence accurately. Researchers from Trinity College Dublin propose the AlphaGAN architecture for natural image matting.

In mathematical terms, every pixel i in the image is assumed to be a linear combination of the foreground and background colors:

alphagan i

where αi is a scalar value that defines the foreground opacity at pixel i and is referred to as the alpha value.
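Written out, the compositing equation shown above is:

I_i = \alpha_i F_i + (1 - \alpha_i) B_i, \qquad \alpha_i \in [0, 1]

where I_i is the observed color and F_i, B_i are the (unknown) foreground and background colors at pixel i.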

So, how to solve this equation with so many unknown values? Let’s first discover the current state-of-the-art approaches to solving this problem…

Previous Works

Many current algorithms aim to solve the matting equation by treating it as a color problem, following either sampling or propagation approaches.

Sample-based image matting assumes that the true foreground and background colors of an unknown pixel can be derived from the known foreground and background samples near that pixel.

Propagation-based image matting works by propagating the known alpha values from known local foreground and background samples to the unknown pixels.

However, the over-dependency on color information can lead to artifacts in images where the foreground and background color distributions overlap.

Thus, several deep learning approaches to natural image matting were recently introduced, including:

  • a two-stage network consisting of an encoder-decoder stage and a refinement stage by Xu et al.;
  • end-to-end CNN for deep automatic portrait matting by Shen et al.;
  • end-to-end CNN that utilizes the results deduced from local and non-local matting algorithms by Cho et al.;
  • granular deep learning (GDL) architecture by Hu et al.

But is it possible to improve further the performance of these algorithms by applying GANs? Let’s find out now!

State-of-the-art Idea

Lutz, Amplianitis, and Smolić from Trinity College Dublin are the first to propose a generative adversarial network (GAN) for natural image matting. Their generator network is trained to predict visually appealing alphas, while the discriminator is trained to classify well-composited images.

The researchers build their approach by improving the network architecture of Xu et al. to better deal with the spatial localization issues inherent in CNNs. In particular, they use dilated convolutions to capture global context information without downscaling feature maps and losing spatial information.

We are now ready to move on to the details of AlphaGAN – the name Lutz and his colleagues give to their image matting algorithm.

Network Architecture

AlphaGAN architecture consists of one generator G and one discriminator D.

The generator G is a convolutional encoder-decoder network that is trained both with the help of the ground-truth alphas and with the adversarial loss from the discriminator. It takes as input an image composited from the foreground, the alpha, and a random background, with the trimap appended as a 4th channel, and attempts to predict the correct alpha. The ResNet-50 architecture is used for the encoder.

As you can see from the figure below, the decoder part of the network includes skip connections from the encoder to improve the alpha prediction by reusing local information to capture fine structures in the image.

The generator of the AlphaGAN

The discriminator D tries to distinguish between real 4-channel inputs and fake inputs where the first three channels are composited from the foreground, the background, and the predicted alpha. The PatchGAN introduced by Isola et al. is used as the discriminator in this network.

The full objective of the network includes alpha-prediction loss, compositional loss, and adversarial loss:

Loss Alphagan
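In compact form, the full objective combines these three terms; assuming a simple unweighted sum (a simplification if the paper weights the terms differently), it can be sketched as:

\mathcal{L}(G, D) = \mathcal{L}_{\alpha}(G) + \mathcal{L}_{comp}(G) + \mathcal{L}_{GAN}(G, D)

where \mathcal{L}_{\alpha} penalizes the difference between the predicted and ground-truth alphas, \mathcal{L}_{comp} penalizes the difference between images composited with them, and \mathcal{L}_{GAN} is the adversarial term.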

Experimental Results

The proposed method was evaluated based on two datasets:

  • the Composition-1k dataset, which includes 1000 test images composed of 50 unique foreground objects;
  • the alphamatting.com dataset, which consists of 28 training images and 8 test images; for each set, three different sizes of trimaps are provided, namely, “small” (S), “large” (L) and “user” (U).

The Composition-1k Dataset

The metrics used include the sum of absolute differences (SAD), mean square error (MSE), gradient and connectivity errors.

The researchers compared their method with several state-of-the-art approaches where there is public code available. For all methods, the original code from the authors was used, without any modifications.

Quantitative results on the Composition-1k dataset. AlphaGAN's results are shown in parentheses

As can be observed from the table, AlphaGAN delivers noticeably better results than the other image matting algorithms selected for comparison. There is only one case (the gradient error from the comprehensive sampling approach) where it does not achieve the best result.

See also some qualitative results of this comparison in the next set of pictures:

Results

results 2 alphagan
Comparison of results on the Composition-1k testing dataset

The Alphamatting.com Dataset

The researchers submitted results generated by AlphaGAN to the alphamatting.com benchmark and reached top positions for some of the images:

SAD and gradient results for the top five methods on the alphamatting.com dataset. Best results are shown in bold

Specifically, they achieved the best results for the Troll and Doll images, and first place overall on the gradient evaluation metric. The strong results on these particular images demonstrate the advantage of using the adversarial loss from the discriminator to correctly predict alpha values for fine structures such as hair.

The worst results of the proposed method come from the Net image. However, even though the AlphaGAN approach appears low in the rankings for this image, its results still look very close to those of the top-performing approaches:

matting

Alpha matting predictions for the “Troll” and “Doll” images (best results) and the “Net” image (worst result), taken from the alphamatting.com dataset. From left to right: DCNN [7], IF [1], DI [33], ‘Ours’

Bottom Line

AlphaGAN is the first algorithm that uses GANs for natural image matting. Its generator is trained to predict alpha mattes from input images, while the discriminator is trained to distinguish good images composited with the ground-truth alpha from images composited with the predicted alpha.

This network architecture produces visually appealing compositions with state-of-the-art or comparable results on the primary metrics. Exceptional performance is achieved for images with fine structures such as hair, which is of great importance in practical matting applications, including film and TV production.

Facial Surface and Texture Synthesis via GAN

3 September 2018

Deep networks can be extremely powerful and effective in answering complex questions. But it is also well-known that in order to train a really complex model, you’ll need lots and lots of data, which closely approximates the complete data distribution.

With the lack of real-world data, many researchers choose data augmentation as a method for extending the size of a given dataset. The idea is to modify the training examples in such a way that keeps their semantic properties intact. That’s not an easy task when dealing with human faces.

The method should account for such complex transformations of data as pose, lighting and non-rigid deformations, yet create realistic samples that follow the real-world data statistics.

So, let’s see how the latest state-of-the-art methods approach this challenging task…

Previous approaches

Generative adversarial networks (GANs) have demonstrated their effectiveness in making synthetic data more realistic. Taking simulated data as input, a GAN produces samples that appear more realistic. However, the semantic properties of these samples might be altered, even with a loss penalizing the change in the parameters of the output.

The 3D morphable model (3DMM) is the most commonly used method for representing and synthesizing geometries and textures, and it was originally proposed in the context of 3D human faces. In this model, the geometric structure and the texture of human faces are linearly approximated as a combination of principal vectors.

Recently, the 3DMM model was combined with convolutional neural networks for data augmentation. However, the generated samples tend to be smooth and unrealistic in appearance, as you can observe in the figure below.

Faces synthesized using the 3DMM linear model

Moreover, 3DMM generates samples following a Gaussian distribution, which rarely reflects the true distribution of the data. For instance, see below the first two PCA coefficients plotted for real faces vs the synthesized 3DMM faces. This gap between the real and synthesized distributions may easily result in non-plausible samples.

First two PCA coefficients of real (left) and 3DMM generated (right) faces

State-of-the-art idea

Slossberg, Shamai, and Kimmel from Technion – Israel Institute of Technology propose a new realistic data synthesis approach for human faces by combining GAN and 3DMM model.

In particular, the researchers employ a GAN to imitate the space of parametrized human textures and generate corresponding facial geometries by learning the best 3DMM coefficients for each texture. The generated textures are mapped back onto the corresponding geometries to obtain new generated high-resolution 3D faces.

This approach produces realistic samples, and it:

  • doesn’t suffer from indirect control over such desired attributes as pose and lighting;
  • is not limited to producing new instances of existing individuals.

Let’s have a closer look at their data processing pipeline…

Data processing pipeline

The process includes aligning 3D scans of human faces vertex to vertex and mapping their textures onto a 2D plane using a predefined universal transformation.

Data preparation pipeline

The data preparation pipeline contains four main stages:

  • Data acquisition: the researchers collected about 5000 scans from a wide variety of ethnic, gender, and age groups; each subject was asked to perform five distinct expressions including a neutral one.
  • Landmark annotation: 43 landmarks were added to the meshes semi-automatically by rendering the face and using a pre-trained facial landmark detector on the 2D images.
  • Mesh alignment: this was conducted by deforming a template face mesh according to the geometric structure of each scan, guided by the previously obtained facial landmark points.
  • Texture transfer: the texture is transferred from the scan to the template using a ray casting technique built into the animation rendering toolbox of Blender; then, the texture is mapped from the template to a 2D plane using the predefined universal mapping.

See the resulting mapped textures below:

Flattened aligned facial textures

The next step is to train a GAN to learn and imitate these aligned facial textures. For this purpose, the researchers use a progressively growing GAN with the generator and discriminator constructed as symmetric networks. In this implementation, the generator progressively increases the resolution of the feature maps until reaching the output image size, while the discriminator gradually reduces the size back to a single output.

See below the new synthetic facial textures generated by the aforementioned GAN:

Facial textures synthesized by GAN

The final step is to synthesize the geometries of the faces. The researchers explored several approaches to finding plausible geometry coefficients for a given texture. You can observe the qualitative and quantitative (L2 geometric error) comparison between the various methods in the next figure:

Two synthesized textures mapped onto different geometries

As the figure shows, the least-squares approach produces the lowest-distortion results. Considering also its simplicity, this method was chosen for all subsequent experiments.
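A minimal NumPy sketch of such a least-squares fit is given below, assuming the task is to learn a linear map from texture coefficients to 3DMM geometry coefficients on the training scans and then apply it to the coefficients of GAN-generated textures; the variable names and bias handling are illustrative assumptions.

```python
import numpy as np

def fit_texture_to_geometry(T_train, G_train):
    """T_train: (n, d_t) texture coefficients; G_train: (n, d_g) geometry coefficients."""
    T1 = np.hstack([T_train, np.ones((T_train.shape[0], 1))])  # add a bias column
    W, *_ = np.linalg.lstsq(T1, G_train, rcond=None)           # solve T1 @ W ~= G_train
    return W

def predict_geometry(W, t_new):
    """Predict 3DMM geometry coefficients for a new texture coefficient vector."""
    return np.append(t_new, 1.0) @ W
```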

Experimental results

The proposed method can generate many new identities, and each one of them can be rendered under varying pose, expression, and lighting. Different expressions are added to the neutral geometry using the Blend Shapes model. The resulting images with different pose and lighting are shown below:

Identities generated by the proposed method with different pose and lighting

For quantitative evaluation of the results, the researchers used the sliced Wasserstein distance (SWD) to measure distances between distributions of their training and generated images in different scales:

The table demonstrates that the textures generated by the proposed model are statistically closer to the real data than those generated by 3DMM.

The next experiment was designed to evaluate if the proposed model is capable of generating samples that diverge significantly from the original training set and resemble previously unseen data. Thus, 5% of the identities were held out for evaluation. The researchers measured the L2 distance between each real identity from the test set to the closest identity generated by the GAN, as well as to the closest real identity from the training set.

The distance between the generated and real identities

As can be seen from the figure, the test-set identities are closer to the generated identities than to the training-set identities. Moreover, the “Test to fake” distances are not significantly larger than the “Fake to real” distances. This implies that the model does not just produce identities that are very close to the training set, but also novel identities that resemble previously unseen examples.

Finally, a qualitative evaluation was performed to check if the proposed pipeline is able to generate original data samples. Thus, facial textures generated by the model were compared to their closest real neighbors in terms of L2 norm between identity descriptors.

Synthesized facial textures (top) vs. corresponding closest real neighbors (bottom)

As you can see, the nearest real textures are far enough to be visually distinguished as different people, which confirms the model’s ability to produce novel identities.

Bottom Line

The suggested model is probably the first to realistically synthesize both the texture and geometry of human faces. It can be useful for training face detection, face recognition, or face reconstruction models. In addition, it can be applied in cases where many different realistic faces are needed, for instance in the film industry or computer games. Moreover, the framework is not limited to synthesizing human faces and can be employed for other classes of objects where alignment of the data is possible.

Vid2Vid – Conditional GANs for Video-to-Video Synthesis

3 September 2018

Researchers from NVIDIA and MIT’s Computer Science and Artificial Intelligence Lab have proposed a novel method for video-to-video synthesis, showing impressive results. The proposed method – Vid2Vid – can synthesize high-resolution, photorealistic, temporally coherent videos on a diverse set of input formats including segmentation masks, sketches, and poses.

Previous works

Due to the inherent complexity of the problem of video-to-video synthesis, this topic has remained relatively unexplored in the past. Compared to its image counterpart, image-to-image synthesis, far fewer studies have explored and tackled this problem.

Arguing that a general-purpose solution to video-to-video synthesis has not yet been explored in prior work (as opposed to its image counterpart, image-to-image translation), the researchers compare and benchmark their approach against a strong baseline that combines a state-of-the-art video style transfer algorithm with a state-of-the-art image-to-image translation approach.

State-of-the-art idea

The general idea is to learn a mapping function that can convert an input video to a realistic output video. They frame this problem as a distribution matching problem, and they leverage the generative adversarial learning framework to produce a method that can generate photorealistic videos given an input video (such as a sequence of segmentation masks, poses, sketches, etc.).

vid2vid

Vid2Vid Method

As I mentioned before, the authors proposed a method based on Generative Adversarial Networks. They tackle the complex problem of video-to-video translation or video-to-video synthesis in a really impressive way by carefully designing an adversarial learning framework.

Their goal is to learn a mapping function that will map a sequence of input (source) images to a series of realistic output images, where the conditional distribution of the generated sequence given the source sequence is identical to the distribution of the original sequence given the source sequence:

vid2vid

Matching the distributions forces the method to output realistic and temporally coherent videos. In a generative adversarial learning context, a generator-discriminator framework is designed to learn the mapping function. The generator is trained by solving an optimization problem – minimizing the Jensen-Shannon divergence between the two distributions. A minimax optimization is applied to a defined objective function:

vid2vid

As mentioned in their paper, and as is widely known, optimizing such an objective function is a very challenging task. The training of generator and discriminator models often becomes unstable or even impossible, depending on the optimization problem being solved. Therefore, the researchers propose a simplified sequential generator that makes a few assumptions. They make a Markov assumption to factorize the conditional distribution and decouple the dependencies across frames in the sequences.

In simple words, they simplify the problem by assuming that the video frames can be generated sequentially and the generation of the t-th frame only depends on three things: 1) the current source image, 2) the past L source images, and 3) the past L generated images.

From this point, they design a feed-forward network F that maps the current and past L source images and the past L generated images to the next generated output frame.
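In the paper's notation (s for source frames, x̃ for generated frames, L for the length of the history), this Markov assumption factorizes the conditional distribution roughly as:

p(\tilde{x}_{1}^{T} \mid s_{1}^{T}) = \prod_{t=1}^{T} p(\tilde{x}_{t} \mid \tilde{x}_{t-L}^{t-1},\, s_{t-L}^{t})

so that each frame is generated sequentially from a short history rather than from the entire video.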


To model such a network, the researchers make another assumption based on the fact that if the optical flow from the current frame to the next frame is known, it can be used to warp the current frame into an estimate of the next frame. Arguing that this will be largely true except for occluded areas (where it is unknown what will appear), they consider a specific model.

The estimation network F is modeled to take into account an occlusion mask, the estimated optical flow between the previous and current images (given by an optical flow estimation function), and a hallucinated image (generated from scratch). The hallucinated image is needed to fill in the occluded areas.
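Up to notation, the resulting per-frame composition described in the paper can be sketched as:

F(\tilde{x}_{t-L}^{t-1}, s_{t-L}^{t}) = (1 - \tilde{m}_{t}) \odot \tilde{w}_{t-1}(\tilde{x}_{t-1}) + \tilde{m}_{t} \odot \tilde{h}_{t}

where m̃_t is the (soft) occlusion mask, w̃_{t-1}(·) warps the previously generated frame with the estimated optical flow, and h̃_t is the hallucinated image that fills in the occluded regions.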

As in the image domain, where many adversarial methods employ local discriminators in addition to a global one, the authors propose an approach that uses both a conditional image discriminator and a conditional video discriminator.

Additionally, to make their method even more capable, the researchers extend their approach to multimodal synthesis. They propose a generative scheme based on feature embeddings and a Gaussian mixture model that outputs several videos with different visual appearances depending on the sampled feature vectors.

Results

The proposed method yields impressive results. It was tested on several datasets, such as Cityscapes, Apolloscape, a face video dataset, and a dance video dataset. Moreover, two strong baselines were used for comparison: the pix2pix method and a modified video style transfer method (CONVST) in which the stylization network is replaced with pix2pix.

Face vid2vid

pose-to-body vid2vi2

vid2vid

edge-to-face

Comparison with other state-of-the-art

They use both subjective and objective metrics for performance evaluation: the human preference score and the Fréchet Inception Distance (FID). A comparison between the proposed method and other methods is given in the tables.

Future video prediction results. Top left: ground truth. Top right: PredNet. Bottom left: MCNet. Bottom right: proposed.
Example output from the multimodal video synthesis.

Conclusion

A new state-of-the-art method for video synthesis has been proposed. The conditional GAN-based approach shows impressive results on several different tasks within the scope of video-to-video translation. There are numerous applications of this kind of method in computer vision, robotics, and computer graphics. Using a learned video synthesis model, one can generate realistic videos for many different purposes and applications.

Everybody Dance Now: a New Approach to “Do As I Do” Motion Transfer

30 August 2018

Not very good at dancing? Not a problem anymore! Now you can easily impress your friends with a stunning video, where you dance like a superstar.

Researchers from UC Berkeley propose a new method of transferring motion between human subjects in different videos. They claim that given a source video of a person dancing they can transfer that performance to a novel target after only a few minutes of the target subject performing standard moves.

But let's first check what the previous approaches to this kind of task were.

Previous Works

Motion transfer, or retargeting, has received considerable attention from researchers over the last two decades. Early methods created new content by manipulating existing video footage.

So, what is the idea behind this new approach?

State-of-the-art Idea

This method poses the problem as a per-frame image-to-image translation with spatiotemporal smoothing. The researchers use pose detection, represented with the pose stick figures, as an intermediate representation between source and target. The aligned data is used to learn an image-to-image translation model between pose stick figures and images of the target person in a supervised way.

Two additional components improve the results: conditioning the prediction at each frame on that of the previous time step for temporal smoothness and a specialized GAN for realistic face synthesis.

Before diving deeper into the architecture of the suggested approach, let’s check the results with this short video:

So, in essence, the model is trained to produce personalized videos of a specific target subject. Motion transfer occurs when the pose stick figures go into the trained model to obtain images of the target subject in the same pose as the source.

Method

The pipeline of the suggested approach includes three stages:

  1. Pose detection – using a pretrained state-of-the-art pose detector to create pose stick figures from the source video.
  2. Global pose normalization – accounting for differences between the source and target subjects in body shapes and locations within the frame.
  3. Mapping from normalized pose stick figures to the target subject.

Here is an overview of the method:

Method overview

For training, the model uses a pose detector P to create pose stick figures from video frames of the target subject. The mapping G is then learned alongside an adversarial discriminator D, which attempts to distinguish between the “real” correspondence pair (x, y) and the “fake” pair (G(x), y).

Next, for transfer, the pose detector P is used to obtain pose joints for the source person. These are then transformed by the normalization process Norm into joints for the target person, from which pose stick figures are created. Finally, the trained mapping G is applied.

The researchers base their method on the objective presented in pix2pixHD with some extensions to produce temporally coherent video frames and synthesize realistic face images.

Temporal Smoothing

To create video sequences, they modify the single-image generation setup to enforce temporal coherence between adjacent frames, as shown in the figure below.

Temporal smoothing setup

In brief, the current frame G(xt) is conditioned on its corresponding pose stick figure xt and the previously synthesized frame G(xt−1) to obtain temporally smooth outputs. The discriminator D then attempts to differentiate the “real” temporally coherent sequence (xt−1, xt, yt−1, yt) from the “fake” sequence (xt−1, xt, G(xt−1), G(xt)).
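Up to notation, the temporal-smoothing objective this describes can be written as:

\mathcal{L}_{smooth}(G, D) = \mathbb{E}_{(x,y)}\big[\log D(x_{t-1}, x_t, y_{t-1}, y_t)\big] + \mathbb{E}_{x}\big[\log\big(1 - D(x_{t-1}, x_t, G(x_{t-1}), G(x_t))\big)\big]

which replaces the single-frame adversarial term of the pix2pixHD objective mentioned above.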

Face GAN setup

The researchers further extend the model with a specialized GAN setup designed to add more detail and realism to the face region as shown in the figure below.  To be specific, the model uses a single 70×70 Patch-GAN discriminator for the face discriminator.

Face GAN setup

Now let’s move to the results of the experiments…

Results

The target subjects were recorded for around 20 minutes of real time footage at 120 frames per second. Moreover, considering that the model does not encode information about clothes, the target subjects wear tight clothing with minimal wrinkling.

Videos of the source subjects were found online – these videos just need to be of reasonably high quality and show the subject performing a dance.

Here are the results with the top row showing the source subject, the middle row showing the normalized pose stick figures, and the bottom row depicting the model outputs of the target person:

Transfer results for five consecutive frames

The tables below demonstrate the performance of the full model (with both the temporal smoothing and Face GAN setups) in comparison with the baseline model (pix2pixHD) alone and the baseline model with the temporal smoothing setup. The quality of individual frames was assessed with the Structural Similarity (SSIM) and Learned Perceptual Image Patch Similarity (LPIPS) measures.

table 01

Comparison of synthesis results for different models (T.S.: a model with temporal smoothing, T.S. + Face: a full model with both temporal smoothing setup and Face GAN)

To further analyze the quality of the results, the researchers ran the pose detector P on the outputs of each model and compared the reconstructed keypoints to the pose detections of the original input video. If all body parts are synthesized correctly, the reconstructed pose should be close to the input pose on which the output was conditioned. See the results in the tables below:

table 1

As you can see from the tables, the temporal smoothing setup doesn't seem to add much to the baseline if one looks only at the quantitative results. Qualitatively, however, it helps with smooth motion, color consistency across frames, and individual frame synthesis.

The Face GAN setup, on the other hand, improves both the quantitative and qualitative results of the model. As is evident from the pictures below, this component adds considerable detail to the output video and encourages the synthesis of realistic body parts.

Face image comparison from different models on the validation set

Conclusion

The presented model is able to create reasonable and arbitrarily long videos of a target person dancing, given body movements to follow in an input video of another subject. However, the results still often suffer from jittering. This is especially the case when the input movements or movement speed differ from the movements seen at training time.

Considering that jittering and shakiness remain even if the target person tries to copy the movements of the source subject during the training sequence, the researchers suppose that jittering could also result from the underlying difference between how the source and target subjects move given their unique body structures. Still, this approach to motion transfer is able to generate compelling videos given a variety of inputs.

DeepWrinkles: Accurate and Realistic Clothing Modeling

28 August 2018


Realistic garment reconstruction is a notoriously complex problem, and its importance is undeniable in many research areas and applications, such as accurate body shape and pose estimation in the wild (i.e., from observations of clothed humans), realistic AR/VR experiences, movies, video games, virtual try-on, etc. For the past decades, physics-based simulations have set the standard in the movie and video game industries, even though they require hours of labor by experts.

Facebook Research presents a novel approach called DeepWrinkles to generate accurate and realistic clothing deformation from real data capture. It consists of two complementary modules:

  • A statistical model is learned from 3D scans of clothed people in motion, from which clothing templates are precisely non-rigidly aligned.
  • Fine geometric details are added to normal maps generated using a conditional adversarial network whose architecture is designed to enforce realism and temporal consistency.

The goal is to recover all observable geometric details. Assuming the finest details are captured at sensor image pixel resolution and are reconstructed in 3D, all existing geometric details can then be encoded in a normal map of the 3D scan surface at a lower resolution, as shown in the figure below.

clothes modelling

Cloth deformation is modeled by learning a linear subspace model that factors out body pose and shape. However, unlike physics-based simulation, this model is learned from real data.


The strategy ensures deformations are represented compactly and with high realism. First, robust template-based non-rigid registrations are computed from a 4D scan sequence; then a statistical model of clothing deformation is derived; and finally, a regression model is learned for pose retargeting.

Data Preparation

Data capture: For each type of clothing, 4D scan sequences are captured at 60 fps (e.g., 10.8k frames for 3 min) of a subject in motion, and dressed in a full-body suit with one piece of clothing with colored boundaries on top. Each frame consists of a 3D surface mesh with around 200k vertices yielding very detailed folds on the surface but partially corrupted by holes and noise. In addition, capturing only one garment prevents occlusions where clothing normally overlaps (e.g., waistbands) and items of clothing can be freely combined with each other.

Registration: The template of clothing T is defined by choosing a subset of the human template with consistent topology. T should contain enough vertices to model deformations (e.g., 5k vertices for a T-shirt). The clothing template is then registered to the 4D scan sequence using a variant of non-rigid ICP based on grid deformation.

Statistical model

The statistical model is computed using linear subspace decomposition by PCA. The poses of all n registered meshes are factored out of the model by pose-normalization using inverse skinning. Each registration R_i can be represented by a mean shape and vertex offsets o_i, such that R_i = M + o_i, where the mean shape M ∈ R^(3×v) is obtained by averaging vertex positions. Finally, each R_i can be compactly represented by a linear blend shape function B(φ_i) = M + Σ_k φ_{i,k} P_k, where the P_k are the principal components retained by the PCA and φ_i are the shape parameters of R_i.

Pose-to-shape prediction

A predictive model f is learned that takes joint poses as input and outputs a set of k shape parameters. This enables powerful applications where deformations are induced by the pose. To take into account the deformation dynamics that occur during human motion, the model is also trained with pose velocity, acceleration, and shape-parameter history.
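As a purely hypothetical illustration of what such a regressor could look like (the paper does not commit to this exact architecture; the layer sizes are made up), a small fully connected network mapping the concatenated pose features to k shape parameters would be:

```python
import torch.nn as nn

def make_pose_to_shape_regressor(input_dim, k, hidden=256):
    """input_dim: size of the concatenated pose, velocity, acceleration and history vector."""
    return nn.Sequential(
        nn.Linear(input_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, k),  # predicted blend-shape coefficients
    )
```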

Outline of DeepWrinkles

Architecture

neural network design clothing modelling

As stated earlier, the goal is to recover all observable geometric details, encoding the finest ones in a normal map of the 3D scan surface at a lower resolution. To automatically add these fine details on the fly to reconstructed clothing, a generative adversarial network that operates on normal maps is proposed.

The proposed network is based on a conditional generative adversarial network (cGAN) inspired by image-to-image transfer. A convolution-BatchNorm-ReLU structure with a U-Net is used in the generative network, since the skip connections transfer information across the network layers and allow the overall structure of the image to be preserved. Temporal consistency is achieved by extending the L1 network loss term. For compelling animations, it is not only important that each frame looks realistic; there should also be no sudden jumps in the rendering. To ensure a smooth transition between consecutively generated images across time, an additional loss L_temp is introduced to the GAN objective that penalizes discrepancies between generated images at time t and expected images (from the training dataset) at t − 1:

loss function

where L_data helps to generate images close to the ground truth in an L1 sense (for less blurring). The temporal consistency term L_temp is meant to capture global fold movements over the surface.

The cGAN network is trained on a dataset of 9,213 consecutive frames. The first 8,000 images compose the training set, the next 1,000 images the test set, and the remaining 213 images the validation set. The test and validation sets contain poses and movements not seen in the training set. The U-Net auto-encoder is constructed with 2 × 8 layers and 64 filters in each of the first convolutional layers. The discriminator uses patches of size 70 × 70. The L_data weight is set to 100, the L_temp weight to 50, and the GAN weight to 1. The images have a resolution of 256 × 256, although early experiments also showed promising results at 512 × 512.
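Putting the pieces together, the overall objective sketched by the description above (with the weights just quoted) is roughly:

\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G, D) + 100\,\mathcal{L}_{data}(G) + 50\,\mathcal{L}_{temp}(G)

where L_cGAN is the usual conditional adversarial term, L_data is the L1 distance to the ground-truth normal map, and L_temp is the temporal consistency term described above; the exact formulation in the paper may group the weights differently.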

Result

DeepWrinkles is an entirely data-driven framework for capturing and reconstructing clothing in motion from 4D scan sequences. The evaluations show that high-frequency details can be added to low-resolution normal maps using a conditional adversarial network. A temporal loss added to the GAN objective preserves geometric consistency across time, and the authors show qualitative and quantitative evaluations on different datasets.

Results
a) Physics-based simulation, b) Subspace (50 coefficients) c) Registration d) DeepWrinkles e) 3D scan (ground truth)

A Style-Aware Content Loss for Real-time HD Style Transfer

14 August 2018


A picture may be worth a thousand words, but at least it contains a lot of very diverse information. This not only comprises what is portrayed, e.g., a composition of a scene and individual objects but also how it is depicted, referring to the artistic style of a painting or filters applied to a photo. Especially when considering artistic images, it becomes evident that not only content but also style is a crucial part of the message an image communicates (just imagine van Gogh’s Starry Night in the style of Pop Art). A vision system then faces the challenge to decompose and separately represent the content and style of an image to enable a direct analysis based on each individually. The ultimate test for this ability is style transfer, exchanging the style of an image while retaining its content.

Neural Style Transfer Example
Neural Style Transfer Example

Recent work has used neural networks, and the crucial representation in all these approaches has been based on a VGG16 or VGG19 network pretrained on ImageNet. However, a recent trend in deep learning has been to avoid supervised pre-training on a million images with tediously labeled object bounding boxes. In the setting of style transfer, this has the particular benefit of avoiding, from the outset, any bias introduced by ImageNet, which has been assembled without artistic consideration. Rather than utilizing a separate pre-trained VGG network to measure and optimize the quality of the stylistic output, an encoder-decoder architecture with an adversarial discriminator is used to stylize the input content image, and the encoder is also used to measure the reconstruction loss.

State of the Art

To enable a fast style transfer that instantly transfers a content image or even frames of a video according to a particular style, a feed-forward architecture is required rather than the slow optimization-based approach. To this end, an encoder-decoder architecture is used that utilizes an encoder network E to map an input content image x onto a latent representation z = E(x). A generative decoder G then plays the role of a painter and generates the stylized output image y = G(z) from the sketchy content representation z. Stylization then requires only a single forward pass, thus working in real time.

style-transfer-video-neural-network

1) Training with a Style-Aware Content Loss

Previous approaches have been limited in that training worked only with a single style image. In contrast, in this work a single image y0 is given together with a set Y of related style images yj ∈ Y. To train E and G, a standard adversarial discriminator D is used to distinguish the stylized output G(E(xi)) from real examples yj ∈ Y. The transformed image loss is then defined as:

content-loss

where C × H × W is the size of image x, and for training T is initialized with uniform weights. Figure 3 of the paper illustrates the full pipeline of the approach. To summarize, the full objective of the model is:

full-network

where λ controls the relative importance of the adversarial loss.

2) Style Image Grouping

Given a single style image y0, the task is to find a set Y of related style images yj ∈ Y. A VGG16 is trained from scratch on the Wikiart dataset to predict the artist of a given artwork. The network is trained on the 624 artists with the largest number of works in the Wikiart dataset. Artist classification, in this case, is the surrogate task for learning meaningful features in the artworks' domain, which allows retrieving artworks similar to image y0.

Let φ(y) be the activations of the fc6 layer of the VGG16 network C for input image y. To get a set of style images related to y0 from the Wikiart dataset Y, all nearest neighbors of y0 are retrieved based on the cosine distance δ of the activations φ(·), i.e.

wikiart-dataset
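A small Python sketch of that retrieval step is shown below. Here phi is assumed to return the fc6 feature vector of the artist-classification VGG16 described above, and the distance threshold is an illustrative choice rather than the value used in the paper.

```python
import numpy as np

def related_styles(phi, y0, wikiart_images, threshold=0.2):
    """Collect style images whose fc6 features are close to those of y0 (cosine distance)."""
    f0 = phi(y0)
    f0 = f0 / np.linalg.norm(f0)
    related = []
    for y in wikiart_images:
        f = phi(y)
        cos_dist = 1.0 - float(np.dot(f0, f / np.linalg.norm(f)))
        if cos_dist < threshold:   # close in the learned artwork feature space
            related.append(y)
    return related
```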

The basis of the style transfer model is an encoder-decoder architecture. The encoder network contains 5 conv layers: 1× conv-stride-1 and 4× conv-stride-2. The decoder network has 9 residual blocks, 4 upsampling blocks, and 1× conv-stride-1. The discriminator is a fully convolutional network with 7× conv-stride-2 layers. During training, 768×768 content image patches are sampled from the training set of Places365 [51] and 768×768 style image patches from the Wikiart dataset. The model is trained for 300,000 iterations with batch size 1, learning rate 0.0002, and the Adam optimizer. The learning rate is reduced by a factor of 10 after 200,000 iterations.

Table: 1
Training time

Experts were asked to choose the one image which best and most realistically reflects the current style. The score is computed as the fraction of times a specific method was chosen as the best in the group. The mean expert score is calculated for each method over 18 different styles and reported in Table 1.

Result

This paper has addressed major conceptual issues in state-of-the-art approaches for style transfer. The proposed style-aware content loss enables a real-time, high-resolution encoder-decoder based stylization of images and videos and significantly improves stylization by capturing how style affects content.

comparison-style-transfer-methods

Result in high resolution

Can Computers See Outside the Box? – Realistic Image Outpainting Using GAN

14 August 2018

Deep learning has been applied to a huge number of computer vision tasks, and so far it has proven successful in many of them. Nevertheless, there are still some tasks where deep neural networks struggle and traditional computer vision approaches work better. Historically, some tasks have been more appealing and therefore well studied and explored, whereas others have attracted much less attention. Image outpainting (or image extrapolation) is one of the latter. While filling holes or missing details in images, i.e., image inpainting, has been widely studied, image outpainting has been addressed in only a few studies and is not a very popular topic among researchers.

However, researchers from Stanford have presented a deep learning approach to the problem of image extrapolation (i.e., image outpainting). They take an interesting approach and address the problem using adversarial learning.

Generative adversarial learning – DCGAN

Generative adversarial learning has received a lot of attention in the past few years and has been applied to a variety of generative tasks. In this work, the researchers use Generative Adversarial Networks to outpaint an image by extrapolating and filling equal-sized regions on the sides of the input image.

As in many generative tasks in computer vision, the goal is to produce a realistic (and visually pleasing) image. Outpainting can be seen as hallucinating past the image boundaries, and intuitively it is not a trivial task, since (almost) anything might appear outside the boundaries of the image in reality. Therefore, a significant amount of additional content is needed that matches the original image, especially near its boundaries. Generating realistic content near the image boundaries is challenging because it has to match the original image; generating realistic content further from the boundaries is almost as challenging, but mostly for the opposite reason – the lack of neighboring information.

In this work, a DCGAN architecture has been employed to tackle the problem of image extrapolation. The authors show that their method is able to generate realistic samples of 128×128 color images and, moreover, that it allows recursive outpainting (to some extent) to obtain larger images.

Data

Example images from the Places365 Dataset

The Places365 dataset has been used to both train and evaluate the proposed method. The authors define a specific preprocessing procedure that consists of three steps: normalizing the images, defining a binary mask that masks out part of the image (horizontally only), and computing the mean pixel intensity over the unmasked regions. After the preprocessing, each input image is represented as a pair of two images: the original image and the preprocessed image. The preprocessed image is obtained by masking the original image and concatenating it with the mean pixel intensity image (per channel).

Method

As mentioned before, the generative model is a GAN trained with a three-phase training procedure to improve the stability of the training process. The generator network is a non-symmetric convolutional encoder-decoder, and the discriminator combines global and local discriminators. The generator has 9 layers (8 convolutional and 1 deconvolutional), while the discriminator has 5 convolutional layers and 1 fully connected layer, plus a concatenation layer that combines the outputs of the local discriminators to produce a single output.

The training pipeline used in the method
The training pipeline used in the method

All layers are followed by a ReLU activation except the output layers of both networks, and dilated convolutions are used to further improve the outpainting. The authors argue that dilated convolutions strongly affect the quality of the generated image and the actual capability of outpainting it. The improvement comes from the increased local receptive field, which makes it possible to outpaint the whole image; dilated convolutions are simply an efficient way to increase the local receptive field of convolutional layers without increasing the computational complexity.
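A tiny PyTorch illustration of that last point: at the same parameter count and output resolution, a dilated 3×3 convolution covers a wider local receptive field than an ordinary one.

```python
import torch.nn as nn

ordinary = nn.Conv2d(64, 64, kernel_size=3, padding=1)               # each output sees a 3x3 window
dilated  = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)   # each output sees a 5x5 window
```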

Generator

Discriminator

Evaluation and conclusions

This approach shows promising results, as it has proved able to generate relatively realistic image outpaintings. The authors evaluated the method mostly qualitatively, as a consequence of the nature of the problem, and they also use RMSE as a reference quantitative evaluation metric – a modified RMSE that accounts for simple image postprocessing by renormalizing the images. In the final part of the paper, they describe the recursive outpainting experiments they conducted and show that recursively outpainted images remain relatively realistic, even though noise compounds with successive iterations. A recursively outpainted image with 5 iterations is given as an example in the image below.

Recursively outpainted image with 5 iterations. It can be noticed that the noise compounds with the number of iterations
The effect of using local discriminators along with the global one, as opposed to using only the global discriminator