Microsoft VASA-1: Platform for Animating Static Images

21 April 2024

Microsoft has introduced the VASA-1 platform, which turns an image of a person and an audio recording of speech into a video with synchronized lip and head movements. The algorithm operates in real-time…

Google RecurrentGemma: Next-Gen Local Language Model

14 April 2024

Google has introduced the RecurrentGemma language model, designed to operate locally on devices with limited resources such as smartphones, personal computers, and smart speakers. The new architecture from Google significantly…

Gretel: The Largest Open Text-to-SQL Dataset

7 April 2024

Gretel, a startup specializing in generating high-quality synthetic data, has announced the creation of the largest open text-to-SQL dataset aimed at accelerating the development of no-code analytics tools. The dataset…

Voice Engine: OpenAI’s Voice Synthesis Model

1 April 2024

OpenAI has unveiled Voice Engine, a model capable of voice cloning from a 15-second audio recording. Among the users of the model, the company mentions podcasters, announcers, audiobook authors, advertisers,…

SCIN: Dataset of Dermatological Disease Images

25 March 2024

Google, in collaboration with Stanford Medicine, has introduced SCIN – an open dataset comprising 10,000 images of dermatological diseases. Models trained on this dataset will be able to remotely diagnose…

Midjourney Introduces Character Transfer Feature to New Images

17 March 2024

The image generation service Midjourney now lets users transfer a character to new images by including a link to an existing image of the character in the request. This functionality…

Startup Insilico Medicine Introduces First Drug Developed with Generative Models

10 March 2024

The startup Insilico Medicine has unveiled the first drug developed using generative models. This approach enabled the drug to pass its initial clinical trial phases in just two…

Microsoft ViSNet: Predicting Molecule Activity

3 March 2024

Microsoft has unveiled ViSNet – a graph neural network modeling the geometry of complex molecules to predict their activity. ViSNet has the potential to significantly expedite the search for and…

Tableau Pulse: Personalized Dashboard Summarization

25 February 2024

Salesforce has introduced Tableau Pulse – a platform that generates a personalized feed of key metric changes based on the dashboards of a company that uses Tableau. Tableau Pulse utilizes natural language queries…

Sora: OpenAI’s Groundbreaking Text-to-Video Diffusion Model

18 February 2024

OpenAI has unveiled Sora, a diffusion-based text-to-video model capable of generating 60-second videos. Compared to competitors like Runway, Pika, Stability AI, and Google, OpenAI’s model boasts high-resolution (Full HD) output,…

Apple MGIE: Multimodal Models for Image Editing

12 February 2024

Apple, in collaboration with the University of California, has developed the open-source MGIE model for image editing based on text input. This model tackles various editing tasks, including Photoshop-style image…

Google MobileDiffusion: Generating Images on Mobile Devices

4 February 2024

Google has introduced MobileDiffusion, a real-time text-to-image generation model that operates entirely on mobile devices. On Android and iOS devices with the latest generation processors, image generation at a resolution…

DeepMind Trains AlphaGeometry Model to Solve Olympiad Geometry Problems

21 January 2024

DeepMind has unveiled AlphaGeometry – a model capable of solving geometric problems at the level of International Mathematical Olympiad winners. AlphaGeometry solved 25 out of 30 Olympiad problems, while on…

Microsoft DragNUWA: Video Generation via Object Trajectories

15 January 2024

Microsoft has released the DragNUWA weights – a cross-domain video generation model that offers more precise control over the resulting output compared to similar models. Control is achieved by simultaneously…

Pika 1.0: A Web Platform for Video Generation

7 January 2024

The startup Pika Labs has launched Pika 1.0 – a free web platform for generating and editing videos using text-based queries. The service creates both realistic videos and 3D animation in…

Diffusion Model Trained to Predict Chemical Reactions

27 December 2023

MIT scientists have developed a model that predicts the likelihood of a molecule reaching a transition state, which is critical for determining the probability of a chemical reaction. Researchers will use the…

VideoPoet: Google’s Language Model for Video Generation and Editing

23 December 2023

Google has unveiled VideoPoet, a language model for multimodal video content processing capable of turning text and images into clips, stylizing existing videos, and generating soundtracks for them without any…

Google MusicFX: Transform Text Into Unique Soundscapes with AI

17 December 2023

Google has launched MusicFX, an online service that generates music based on text queries. The product uses Google’s MusicLM model, and each audio file contains an inaudible watermark created…

FractalGPT Launches Question-Answer Agent for Handling Uploaded Documents

14 December 2023

Developers at FractalGPT have rolled out a QA agent designed for interacting with documents, allowing users to hold dialogues about uploaded PDF, TXT, and DOCX files. Key Features…

Shopping Muse: Mastercard’s Recommender System

10 December 2023

Mastercard has unveiled Shopping Muse, a chatbot-format module for online stores that recommends products to shoppers based on their purchase and search history, region, and other factors. Operating on the…

Google Introduces Gemini, a Cutting-Edge Language Model Set

7 December 2023

Google has announced the creation of Gemini, a set of three language models surpassing competitors in 30 out of 32 benchmarks. The top-tier model, Gemini Ultra, is available through an…