Runway, one of the startups behind Stable Diffusion, releases a new AI model to generate videos from text
The use of AI to generate imagery dates back to the early 1990s, when artists began using algorithms to create art, music, and visual effects. In 2022, the launch of DALL-E 2, a neural-network-based image generation model developed by OpenAI, pushed AI image generators further into the mainstream.
The precision, realism, and controllability of AI systems for image and video synthesis are improving rapidly. One of the most popular AI image generators is Stable Diffusion, a deep learning text-to-image model that now lets millions of people create stunning art within seconds from text descriptions.
Today, Runway, one of the startups behind the Stable Diffusion AI image generator, announced the release of an AI model known as Gen-2 that takes any text description – such as “turtles flying in the sky” – and generates three seconds of matching video footage.
According to its website, Gen-2 is a multi-modal AI system that can generate novel videos with text, images, or video clips.
Due to safety and business concerns, Runway has decided not to release the model widely at this time, nor will it be open-sourced as Stable Diffusion was. Initially, the text-to-video model will be accessible only through a waitlist on the Runway website and via Discord.
Using AI to generate videos from text is not a new concept; last year, both Meta Platforms and Google published research papers on text-to-video AI models. What sets Runway apart, according to co-founder and CEO Cristobal Valenzuela, is that its text-to-video model will be accessible to the general public.