Meta launches free AI-powered video editing tools Emu Video, Emu Edit
Meta Platforms on Thursday unveiled two new AI-based tools, Emu Video for video generation and Emu Edit for image editing. Both are designed to execute tasks based on text instructions, and their output could be used for posting to Instagram or Facebook.
Emu Video generates four-second videos from a text caption, a photo or image, or an image paired with a description. Emu Edit lets users alter or edit images more easily using text prompts. The tools are part of Meta’s new research into controlled image editing based solely on text instructions and a method for text-to-video generation based on diffusion models, the company said in a blog post.
“Technology from Emu underpins many of our generative AI experiences, some AI image editing tools for Instagram that let you take a photo and change its visual style or background, and the Imagine feature within Meta AI that lets you generate photorealistic images directly in messages with that assistant or in group chats across our family of apps. Our work in this exciting field is ongoing, and today, we’re announcing new research into controlled image editing based solely on text instructions and a method for text-to-video generation based on diffusion models,” Meta said in a blog post.
Meta first introduced Emu at its Meta Connect event in September. The model already powers AI image editing tools for Instagram, which let users change the visual style or background of a photo, and the Imagine feature within Meta AI, which generates photorealistic images in messages and group chats across the company’s apps.
The latest tools build on the parent model Emu, which generates images in response to text prompts. Meta said that Emu Video, which uses the Emu model, simplifies text-to-video generation based on diffusion models and accepts text only, an image only, or both text and an image as input.
The company detailed a two-step generation process: it first creates an image from a text prompt, then generates a video conditioned on both the text and the generated image. Unlike earlier methods that required a deep cascade of models, Meta says its approach is simpler, using just two diffusion models to produce 512×512, four-second videos at 16 frames per second, or 64 frames per clip.
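The two-step process described above can be sketched roughly as follows. This is a minimal illustration, not Meta’s code: the function names, the `Video` type, and the stand-in "models" are all hypothetical, and only the resolution, clip length, and frame rate come from the article.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional

# Figures reported in the article.
WIDTH, HEIGHT = 512, 512
DURATION_SECONDS = 4
FRAMES_PER_SECOND = 16
NUM_FRAMES = DURATION_SECONDS * FRAMES_PER_SECOND  # 4 * 16 = 64


@dataclass
class Video:
    frames: List[Any]
    fps: int


def generate_video(
    prompt: Optional[str],
    text_to_image: Callable[[str], Any],
    image_to_video: Callable[[Optional[str], Any, int], List[Any]],
    image: Any = None,
) -> Video:
    """Hypothetical two-stage pipeline: text -> image, then (text, image) -> video."""
    if prompt is None and image is None:
        raise ValueError("provide a text prompt, an image, or both")
    # Step 1: create an image from the text prompt, unless the caller supplied one.
    if image is None:
        image = text_to_image(prompt)
    # Step 2: generate frames conditioned on both the text and the image.
    frames = image_to_video(prompt, image, NUM_FRAMES)
    return Video(frames=frames, fps=FRAMES_PER_SECOND)


# Stand-in "models" so the sketch runs; real diffusion models would go here.
def fake_text_to_image(prompt: str) -> dict:
    return {"prompt": prompt, "size": (WIDTH, HEIGHT)}


def fake_image_to_video(prompt: Optional[str], image: Any, num_frames: int) -> List[Any]:
    return [image] * num_frames


video = generate_video("a dog surfing a wave", fake_text_to_image, fake_image_to_video)
print(len(video.frames), "frames at", video.fps, "fps")
```

Passing `image=` skips the first step, which mirrors the image-only and text-plus-image input modes the company describes.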
Meta also introduced Emu Edit, which enables free-form editing through instructions, covering tasks such as local and global editing, background removal and addition, color and geometry transformations, as well as detection and segmentation.
Meta stressed that the primary goal is not just to produce a ‘believable’ image, but to precisely alter only the pixels relevant to the edit request. Emu Edit stands out by following instructions accurately, ensuring that pixels unrelated to the task remain untouched in the input image.
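One way to picture the "untouched pixels" goal is a check that counts how many pixels changed outside the region an edit was supposed to affect. The sketch below is only an illustration under simple assumptions: images are small grids of integers, the mask is hand-written, and the function name is hypothetical. It is not how Meta evaluates Emu Edit.

```python
from typing import List

Grid = List[List[int]]
Mask = List[List[bool]]


def count_changes_outside_mask(before: Grid, after: Grid, mask: Mask) -> int:
    """Count pixels that differ between before and after where mask is False."""
    changed = 0
    for row_before, row_after, row_mask in zip(before, after, mask):
        for old, new, editable in zip(row_before, row_after, row_mask):
            if not editable and old != new:
                changed += 1
    return changed


before = [[10, 10, 10],
          [10, 10, 10]]
# Only the middle column is relevant to the (hypothetical) edit request.
mask = [[False, True, False],
        [False, True, False]]

precise_edit = [[10, 99, 10],
                [10, 99, 10]]
sloppy_edit = [[10, 99, 10],
               [11, 99, 10]]  # one pixel outside the mask was altered

print(count_changes_outside_mask(before, precise_edit, mask))
print(count_changes_outside_mask(before, sloppy_edit, mask))
```

A result of zero means the edit left everything outside the target region as it was, which is the behavior the company says it is aiming for.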
To train the model, Meta created a dataset with 10 million synthesized samples, each including an input image, a task description, and the targeted output image. The company believes this dataset is the largest of its kind to date.
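As a rough picture of what one such training sample might look like, the sketch below models the three parts named above as a simple record. The class name, field names, and file paths are hypothetical and chosen for illustration; Meta’s actual data format is not described in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditSample:
    """One hypothetical training sample: input image, task description, target image."""
    input_image: str       # path to the original image (assumed representation)
    task_description: str  # text describing the edit to perform
    target_image: str      # path to the expected edited image


sample = EditSample(
    input_image="images/0001_input.png",
    task_description="Remove the background",
    target_image="images/0001_target.png",
)
print(sample.task_description)
```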
While acknowledging that the work is currently fundamental research, Meta outlined potential use cases, such as creating personalized animated stickers or GIFs for chat, editing photos without technical skills, animating static photos to enhance Instagram posts, or generating entirely new content.
The field of generative AI has evolved rapidly in recent years, moving from image generation to video generation. Meta, with existing generative AI models such as AudioCraft, SeamlessM4T, and Llama 2, has been among the companies driving this shift. Generative AI services, including Microsoft-backed OpenAI’s ChatGPT, have gained significant attention in the business world, with companies seeking new capabilities and ways to refine their processes.
Meta’s focus on AI has intensified as it competes with other tech giants like Microsoft, Google, and Amazon in this rapidly advancing space. You can learn more about Emu Video and Emu Edit by visiting the Emu Video project and Emu Edit project web pages.