Today, Meta announced Make-A-Video, an AI-powered video generator that can create novel video content from text or image prompts, similar to existing image synthesis tools like DALL-E and Stable Diffusion. It can also make variations of existing videos, though it's not yet available for public use.
On Make-A-Video's announcement page, Meta shows example videos generated from text, including "a young couple walking in heavy rain" and "a teddy bear painting a portrait." It also showcases Make-A-Video's ability to take a static source image and animate it. For example, a still photo of a sea turtle, once processed through the AI model, can appear to be swimming.
The key technology behind Make-A-Video—and why it has arrived sooner than some experts anticipated—is that it builds off existing work with text-to-image synthesis used with image generators like OpenAI's DALL-E. In July, Meta announced its own text-to-image AI model called Make-A-Scene.