At Meta Connect, Meta introduced Emu, a foundational image generation model that underpins several generative AI experiences, including AI image editing tools for Instagram and the imagine feature in Meta AI. Two new research models build on Emu: one performs image editing based solely on text instructions, and the other is a method for text-to-video generation.
Emu Video, which builds on the Emu model, offers a unified architecture for text-to-video generation based on diffusion models. It splits generation into two steps: first it generates an image conditioned on a text prompt, then it generates a video conditioned on both the text and the generated image. This split allows video generation models to be trained efficiently. Despite using just two diffusion models, the approach outperforms prior work, generating 512