Model Release
Google DeepMind Launches Gemini Omni Flash for Video Creation
Google DeepMind has launched Gemini Omni Flash, a new video creation model, available in the Gemini app, Google Flow, and YouTube Shorts.
Google DeepMind has introduced Gemini Omni Flash, a video creation model that generates high-quality videos from image, audio, video, and text inputs. Users can also edit videos with natural language instructions while the model keeps characters, physics, and scene continuity consistent. According to DeepMind, Gemini Omni Flash is the first model in the Omni family to roll out, and support for image and audio outputs is planned. Because the model reasons about real-world knowledge and physics, it can create more realistic scenes and blend knowledge with creativity to produce compelling visual explanations. Users can reference images, text, video, or audio to generate cohesive outputs, and DeepMind plans to expand audio input capabilities. *Source: [deepmind](https://deepmind.google/blog/introducing-gemini-omni/)*
Key points
- Google DeepMind introduced Gemini Omni Flash, a video creation model available in the Gemini app, Google Flow, and YouTube Shorts.
- Gemini Omni Flash allows users to generate high-quality videos grounded in real-world knowledge from various input types.
- Users can edit videos through natural language instructions while the model maintains scene continuity and physical consistency.
- The model draws on Gemini's understanding of physics, history, science, and cultural context to create realistic scenes.
- Gemini Omni Flash can generate compelling explainers from short prompts, breaking down complex ideas visually.
- The model can reference images, text, video, or audio to create cohesive outputs, and expanded audio input capabilities are planned.
- All videos created with Gemini Omni include an imperceptible SynthID digital watermark for content verification.