Google DeepMind has introduced Gemini Omni Flash, a video creation model that generates high-quality videos from image, audio, video, and text inputs. The model, part of the Omni family, lets users edit videos through natural language instructions while keeping characters, physics, and scenes consistent. According to DeepMind, Gemini Omni Flash is the first model in the Omni family to be rolled out, and support for image and audio outputs is planned for the future. The model's ability to reason about real-world knowledge and physics allows it to create more realistic scenes and to blend knowledge with creativity to produce compelling visual explanations. Users can also reference images, text, video, or audio to generate cohesive outputs, and DeepMind plans to expand audio input capabilities. *Source: [deepmind](https://deepmind.google/blog/introducing-gemini-omni/)*