Google DeepMind Unveils Breakthrough Generative Media Models and Tools
Google DeepMind has announced two new generative media models, Veo 3 and Imagen 4, along with a new AI filmmaking tool called Flow. The releases are aimed at giving artists and creators more capable tools for bringing their visions to life.
Veo 3: State-of-the-Art Video Generation
Veo 3 improves on the quality of its predecessor, Veo 2, and adds the ability to generate video with synchronized audio. This includes realistic sound effects, such as traffic noise in city scenes or birds singing in parks, and even dialogue between characters. The model also handles complex prompts well, so users can describe a short story and have it rendered as video. Veo 3 is now available to Ultra subscribers in the Gemini app and to enterprise users on Vertex AI.
Advancements in Veo 2
Google DeepMind has also added new capabilities to Veo 2, based on feedback from creators and filmmakers:
- Reference-powered video: Allows users to provide images of characters, scenes, or objects for better creative control.
- Camera controls: Enables precise camera movements such as rotations, dollies, and zooms.
- Outpainting: Extends the video frame from portrait to landscape, generating new content that fits the scene.
- Object add and remove: Allows users to add or erase objects from videos while maintaining realistic scale, interactions, and shadows.
These features are now available in Flow, with plans to integrate them into the Vertex AI API and other products in the coming months.
Flow: AI Filmmaking Tool
Flow is an AI filmmaking tool designed to work with the Veo, Imagen, and Gemini models. Creators can describe their shots in natural language, manage story elements in one place, and assemble their narrative into scenes. Flow is currently available to Google AI Pro and Ultra plan subscribers in the U.S., with expansion to other countries planned.
Imagen 4: Enhanced Image Generation
Imagen 4 focuses on image quality and typography. It can create images in a range of aspect ratios at up to 2K resolution, which makes it suitable for printing and presentations. The model handles both photorealistic and abstract styles and is significantly better at spelling and typography. Imagen 4 is available in the Gemini app, Whisk, Vertex AI, and across various Workspace applications.
Lyria 2: Advancing Music Creation
Lyria 2 offers powerful composition capabilities and room for open-ended exploration in music creation. It is now available to creators through YouTube Shorts and to enterprises in Vertex AI. Additionally, Lyria RealTime, which powers MusicFX DJ, supports interactive music generation and control in real time.
Responsible AI Innovation
Google DeepMind is committed to responsible AI development and works closely with the creative community so that its tools empower artists while minimizing potential misuse. All outputs from Veo 3, Imagen 4, and Lyria 2 are watermarked using SynthID. The newly launched SynthID Detector helps identify AI-generated content, promoting transparency and trust in AI-generated media.
These advancements mark a significant step forward in AI-driven creative tools, offering unprecedented possibilities for artists, filmmakers, musicians, and creators across various industries.