A few days ago, OpenAI caused a stir by integrating the GPT-4o image generator into ChatGPT, then making it available in its free tier. Today it is Google’s turn to respond, announcing the arrival of video generation directly in the Gemini conversational interface. After adding image creation capabilities a few weeks ago, the web giant continues to integrate its generative AI models into its services.
For now, this concerns only subscribers to the Advanced plan, who can now access Veo 2, the latest video generation model unveiled by the Mountain View firm, to bring their ideas to life as short animated sequences.
Veo 2: the AI engine behind video creation in Gemini
At the heart of this new feature is Veo 2, which Google presents as an advanced video model. The company highlights its ability to generate highly detailed videos with cinematic realism, in high resolution, although output is limited to 720p in Gemini for the moment. According to Google, Veo 2 benefits from a better understanding of real-world physics and human movement. This is said to translate into smoother character motion and more natural-looking scenes, with fine visual detail across a variety of subjects and styles. The model's performance is also said to rest in part on its ability to interpret the nuances of the text descriptions the user provides.
Turning text into video: how it works
Veo 2 is easy to reach in Gemini Advanced. To start creating a video, the user selects “Veo 2” from the drop-down menu used to choose the AI model. Then it is simply a matter of writing a text prompt describing the desired scene. Google stresses the importance of precision in this description: the more detailed the prompt, the more control the user has over the final result. Whether it is a short story, an abstract visual concept or a specific scene, Gemini then takes care of translating the words into moving images.
The Veo 2 engine is accessible from the same menu as the other Gemini AI models. © Numériques
The result is a video clip with a fixed length of eight seconds. The resolution is currently 720p, an HD standard, but not Full HD or 4K. The format is 16:9 landscape, suitable for most distribution platforms. The generated video is delivered as an MP4 file, a widely compatible format. Note that a monthly quota limits the number of videos that can be created. Google says a notification will be sent as the limit approaches, but does not specify the exact number of creations allowed per month for Gemini Advanced subscribers.
To create a video, simply type your description into Gemini's usual prompt window. © Numériques
Having had access to the Veo 2 model in Gemini Advanced slightly ahead of time, we tried it out briefly, and the first results are quite satisfactory. Unlike Sora (OpenAI), which regularly has us tearing our hair out because it tends not to follow instructions, Veo 2 was fairly obedient across the few prompts we tested. These are only first impressions, however.
The generated video follows the initial request fairly well. © Numériques
A gradual rollout
The rollout of this video generation feature has begun and will continue gradually over the coming weeks for all Gemini Advanced subscribers worldwide, according to Google. It will be available both in the web version of Gemini and in the mobile apps. Google specifies that the tool will work in all languages supported by Gemini.
Finally, in response to the very legitimate concerns surrounding AI-generated content, Google says it has taken safety measures. The company mentions having carried out “red teaming” phases (adversarial tests simulating attacks or malicious uses) and in-depth evaluations to prevent the generation of content that violates its usage policies (violence, disinformation, explicit content, etc.). In addition, each video produced with Veo 2 will carry an invisible digital watermark called SynthID. This embedded “watermark” is meant to reliably indicate that the video was generated by an artificial intelligence. It is clearly a transparency measure intended to combat “deepfakes” and disinformation.
Gemini refused to generate a video featuring public “personalities”. © Numériques
For our part, we tried to generate a video featuring Michael Jordan and Son Goku, but Gemini would have none of it. Sora, on the other hand, was far less strict, even if the result is hardly stunning.
Sora is far less strict than Gemini when it comes to depicting well-known characters. © Numériques
Whisk gets animated too, but still not in France
Besides the integration into Gemini Advanced, Google is also making Veo 2 accessible through another channel, Whisk. This experiment, hosted within Google Labs, already made it possible to create images from text prompts and existing images. It now gains a “Whisk Animate” feature.
This new feature is, however, reserved for Google One AI Premium subscribers, a subscription distinct from Gemini Advanced. Here again, the Veo 2 engine is used to turn the images generated in Whisk into eight-second animated clips. The approach is therefore different: Whisk starts from an image to arrive at a video, while Gemini starts from text.
There is no need to rush, though: Whisk is not currently available in France. French users will therefore have to wait before they can test this alternative.