Revolution in artificial intelligence: Genie 3 can create virtual worlds

Genie 3, still in the research phase and not yet publicly available, builds on the capabilities of its predecessor, Genie 2, and of DeepMind's video generation model, Veo 3. The new AI model can create interactive 3D environments that last several minutes, rendered at 720p resolution and 24 frames per second. This is a significant leap forward from the 10-20 second scenes Genie 2 could produce.

The most striking aspect of the model is its ability to recall what it has previously generated and, based on this, decide how events should logically unfold. DeepMind emphasizes that this consistency isn't explicitly programmed; the model learned it on its own. "We didn't train this model to mimic the real world, making decisions according to the laws of physics," says DeepMind researcher Shlomi Fruchter. "It learned this consistency on its own."

Genie 3 can change scenes based on user commands. For example, when an AI character in the middle of a warehouse is given a command like "approach the green trash compactor" or "walk toward the red forklift," Genie 3 generates the environment and the character acts within it. In DeepMind's tests, the character completed these tasks successfully.

However, Genie 3 has its limitations. For example, in a scene of a skier going downhill, the snow did not react realistically. Furthermore, the model cannot yet adequately simulate the complex interactions of multiple independent characters, and simulations limited to a few minutes fall short of the hours that training would require.

Still, experts say this technology brings AI one step closer to imitating distinctly human behaviors such as planning, exploring an environment, navigating the unknown, and improving through experience.