
- The perception module. This module takes raw sensory inputs such as images, video and proprioception and encodes them into a compact latent representation of the environment.
- The prediction module. This is a dynamics model that handles probability distributions and captures causality and temporal structure. It probabilistically predicts the next latent state and the expected results of any actions.
- The planning (control) module. This module uses the output of the prediction module to simulate future trajectories and select actions that optimize progress toward a goal.
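
To make the division of labor concrete, here is a minimal Python sketch of the three modules working in a loop. It is a toy under stated assumptions, not how any production world model is built: the environment is a single number on a line, and the function names (`perceive`, `predict`, `plan`, `run_episode`), the action set, the fixed uncertainty value and the exhaustive search over short action sequences are all hypothetical choices made for illustration.

```python
from itertools import product
from typing import Sequence, Tuple

ACTIONS = (-1.0, 0.0, 1.0)   # assumed toy action set: step left, stay, step right
PREDICTION_STD = 0.1         # assumed fixed uncertainty of the toy dynamics model


def perceive(raw_observation: Sequence[float]) -> float:
    """Perception: compress raw sensor readings into a compact latent state.

    Here the "latent" is just the average of the readings.
    """
    return sum(raw_observation) / len(raw_observation)


def predict(latent: float, action: float) -> Tuple[float, float]:
    """Prediction: return a Gaussian (mean, std) over the next latent state."""
    return latent + action, PREDICTION_STD


def plan(latent: float, goal: float, horizon: int = 3) -> float:
    """Planning: simulate every action sequence and return the best first action.

    Rollouts follow the predicted mean only and ignore the std, which keeps
    the sketch deterministic. Cost is the summed distance to the goal.
    """
    best_cost = float("inf")
    best_first_action = 0.0
    for sequence in product(ACTIONS, repeat=horizon):
        state, cost = latent, 0.0
        for action in sequence:
            state, _std = predict(state, action)
            cost += abs(state - goal)
        if cost < best_cost:
            best_cost, best_first_action = cost, sequence[0]
    return best_first_action


def run_episode(goal: float = 3.0, steps: int = 6) -> float:
    """Run the perceive -> predict -> plan loop in a noise-free toy world."""
    true_position = 0.0
    for _ in range(steps):
        raw = [true_position] * 4          # four identical, noise-free "sensors"
        latent = perceive(raw)
        true_position += plan(latent, goal)
    return true_position


if __name__ == "__main__":
    print(run_episode())
```

In this sketch the planner never touches the real environment while it deliberates. It only queries `predict`, which is the point of having a world model: candidate futures are tried out internally before a single action is committed.
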
“At its core, a world model is an internal representation that an AI system constructs to simulate the external environment. By continuously processing sensory data, a robot builds a dynamic blueprint of its surroundings,” explains Aurorain founder Luhui Hu. “This fusion of perception, prediction and planning mirrors cognitive processes in humans, setting the stage for more advanced robotic behavior.”
World models open up immense possibilities
There seem to be almost no limits to the potential waiting within world models, even if we set aside AGI aspirations for the moment. Here are just a few of the many ways world models could impact our lives.
Immersive visual experiences
With world models, it is finally becoming possible to build convincing worlds that you can interact with and experience. These are the very first capabilities coming online, thanks to models like those developed by Decart, which can even be used as playable simulations that need no game engine.

