What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
🔘 Real-time capabilities: Genie 3 is our first world model to allow live interaction, while also improving consistency and realism compared to Genie 2. It can generate dynamic worlds at 720p and 24 FPS, with each frame created in response to user actions.
🔘 Long-horizon consistency: Environments remain largely consistent over several minutes, with visual memory extending as far as 1️⃣ minute into the past. This ability is critical for enabling AI agents to learn about the world, and gives humans an immersive experience.
🔘 Promptable world events: Beyond navigation, users can insert text prompts to alter the world in real time, such as changing the weather ⛅ or introducing new characters 👤. This unlocks a new level of dynamic interaction.
🔘 Accelerating agent research: To explore the potential for agent training, we placed our SIMA agent in a Genie 3 world with a goal. The agent acts, and Genie 3 simulates a response in the world without knowing the objective. This is key for building more capable embodied agents. 💡
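For readers who want to picture the agent setup described above, here is a minimal toy sketch of that loop. It is not Genie 3 or SIMA code: `ToyWorld`, `greedy_agent`, and `run_episode` are hypothetical names invented for illustration, and a 2D grid stands in for a generated world. The one point it tries to show is the separation of roles: the goal lives with the agent, while the world only ever receives actions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Position = Tuple[int, int]

# Hypothetical action set for the toy grid (not a Genie 3 interface).
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, 1), "down": (0, -1)}


@dataclass
class ToyWorld:
    """Toy stand-in for a world model: it receives actions, never the goal."""
    position: Position = (0, 0)

    def step(self, action: str) -> Position:
        # Simulate the next state purely from the current state and the action.
        dx, dy = MOVES[action]
        x, y = self.position
        self.position = (x + dx, y + dy)
        return self.position


def greedy_agent(goal: Position) -> Callable[[Position], str]:
    """Toy agent policy: the goal is held on the agent side only."""
    def policy(observation: Position) -> str:
        x, y = observation
        gx, gy = goal
        if x != gx:
            return "right" if gx > x else "left"
        return "up" if gy > y else "down"
    return policy


def run_episode(world: ToyWorld, policy: Callable[[Position], str],
                goal: Position, max_steps: int = 50) -> List[Position]:
    """Evaluation harness: it checks the goal, but never passes it to the world."""
    trajectory = [world.position]
    for _ in range(max_steps):
        if world.position == goal:
            break
        action = policy(world.position)        # the agent acts
        trajectory.append(world.step(action))  # the world simulates a response
    return trajectory


trajectory = run_episode(ToyWorld(), greedy_agent((2, 1)), goal=(2, 1))
print(trajectory)
```

Note that `ToyWorld.step` takes only an action. Under this toy assumption, the world cannot shortcut the task on the agent's behalf, which is the property the post highlights.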
🔘 Real-world applications: Genie 3 offers a glimpse into new forms of entertaining or educational generative media. Imagine seeing life through the eyes of a dinosaur 🦖, exploring the streets of ancient Greece 🏛, or learning how search and rescue efforts are planned. 🚁
World models are a key stepping stone on the path to AGI, promising unlimited rich simulations for training AI agents. Genie 3 represents a significant leap forward in making this a reality. We’re providing early access to a small cohort of academics and creators, while exploring how we can make it available to more trusted testers in the future.