Minecraft is not only the best-selling video game in history; it is also emerging as a tool for developing adaptable artificial intelligence (AI) models that can handle diverse tasks, much as humans do.
Researchers led by Steven James at the University of the Witwatersrand in South Africa have introduced MinePlanner, a benchmark set within Minecraft and designed to evaluate the general intelligence of AI models. MinePlanner assesses an AI’s capacity to filter out irrelevant details while solving intricate, multi-step problems.
Unlike many AI training approaches, which give models data precisely tailored to a specific task, MinePlanner makes AI grapple with messy, real-world problems. James stresses that tackling messy problems matters for the development of artificial general intelligence (AGI), and that future AI models will need to navigate complexities similar to those encountered in the game.
MinePlanner comprises 15 construction problems, each offered at easy, medium, and hard settings, for a total of 45 tasks. The tasks require the AI to take intermediate steps, such as building stairs to reach a specific block height, so it must step back and plan before it can solve the problem. In trials with the state-of-the-art planners ENHSP and Fast Downward, both designed to find a sequence of operations that reaches a goal, neither could solve the hard problems. Fast Downward solved only one medium problem and five easy problems, while ENHSP fared better, completing all but one easy problem and all but two medium problems.
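For readers who want the results as totals, the figures above can be tallied in a few lines of Python. This is only an illustrative sketch: the variable names are made up for this example, and the hard-setting counts are set to zero on the assumption that neither planner solved any problem at that level, which is how the reported outcome reads.

```python
# Tally of the reported MinePlanner results.
# Assumption: zero problems solved at the hard setting for both planners,
# since neither is reported to have handled the most complicated problems.
PROBLEMS_PER_LEVEL = 15
LEVELS = ("easy", "medium", "hard")

solved = {
    "Fast Downward": {"easy": 5, "medium": 1, "hard": 0},
    "ENHSP": {
        "easy": PROBLEMS_PER_LEVEL - 1,    # all but one easy problem
        "medium": PROBLEMS_PER_LEVEL - 2,  # all but two medium problems
        "hard": 0,
    },
}

# 15 problems at each of 3 settings
total_tasks = PROBLEMS_PER_LEVEL * len(LEVELS)


def summarize(counts):
    """Return (tasks solved, fraction of all tasks solved) for one planner."""
    total = sum(counts[level] for level in LEVELS)
    return total, total / total_tasks


for planner, counts in solved.items():
    total, rate = summarize(counts)
    print(f"{planner}: {total}/{total_tasks} tasks solved ({rate:.0%})")
```

On these assumptions, Fast Downward solved 6 of the 45 tasks and ENHSP solved 27.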
James points to the challenge of avoiding a scenario in which a human designer must dictate exactly what an AI should prioritize for every potential task. The goal is to develop AI models that work out for themselves what is relevant and what can be ignored when solving a problem. MinePlanner aims to steer research toward more robust AI systems that can address complex, messy challenges autonomously.