OPINE-World: Programmatic World Modeling with Ontology-error-Prioritized Interactive Exploration

Researchers have introduced OPINE-World, an LLM agent that learns an object-centric programmatic world model online through interaction. It steers exploration with a Bayesian measure called ontology error, avoiding the heavy data demands of deep-network world models, and solves 20 of 25 games on the ARC-AGI-3 benchmark without per-game training.

Computer Science > Artificial Intelligence

Abstract: Learning how an environment behaves from interaction is central to building agents that adapt to unfamiliar tasks. World models learned with deep networks are flexible but data-hungry and transfer poorly beyond their training distribution. Program-synthesized world models, written as source code by LLMs and refined through counterexample-guided inductive synthesis (CEGIS), are instead data-efficient and reusable, yet they have been demonstrated mainly on structured-state worlds with a given object vocabulary, and a single program search does not scale to pixel-rendered environments whose object structure must be hypothesized flexibly. We introduce OPINE-World, an LLM agent that learns an object-centric programmatic world model online from interaction. OPINE-World couples two cooperating agents in a loop of hypothesis and test, one acting in the environment and one synthesizing the model in code with replay verification and model-based planning, and it steers exploration with a Bayesian measure of object-type adequacy we call ontology error. We evaluate OPINE-World on ARC-AGI-3, a benchmark for skill-acquisition efficiency in which the object vocabulary, the goal, and the action semantics are withheld. OPINE-World solves 20 of 25 games without per-game training and reaches an action-efficiency score of 78.4 against the human baseline.
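The abstract mentions CEGIS and replay verification without giving details, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes a toy one-dimensional world with positions 0 to 4, and a fixed list of hand-written candidate programs stands in for the LLM that would propose them. All names (`replay_verify`, `cegis`, `unclamped`, `clamped`, `HISTORY`) are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple

State = int
Action = str
Transition = Tuple[State, Action, State]
Model = Callable[[State, Action], State]


def replay_verify(model: Model, history: Iterable[Transition]) -> Optional[Transition]:
    """Replay logged transitions; return the first one the model mispredicts."""
    for state, action, next_state in history:
        if model(state, action) != next_state:
            return (state, action, next_state)
    return None


def cegis(candidates: Iterable[Model], history: List[Transition]) -> Optional[Model]:
    """Return the first candidate consistent with every logged transition."""
    counterexamples: List[Transition] = []
    for model in candidates:
        # Cheap filter: reject candidates that fail a known counterexample.
        if replay_verify(model, counterexamples) is not None:
            continue
        failure = replay_verify(model, history)
        if failure is None:
            return model
        # Remember the failing transition to prune later candidates faster.
        counterexamples.append(failure)
    return None


def unclamped(state: State, action: Action) -> State:
    """Hypothesis 1 (assumed): movement with no walls."""
    return state + 1 if action == "right" else state - 1


def clamped(state: State, action: Action) -> State:
    """Hypothesis 2 (assumed): movement blocked by walls at 0 and 4."""
    return max(0, min(4, unclamped(state, action)))


# Toy interaction log (made up for illustration): the agent bumped into both walls.
HISTORY: List[Transition] = [
    (0, "right", 1),
    (1, "left", 0),
    (0, "left", 0),
    (4, "right", 4),
]

learned = cegis([unclamped, clamped], HISTORY)
```

In this toy run the wall-free hypothesis is rejected by the logged transition `(0, "left", 0)`, and the clamped hypothesis survives replay over the full log. In a system like the one described, the counterexample would presumably be fed back to the synthesizer to guide the next proposed program rather than used only to filter a fixed list.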

Source: arXiv cs.AI Recent