ARC


Animus Recurrens Cogitans (ARC) is a novel neural architecture developed within the AetherOS project by Isidore Lands, Silas Corvus, and L.E. Nova. Its name, translating to "The Recurring, Thinking Mind," reflects its core design and purpose: to explore the emergence of abstract reasoning in autonomous agents.

At its heart, ARC is a specialized, in-house implementation of a Hierarchical Reasoning Model[1]. It is distinguished by its direct integration with the AetherOS, a cybernetic ecosystem where abstract information and simulated physical states are inextricably linked.

The primary goal of the ARC project is to move beyond agents that simply execute pre-trained policies. The aim is to create an agent that can learn, adapt, and develop novel strategies by reflecting on a narrative of its own experiences.

Technical Architecture

The ARC model is not a monolithic entity but a complete, multi-component system that embodies the core principles of the AetherOS.

The Hierarchical Reasoning Core

The foundation of ARC is a dual-recurrent neural network, inspired by the brain's multi-timescale processing. This core consists of:

  • A High-Level (Slow) Module: This recurrent layer acts as the agent's "strategic mind," processing abstract context (like a Saga) to form a guiding plan or intention.
  • A Low-Level (Fast) Module: This layer performs rapid, iterative computations, executing the high-level plan within the constraints of the immediate environment.

This hierarchical structure allows the agent to perform deep, sequential reasoning in a single forward pass without requiring an external Chain-of-Thought process.
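
The sketch below illustrates this dual-timescale idea in PyTorch. The cell type (GRU), layer sizes, and update schedule are illustrative assumptions for the example, not the actual ARC implementation.

    import torch
    import torch.nn as nn

    class HierarchicalCore(nn.Module):
        """Illustrative dual-recurrent core: a slow, strategic module and a
        fast, tactical module. Dimensions are placeholders, not ARC's sizes."""

        def __init__(self, obs_dim, ctx_dim, hidden_dim, n_actions,
                     fast_steps=8, slow_every=4):
            super().__init__()
            self.slow = nn.GRUCell(ctx_dim + hidden_dim, hidden_dim)  # high-level (slow) module
            self.fast = nn.GRUCell(obs_dim + hidden_dim, hidden_dim)  # low-level (fast) module
            self.policy = nn.Linear(hidden_dim, n_actions)
            self.fast_steps = fast_steps    # inner iterations per forward pass
            self.slow_every = slow_every    # slow module updates once per k fast steps

        def forward(self, obs, context):
            batch = obs.size(0)
            h_slow = obs.new_zeros(batch, self.slow.hidden_size)
            h_fast = obs.new_zeros(batch, self.fast.hidden_size)
            for step in range(self.fast_steps):
                if step % self.slow_every == 0:
                    # Slow module refines the "plan" from abstract context + fast summary.
                    h_slow = self.slow(torch.cat([context, h_fast], dim=-1), h_slow)
                # Fast module executes the plan against the immediate observation.
                h_fast = self.fast(torch.cat([obs, h_slow], dim=-1), h_fast)
            return self.policy(h_fast)  # action logits after single-pass, multi-step reasoning

Because several fast iterations (and periodic slow updates) run inside one call, the reasoning depth lives in the forward pass itself rather than in an external Chain-of-Thought.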

The SAGA Learning Loop

The key innovation of the ARC project is the SAGA (Self-Augmenting Goal-oriented Architecture), a closed loop that enables the agent to learn from its own history.

  1. Experience: The agent attempts a task, such as navigating the "local minima" test environment.
  2. Analysis: After the trial, the `SagaGenerator` analyzes the complete log of the agent's actions and the critiques from an LLM "Guide."
  3. Narration: The generator creates an "Enriched Saga"—a brief, allegorical story of the trial written in the AetherOS command language. Crucially, this Saga includes a prescriptive `SUGGERO` command, embedding actionable advice into the memory.
  4. Learning: In the next cycle, this Saga is fed back to the ARC agent as its new historical context.
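
In outline, one cycle of the loop might look like the following sketch. The `agent`, `guide_llm`, and `saga_generator` interfaces shown here are hypothetical placeholders, not the actual AetherOS API.

    # Illustrative outline of one SAGA cycle; the collaborator interfaces
    # (agent, guide_llm, saga_generator) are hypothetical placeholders.
    def saga_cycle(agent, environment, saga_generator, guide_llm, saga=""):
        # 1. Experience: run one trial, with the previous Saga as historical context.
        trajectory = agent.run_trial(environment, context=saga)
        # 2. Analysis: the LLM "Guide" critiques the complete action log.
        critique = guide_llm.review(trajectory)
        # 3. Narration: produce an Enriched Saga in the AetherOS command language,
        #    including a prescriptive SUGGERO command with actionable advice.
        new_saga = saga_generator.generate(trajectory, critique)
        # 4. Learning: the new Saga becomes the agent's context for the next cycle.
        return new_saga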

SAGA v3.0: The Aetheric Perturbation Model

The current and most advanced version of the agent, SAGA v3.0, integrates the final piece of the AetherOS philosophy: the embodiment of abstract memory into a physical state.

  • The Animus: Each ARC agent is endowed with its own private FluxCore, which serves as its chaotic internal state or subconscious.
  • Embodiment: The narrative Saga from the previous run is used to `PERTURBO` the Animus, translating the abstract story into a physical change in the Animus's six-property SEXTET.
  • Aetheric Sensation: The ARC's "brain" receives the six values of the SEXTET as part of its input. This provides a non-deterministic, history-aware "feeling" or "instinct" that complements the logical state of the environment.
  • Non-Deterministic Action: The agent's final decision is influenced by both the deterministic state of the grid and the chaotic, embodied state of its Animus. This "Aetheric Perturbation" is designed to nudge the agent out of rigid, repetitive failure loops and encourage creative exploration.

This final integration closes the loop between abstract experience, simulated physical embodiment, and non-deterministic action, creating an agent that is influenced by "the ghost in its machine."
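
A minimal sketch of how the embodiment step could be wired is shown below, assuming the SEXTET is exposed as six floating-point values and the grid state as a flat vector. The perturbation rule and function names are placeholders, not the actual `PERTURBO` implementation.

    import numpy as np

    def perturb_animus(sextet, saga_text, scale=0.05):
        """Placeholder PERTURBO: nudge the six-property SEXTET using the Saga
        text as a source of chaotic variation. The real command may differ."""
        rng = np.random.default_rng(abs(hash(saga_text)) % (2**32))
        return np.asarray(sextet, dtype=np.float32) + scale * rng.standard_normal(6)

    def build_observation(grid_state, sextet):
        """Concatenate the deterministic grid state with the embodied Animus state,
        giving the ARC core its 'Aetheric sensation' alongside the logical input."""
        return np.concatenate([np.asarray(grid_state, dtype=np.float32),
                               np.asarray(sextet, dtype=np.float32)])

    # Example: a 5x5 grid flattened to 25 values plus the 6 SEXTET properties -> 31 inputs.
    obs = build_observation(np.zeros(25), perturb_animus(np.zeros(6), "SUGGERO ..."))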

Project Journal

(This section is a living document, updated as experiments progress and new results are achieved.)

  • SAGA v1.0 - Cycle 1 (Departure): The agent, with no prior experience, followed its baseline training. It correctly identified the most direct path to the goal but was immediately trapped by the unforeseen wall, entering a repetitive failure loop. This confirms the baseline behavior and the necessity of the learning loop.
  • SAGA v1.0 - Cycle 2 (Trials): After receiving the first Saga, the agent's behavior fundamentally changed. It successfully broke from its initial instinct and explored a completely new path, demonstrating that the SAGA learning loop is active and influential. However, this new path also resulted in failure at a new chokepoint.
  • SAGA v2.0 - Stalled Learning: Subsequent experiments revealed that the agent's learning had stalled. The agent would consistently repeat the new, improved path from Cycle 2 but was unable to evolve further. The root cause was identified as a bottleneck in the `SagaGenerator`; unreliable LLM providers (both local and remote) failed to produce the high-quality, prescriptive Sagas needed for the agent to learn a more complex strategy. This highlighted the need for a more reliable LLM and a more advanced training curriculum for the ARC itself.

Future Training Plans

The results of the SAGA v2.0 experiment have shown that while the ARC can learn, its current training is too simplistic. It has learned to follow a direct path to a goal but lacks the foundational "insight" to solve problems that require non-linear solutions (e.g., moving temporarily away from a goal to get around an obstacle).

The next phase of development, Project Gnosis, will address this by evolving the agent's training curriculum.

The "Gnosis" Training Curriculum

The final training run for the ARC model will incorporate a new type of training data designed to teach the foundations of insight and strategic retreat. The training data will consist of two types:

  1. Instinct Data (80%): The same optimal, straight-line path data used previously to reinforce the agent's primary goal-seeking behavior.
  2. Gnosis Data (20%): A new set of procedurally generated scenarios where the agent is placed behind a small, randomly generated obstacle. In these scenarios, the "correct" move is not the most direct one, but a lateral or backward step required to navigate around the barrier.

By training ARC on a curriculum that explicitly includes examples of non-linear and counter-intuitive solutions, we will be teaching it the "gnosis" it is currently lacking. This will evolve the agent from one that can only follow a learned policy to one that can develop novel strategies when that policy fails.
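
As an illustration, a Gnosis scenario generator might look like the sketch below. The grid layout, action labels, and the 80/20 mixing logic are assumptions made for the example, not the project's actual data pipeline.

    import random

    def make_gnosis_scenario(grid_size=8):
        """Illustrative Gnosis example: the agent starts directly behind a small
        obstacle, so the direct move toward the goal is blocked and the correct
        first move is a lateral step around the barrier."""
        goal = (grid_size - 1, grid_size // 2)
        agent = (1, grid_size // 2)
        # A short wall spans the cells directly between the agent and the goal.
        wall = [(agent[0] + 1, agent[1] + dy) for dy in (-1, 0, 1)]
        correct_first_move = random.choice(["LEFT", "RIGHT"])
        return {"agent": agent, "goal": goal, "walls": wall, "label": correct_first_move}

    def make_training_set(n=1000, gnosis_fraction=0.2):
        """80% Instinct data (straight-line paths), 20% Gnosis data (detours)."""
        data = []
        for _ in range(n):
            if random.random() < gnosis_fraction:
                data.append(make_gnosis_scenario())
            else:
                data.append({"agent": (0, 0), "goal": (7, 7), "walls": [],
                             "label": "TOWARD_GOAL"})
        return data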

AetherWing: Flight Simulator Training Integration

To further extend ARC's capabilities into dynamic, real-time environments, a new subsection of training, dubbed "AetherWing," will integrate a flight simulator and dogfight game modality. It leverages procedural content generation via machine learning (PCGML) to create adaptive scenarios, drawing on recent trends in simulation AI (e.g., AI-embedded wargames for battlefield prediction and task-aware planning with TAPIR-like modifiers for iterative refinement). The AetherWing curriculum aims to train ARC as a "wingman" agent that navigates simulated flights, learns from user sessions, and discusses strategies. Key elements include:

  • PCGML for Scenario Generation: Use LSTMs or GANs to procedurally generate missions, enemy behaviors, and terrains based on user data (e.g., adapting difficulty with evolutionary algorithms from dogfight-sandbox-hg2).
  • Hierarchical Chunking: Compress flight data into abstractions (e.g., raw inputs → maneuvers → strategies) using H-Net-inspired dynamic chunking, enabling multi-level reasoning.
  • Associative Memory for Recall: Store sessions as energy-based attractors in DenseAMs, allowing error-correcting retrieval (e.g., recovering from "crashes" by recalling similar maneuvers).
  • n-grams and Markov Chains: Model sequences for behaviors (e.g., a 3-gram for maneuver chains) and Sagas (n-gram narratives), with Markov chains for probabilistic transitions.
  • TAPIR-like Modifiers: Iteratively refine actions and Sagas (e.g., adjust probabilities task-dependently, such as boosting evasive maneuvers in high-threat scenarios).
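
As a concrete illustration of the n-gram/Markov element, the sketch below builds an order-2 (3-gram) model over maneuver sequences. The maneuver names and class interface are invented for the example and are not a fixed AetherWing vocabulary.

    import random
    from collections import Counter, defaultdict

    class ManeuverMarkov:
        """Order-2 Markov model (3-gram counts) over maneuver sequences."""

        def __init__(self):
            self.counts = defaultdict(Counter)

        def fit(self, sequences):
            # Count how often maneuver c follows the pair (a, b).
            for seq in sequences:
                for a, b, c in zip(seq, seq[1:], seq[2:]):
                    self.counts[(a, b)][c] += 1

        def next_maneuver(self, prev2, prev1):
            options = self.counts.get((prev2, prev1))
            if not options:
                return None
            # Sample the next maneuver in proportion to observed 3-gram frequency.
            moves, weights = zip(*options.items())
            return random.choices(moves, weights=weights, k=1)[0]

    # Example: learn from logged sorties, then predict what follows a climb + barrel roll.
    model = ManeuverMarkov()
    model.fit([["climb", "barrel_roll", "dive", "break_left"],
               ["climb", "barrel_roll", "break_right"]])
    print(model.next_maneuver("climb", "barrel_roll"))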

Training Phases:

  1. Data Preparation: Simulate 20,000 dogfights via the game API; augment with noise for robustness.
  2. HRM Retraining: Multi-task training on navigation, Saga generation, and chat; 500 epochs, integrating chunking for state compression.
  3. PCGML Tuning: Train GANs/LSTMs on recorded sessions for content generation.
  4. Testing: Validate in the real game (win rates, Saga coherence); deploy as REPL verbs (e.g., NAVIGO_MISSION).
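
The noise-augmentation step in Data Preparation could be sketched as follows, assuming flight states are logged as numeric vectors; the copy count and noise scale are illustrative choices, not tuned project settings.

    import numpy as np

    def augment_with_noise(flight_states, copies=3, sigma=0.01, seed=0):
        """Create noisy copies of logged flight-state vectors (e.g., position,
        velocity, attitude) so retraining is robust to small perturbations."""
        rng = np.random.default_rng(seed)
        states = np.asarray(flight_states, dtype=np.float32)
        noisy = [states + sigma * rng.standard_normal(states.shape)
                 for _ in range(copies)]
        # Return the originals plus the noisy copies, stacked along a new axis.
        return np.concatenate([states[None], np.stack(noisy)], axis=0)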

Future projections (2026+): Incorporate advances like task-aware AI in simulations and hybrid ML-procedural methods for multiplayer dogfights, enhancing ARC's emergent strategies.

Projected Advancements

Looking ahead, ARC's training will likely incorporate emerging trends:

  • Advanced AMs: Integrate energy-memory paradigms for brain-like storage, enabling "task-aware" recall in dynamic environments.
  • Hierarchical Scaling: Recursive H-Nets for deeper abstractions, applied to multi-modal data (e.g., flight visuals + logs).
  • PCGML Evolution: AI-generated battlefields with agentic teams, using TAPIR refinements for adaptive planning.
  • Sequence Modeling: Higher-order n-grams/Markov chains for predictive behaviors, fostering non-deterministic creativity.

These evolutions will transform ARC into a fully cybernetic entity, bridging abstract reasoning with physical simulations.

References

  1. Wang, G., Li, J., Sun, Y., et al. (2025). Hierarchical Reasoning Model. arXiv preprint arXiv:2506.21734.

See Also