By training ARC on a curriculum that explicitly includes examples of non-linear and counter-intuitive solutions, we will be teaching it the "gnosis" it is currently lacking. This will evolve the agent from one that can only follow a learned policy to one that can develop novel strategies when that policy fails.
=== AetherWing: Flight Simulator Training Integration ===
To further expand ARC's capabilities into dynamic, real-time environments, a new subsection of training, dubbed "AetherWing," will integrate a flight-simulator and dogfight-game modality. This track leverages procedural content generation via machine learning (PCGML) to create adaptive scenarios, drawing on recent trends in simulation AI (e.g., AI-embedded wargames for battlefield prediction and task-aware planning with TAPIR-like modifiers for iterative refinement).
The AetherWing curriculum aims to train ARC as a "wingman" agent, navigating simulated flights, learning from user sessions, and discussing strategies. Key elements include: | |||
* PCGML for Scenario Generation: Use LSTMs or GANs to procedurally generate missions, enemy behaviors, and terrains based on user data (e.g., adapting difficulty with evolutionary algorithms from dogfight-sandbox-hg2).
* Hierarchical Chunking: Compress flight data into abstractions (e.g., raw inputs → maneuvers → strategies) using H-Net-inspired dynamic chunking, enabling multi-level reasoning (see the chunking sketch after this list).
* Associative Memory for Recall: Store sessions as energy-based attractors in DenseAMs, allowing error-correcting retrieval (e.g., recover from "crashes" by recalling similar maneuvers); see the recall sketch after this list.
* n-grams and Markov Chains: Model sequences for behaviors (e.g., a 3-gram model for maneuver chains) and sagas (n-gram narratives), with Markov chains providing probabilistic transitions; see the maneuver-sampling sketch after this list.
* TAPIR-like Modifiers: Iteratively refine actions and sagas by adjusting probabilities task-dependently (e.g., boosting evasive maneuvers in high-threat scenarios); the same sampling sketch shows a simple threat-dependent reweighting.
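The following is a minimal sketch of the raw inputs → maneuvers → strategies idea behind hierarchical chunking. It is not H-Net itself: it only splits a telemetry sequence where the cosine similarity between consecutive frames drops, then summarizes each chunk by its mean so the same operation can be applied recursively. The function names, threshold, and toy data are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np

def chunk(seq, threshold=0.5):
    """Split a sequence of frame vectors wherever consecutive similarity drops."""
    boundaries = [0]
    for i in range(1, len(seq)):
        a, b = seq[i - 1], seq[i]
        sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
        if sim < threshold:          # low similarity -> start a new chunk
            boundaries.append(i)
    boundaries.append(len(seq))
    return [seq[s:e] for s, e in zip(boundaries, boundaries[1:])]

def summarize(chunks):
    """Represent each chunk by its mean vector (the next abstraction level)."""
    return np.stack([c.mean(axis=0) for c in chunks])

# Toy data: three simulated flight phases, each clustered around a different direction.
rng = np.random.default_rng(2)
phases = [rng.normal(loc=2.0 * np.eye(8)[k], scale=0.3, size=(60, 8)) for k in range(3)]
raw = np.concatenate(phases)              # raw inputs (180 frames, 8 channels)
maneuvers = summarize(chunk(raw))         # roughly one vector per flight phase
strategies = summarize(chunk(maneuvers))  # same operation applied one level higher
print(len(maneuvers), len(strategies))
</syntaxhighlight>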
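Below is a minimal sketch of the error-correcting recall described for the associative-memory element, assuming stored sessions have already been encoded as fixed-length state vectors. The update rule is the standard dense-associative-memory (modern Hopfield) softmax step, not ARC's actual storage code; function names, dimensions, and the toy data are illustrative.

<syntaxhighlight lang="python">
import numpy as np

def denseam_recall(memories, query, beta=8.0, steps=3):
    """Retrieve the stored flight state closest to a (noisy) query.

    memories: (N, d) array of stored maneuver/state vectors (the attractors).
    query:    (d,) possibly corrupted state, e.g. telemetry just before a crash.
    beta:     inverse temperature; larger values give sharper, more literal recall.
    """
    x = query.astype(float)
    for _ in range(steps):
        # Softmax attention over stored patterns acts as an energy-descent step
        # toward the nearest attractor.
        scores = beta * memories @ x
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        x = weights @ memories
    return x

# Toy usage: three stored maneuvers, query is a noisy copy of the second one.
rng = np.random.default_rng(0)
stored = rng.normal(size=(3, 16))
noisy = stored[1] + 0.3 * rng.normal(size=16)
recalled = denseam_recall(stored, noisy)
print(np.argmax(stored @ recalled))  # -> 1, the original maneuver is recovered
</syntaxhighlight>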
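And a minimal sketch of the 3-gram maneuver model with a TAPIR-like, task-dependent modifier. TAPIR itself is only approximated here as a simple reweighting of transition counts when the threat level is high; the maneuver names, threat threshold, and boost factor are all illustrative assumptions.

<syntaxhighlight lang="python">
from collections import Counter, defaultdict
import random

def build_maneuver_model(sessions, order=2):
    """Count order-2 contexts -> next-maneuver frequencies (a 3-gram model)."""
    counts = defaultdict(Counter)
    for seq in sessions:
        for i in range(len(seq) - order):
            counts[tuple(seq[i:i + order])][seq[i + order]] += 1
    return counts

def next_maneuver(counts, context, threat_level,
                  evasive=frozenset({"break_turn", "barrel_roll"})):
    """Sample the next maneuver from the Markov transitions; a task-dependent
    modifier boosts evasive options when the threat level is high."""
    options = counts.get(tuple(context), Counter())
    if not options:
        return "level_flight"
    weights = {m: c * (3.0 if threat_level > 0.7 and m in evasive else 1.0)
               for m, c in options.items()}
    total = sum(weights.values())
    r, acc = random.random() * total, 0.0
    for maneuver, w in weights.items():
        acc += w
        if r <= acc:
            return maneuver
    return maneuver

# Toy usage on two logged sessions.
logs = [["climb", "break_turn", "barrel_roll", "climb", "break_turn", "dive"],
        ["climb", "break_turn", "dive", "level_flight"]]
model = build_maneuver_model(logs)
print(next_maneuver(model, ["climb", "break_turn"], threat_level=0.9))
</syntaxhighlight>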
Training Phases: | |||
# Data Preparation: Simulate 20,000 dogfights via the game API; augment with noise for robustness (see the augmentation sketch after this list).
# HRM Retraining: Multi-task training on navigation, sagas, and chat for 500 epochs, integrating chunking for state compression.
# PCGML Tuning: Train the GANs/LSTMs on recorded sessions for content generation.
# Testing: Validate in the real game (win rates, saga coherence); deploy as REPL verbs (e.g., NAVIGO_MISSION); a hypothetical dispatch sketch also follows this list.
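A minimal sketch of the noise augmentation mentioned in the data-preparation phase, assuming each simulated dogfight has already been exported from the game API as a fixed-width telemetry array. Gaussian jitter and random frame drops stand in for sensor noise and packet loss; the function name and parameter values are illustrative.

<syntaxhighlight lang="python">
import numpy as np

def augment_flight_log(frames, jitter_std=0.02, drop_prob=0.05, rng=None):
    """Produce a noisy copy of one simulated dogfight log for robustness training.

    frames: (T, d) array of telemetry frames (position, velocity, attitude, ...).
    """
    rng = rng or np.random.default_rng()
    noisy = frames + rng.normal(0.0, jitter_std, size=frames.shape)  # sensor jitter
    keep = rng.random(len(frames)) > drop_prob                       # simulated frame loss
    keep[0] = True  # never drop the initial state
    return noisy[keep]

# Toy usage: turn one 500-frame, 12-channel log into five noisy variants.
log = np.random.default_rng(1).normal(size=(500, 12))
variants = [augment_flight_log(log) for _ in range(5)]
</syntaxhighlight>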
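Since ARC's REPL interface is not specified here, the following is only a hypothetical sketch of how a trained mission policy might be exposed as a verb such as NAVIGO_MISSION: a plain dispatch table mapping verb names to handlers. Apart from the verb name quoted above, everything in it is an assumption, not ARC's actual dispatch mechanism.

<syntaxhighlight lang="python">
from typing import Callable, Dict

VERBS: Dict[str, Callable[..., str]] = {}

def verb(name: str):
    """Register a handler under a REPL verb name."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        VERBS[name] = fn
        return fn
    return register

@verb("NAVIGO_MISSION")
def navigo_mission(mission_id: str, difficulty: str = "0.5") -> str:
    # Placeholder: a deployed agent would load the mission and run its policy here.
    return f"launching mission {mission_id} at difficulty {difficulty}"

def dispatch(line: str) -> str:
    """Parse 'VERB arg1 arg2 ...' and invoke the registered handler."""
    name, *args = line.split()
    if name not in VERBS:
        return f"unknown verb: {name}"
    return VERBS[name](*args)

print(dispatch("NAVIGO_MISSION canyon_run"))
</syntaxhighlight>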
Future projections (2026+): Incorporate advances like task-aware AI in simulations and hybrid ML-procedural methods for multiplayer dogfights, enhancing ARC's emergent strategies. | |||
=== Projected Advancements === | |||
Looking ahead, ARC's training will likely incorporate emerging trends: | |||
* Advanced AMs: Integrate energy-based memory paradigms for brain-like storage, enabling "task-aware" recall in dynamic environments.
* Hierarchical Scaling: Recursive H-Nets for deeper abstractions, applied to multi-modal data (e.g., flight visuals + logs).
* PCGML Evolution: AI-generated battlefields with agentic teams, using TAPIR refinements for adaptive planning.
* Sequence Modeling: Higher-order n-grams and Markov models for predictive behaviors, fostering non-deterministic creativity.
These evolutions will transform ARC into a fully cybernetic entity, bridging abstract reasoning with physical simulations. | |||
== References ==