High level functionality of the Agent

The agent has to be able to see WHAT it is interacting with, WHO it is interacting with, and WHERE. As a player, it is important for me to see:

my own nation statistics (cities, factories, ports, defense posts, missile silos, SAMs)
buyables such as warships, atom bombs, hydrogen bombs, and MIRVs
troop count
max troops
troop gain
gold amount
owned land in percentage of the total land mass

for combat

adjustable attack ratio of the current troop count
outgoing attacks
incoming attacks from others
I want to be able to attack both neutral land and a neighbor
I want to be able to send transport ships to a specific tile

spatially

I want to know the Manhattan distance to every other nation, measured from the closest border tile of my nation to the closest border tile of that nation.
I want to be able to see neutral land that lies beyond a crossing or river, and to be able to send troops there.
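
The border-to-border distance described above can be sketched as a brute-force search over border tiles. This is only an illustration: the `(x, y)` tile representation and the function name are assumptions, not part of the game's API, and the result is shifted by one so that touching borders give 0, matching the "border distance = 0 means neighbor" convention used later in these notes.

```python
from typing import Iterable, Tuple

Tile = Tuple[int, int]  # hypothetical (x, y) grid coordinate


def border_distance(mine: Iterable[Tile], theirs: Iterable[Tile]) -> int:
    """Gap between two nations' borders, in tiles (0 = borders touch).

    Takes the smallest Manhattan distance over all pairs of border tiles
    and subtracts 1, because two adjacent tiles are 1 apart.
    Brute force: O(len(mine) * len(theirs)), fine for a sketch only.
    """
    mine, theirs = list(mine), list(theirs)
    if not mine or not theirs:
        raise ValueError("both nations need at least one border tile")
    closest = min(
        abs(x1 - x2) + abs(y1 - y2)
        for x1, y1 in mine
        for x2, y2 in theirs
    )
    return max(closest - 1, 0)


def is_neighbor(mine: Iterable[Tile], theirs: Iterable[Tile]) -> bool:
    """True when the two borders touch."""
    return border_distance(mine, theirs) == 0
```

On large maps the pairwise scan would probably need replacing with a multi-source BFS or a distance transform from my border, but the definition stays the same.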

diplomacy

I want the ability to send alliance requests
cancel alliances
see betrayal status for me and others as it gives a temporary debuff

===

Now I have a complete picture. Let me provide a refined analysis based on the actual game mechanics.

Refined Analysis Based on OpenFront Docs

The Game's Core Decision Loop

Based on the documentation, a skilled player needs to reason about:

GROWTH OPTIMIZATION
- Am I at ~42% of max troops? (optimal growth rate)
- Should I build cities to increase max pop?
- How much can I commit to attacks without crippling growth?
TARGET SELECTION
- Who is my NEIGHBOR? (border distance = 0)
- Who is WEAK? (low troops/tile = low shield)
- Who is THREATENING? (silos + attacking me)
- Who can I ANNEX? (surrounded territory)
COMBAT COMMITMENT
- Attack ratio: 50% early game, 20% late game
- Committed troops don't defend OR grow
DIPLOMACY
- Alliance = 5 min protection + trade benefits
- Betrayal = 30s GLOBAL debuff (everyone attacks you easier)
- Can exploit enemy betrayals
ECONOMY
- Gold from: passive (1000/s), conquest, trade, factories
- Trade = f(port distance), benefits both parties
- Build order: ports → cities → defense → silos
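
The growth and commitment questions above can be turned into simple arithmetic. The sketch below assumes the ~42% figure from these notes is the fill level worth protecting; the clamping rule itself is a hypothetical heuristic, not something taken from the game docs.

```python
OPTIMAL_FILL = 0.42  # fraction of max troops cited above as the growth sweet spot


def growth_efficiency(troops: float, max_troops: float) -> float:
    """Current fill level, troops / max_troops (0.0 if max_troops is not positive)."""
    return troops / max_troops if max_troops > 0 else 0.0


def committable_troops(troops: float, max_troops: float,
                       floor: float = OPTIMAL_FILL) -> float:
    """Troops that can be sent out without dropping below `floor` of max."""
    return max(0.0, troops - floor * max_troops)


def capped_attack_ratio(troops: float, max_troops: float,
                        desired_ratio: float) -> float:
    """Clamp a desired attack ratio so the troops left at home stay at or above the floor."""
    if troops <= 0:
        return 0.0
    budget_ratio = committable_troops(troops, max_troops) / troops
    return min(desired_ratio, budget_ratio)
```

For example, with 600 troops and a cap of 1,000, a desired 50% attack would be clamped to roughly 30%, since only about 180 troops sit above the 42% floor.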

Critical Missing Observations

Feature	Why Critical	Status
border_distance	"Attack nearby weak enemies" is fundamental strategy	❌ Missing
max_troops	Growth optimization at 42% requires knowing max	❌ Missing
troop_gain_rate	Know if attacking will cripple growth	❌ Missing
alliance_state	Per-enemy: allied/neutral/enemy	❌ Missing
betrayal_timer	30s window to exploit or avoid	❌ Missing
entity_type	Bot (weak) vs Nation (builds) vs Human	❌ Missing
troops_committed	Attack ratio currently in use	❌ Missing
shield_value	troops/tiles = defense density	❌ Missing
can_afford_X	Know what's purchasable	❌ Missing
ports/factories	Enemy economy indicators	❌ Missing

Why CNN Is Wrong For This Game

From the docs, the key strategic questions are:

Question	Requires	CNN Provides
"Who should I attack?"	Entity relationships, distance	Pixel patterns
"Is this enemy weak?"	troops/tiles ratio	Nothing
"Can I annex them?"	Surrounded + no water escape	Edge detection?
"Should I ally?"	Threat assessment, shared borders	Nothing
"Am I growing optimally?"	troops/max_troops ratio	Nothing

The spatial grid (64×64×3) shows:

Terrain (water/shore/land)
Ownership (mine/enemy/neutral) - but not WHICH enemy
Fallout

It does NOT show:

Which tiles belong to which specific enemy
Distance between my border and enemy borders
Annexation opportunities
Attack/transport paths

State-of-the-Art Architecture for OpenFront

Based on the game mechanics, here's what the model actually needs:

ENTITY-RELATIONSHIP ARCHITECTURE

PER-ENTITY FEATURES (for each player i, including self):
    Core Stats:
        tiles, troops, gold, max_troops, troop_gain_rate
    Units (strategic indicators):
        cities, ports, factories, silos, sams, defense_posts
        warships, transports_in_flight
    Entity Type:
        is_bot, is_nation, is_human (one-hot)
    RELATIONSHIP TO ME (this is the key!):
        border_distance (manhattan, 0 = neighbor)
        is_neighbor (shares border)
        alliance_state (allied/neutral/enemy)
        betrayal_active (they have traitor debuff)
        my_betrayal_timer (if I betrayed them)
        my_attacks_to_them (troop count committed)
        their_attacks_to_me (troop count incoming)
        incoming_nuke_type (0/atom/hydrogen/mirv)
    Derived Strategic:
        shield = troops / tiles (defense density)
        threat = silos / (distance + 1)
        vulnerability = 1 / shield
            ↓
Linear → 64-dim Entity Embedding
            ↓
MULTI-HEAD SELF-ATTENTION (2-4 heads)
    Each entity attends to all others
    Learns: "who threatens whom", "alliance clusters"
    Output: Contextualized entity embeddings
            ↓
SELF FEATURES (special - my own state):
    growth_efficiency = troops / max_troops (optimal: 0.42)
    attack_ratio_committed (% of troops in attacks)
    can_afford: [warship, atom, hydrogen, mirv, city, port, factory, defense_post, silo, sam]
    win_progress = my_tiles / total_land (win at 0.8)
            ↓
GLOBAL FEATURES:
    game_tick, game_phase, fallout_ratio
    num_alive_players, num_alive_bots, num_alive_nations
            ↓
Combined: [self_embed, global, attention_pooled]
            ↓
ACTION HEADS:
    TYPE HEAD: MLP → 9 logits
        (NoOp, Attack, Boat, Build, AllyRequest, AllyAccept, LaunchNuke, BreakAlliance, CancelAttack)
    TARGET HEAD: Query-Key attention over entity embeddings
        → Score for each entity as target
        (Principled! Directly attends to WHO to interact with)
    INTENSITY HEAD: MLP → 4 logits (attack ratio)
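
The three derived strategic features in the diagram are plain ratios, so they can be sketched directly. The `EntityStats` container and the epsilon guard against division by zero are assumptions added for illustration; only the formulas come from the diagram.

```python
from dataclasses import dataclass


@dataclass
class EntityStats:
    """Hypothetical subset of the per-entity features needed for the derived ones."""
    tiles: int
    troops: float
    silos: int
    border_distance: int  # 0 = neighbor


def shield(e: EntityStats) -> float:
    """Defense density: troops per owned tile (0.0 for an entity with no tiles)."""
    return e.troops / e.tiles if e.tiles > 0 else 0.0


def threat(e: EntityStats) -> float:
    """Silo count discounted by distance; the +1 keeps neighbors finite."""
    return e.silos / (e.border_distance + 1)


def vulnerability(e: EntityStats, eps: float = 1e-6) -> float:
    """Inverse of shield; eps (an added guard) avoids dividing by zero."""
    return 1.0 / max(shield(e), eps)
```

Precomputing these is a design choice: a network could learn the ratios from raw counts, but handing them over explicitly is cheap and spares it from having to discover division.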

What About Spatial Features?

For OpenFront, spatial features are secondary. The key decisions are entity-level. However, some spatial info is useful:

Option 1: Aggregate Spatial Features (Recommended)

terrain_composition:
- pct_plains, pct_highlands, pct_mountains (in my territory)
- pct_contested_border (border tiles under attack)

expansion_opportunities:
- num_annexable_territories (surrounded enemies)
- nearest_neutral_land_distance
- water_crossing_available (can boat somewhere)
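
As a rough illustration of Option 1, the terrain composition could be computed by counting terrain labels over the tiles I own. The string labels and the function name are assumptions; the real pipeline presumably has its own terrain encoding.

```python
from collections import Counter
from typing import Dict, Iterable

TERRAIN_TYPES = ("plains", "highlands", "mountains")  # hypothetical label names


def terrain_composition(my_tile_terrain: Iterable[str]) -> Dict[str, float]:
    """Fraction of my territory covered by each terrain type.

    Labels outside TERRAIN_TYPES are ignored; an empty territory yields zeros.
    """
    counts = Counter(my_tile_terrain)
    total = sum(counts[t] for t in TERRAIN_TYPES)
    return {
        f"pct_{t}": (counts[t] / total if total else 0.0)
        for t in TERRAIN_TYPES
    }
```

This collapses the whole grid into three numbers, which is the point of the aggregate option: the model sees a fixed-size summary instead of a 64×64 image.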

Option 2: Remove Spatial Entirely

Entity features + distances may be sufficient
Simplest, fastest, smallest model

Option 3: Tiny Spatial (8×8)

Very coarse grid just for general map awareness
Process with small CNN or just flatten

Concrete Implementation Plan

Phase 1: Fix Observations (High Impact)

Add to the JS/Rust observation pipeline:

// Per-enemy (add to existing):
border_distance       // Manhattan distance between closest borders
is_neighbor           // Boolean: shares border
alliance_state        // 0=enemy, 0.5=neutral, 1=allied
betrayal_active       // Boolean: has traitor debuff
ports                 // Port count
factories             // Factory count

// Self (add):
max_troops            // Population cap
troop_gain_rate       // Troops per tick at current state
troops_committed      // Troops in active attacks
can_afford_warship    // Boolean
can_afford_atom       // Boolean
can_afford_hydrogen   // Boolean
can_afford_mirv       // Boolean
can_afford_city       // Boolean
can_afford_silo       // Boolean
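
Two of the new observations above involve an encoding step rather than a raw counter, so here is a minimal sketch of them on the Python side. The alliance codes follow the 0 / 0.5 / 1 scheme listed above; the cost values in the usage example are made-up placeholders, not real OpenFront prices.

```python
from typing import Dict

# Encoding taken from the feature list: 0 = enemy, 0.5 = neutral, 1 = allied.
ALLIANCE_CODE: Dict[str, float] = {"enemy": 0.0, "neutral": 0.5, "allied": 1.0}


def encode_alliance(state: str) -> float:
    """Map an alliance state name to its scalar observation value."""
    return ALLIANCE_CODE[state]


def can_afford_flags(gold: int, costs: Dict[str, int]) -> Dict[str, bool]:
    """One boolean per buyable: True when current gold covers its cost.

    `costs` must come from the game engine; nothing is hard-coded here.
    """
    return {f"can_afford_{name}": gold >= cost for name, cost in costs.items()}


# Usage with placeholder costs (hypothetical numbers for illustration only).
example_costs = {"warship": 100, "atom": 500}
flags = can_afford_flags(gold=300, costs=example_costs)
```

Because unit prices may change as the game progresses, passing the current cost table in on every step is safer than baking constants into the observation code.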

Phase 2: Entity Architecture (Medium Effort)

Replace CNN with attention-based entity encoder:

~30 features per entity × 10-20 entities
2-layer transformer encoder
Target head uses attention over entity embeddings
Total params: ~200-400K (vs 1.5M current)
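
The target head in Phase 2 can be illustrated with scaled dot-product scoring over entity embeddings. This is a dependency-free sketch: in a real model the query and keys would be learned projections inside a deep learning framework, and the `valid` mask (for dead or untargetable entities) is an assumption added here.

```python
import math
from typing import List, Sequence


def target_scores(query: Sequence[float],
                  keys: Sequence[Sequence[float]],
                  valid: Sequence[bool]) -> List[float]:
    """Probability of picking each entity as the action target.

    logit_i = (query . key_i) / sqrt(d); invalid entities get -inf,
    then a numerically stable softmax turns logits into probabilities.
    """
    d = len(query)
    logits = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) if ok else -math.inf
        for key, ok in zip(keys, valid)
    ]
    top = max(logits)
    if top == -math.inf:
        raise ValueError("no valid target entity")
    exps = [math.exp(l - top) for l in logits]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]
```

The appeal of this head is that the output size follows the number of entities rather than a fixed map size, so the same weights work for 10 or 20 players.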

Phase 3: Simplify Spatial (Optional)

Either remove or reduce to aggregate features.

Summary

The CNN architecture fundamentally misunderstands the game. OpenFront is a diplomacy and resource management game where the key questions are:

WHO to interact with (requires entity relationships, distances)
WHAT to do (requires self-state, affordability)
HOW MUCH to commit (requires growth optimization knowledge)

The spatial grid tells you terrain and ownership but not which enemy owns what or how far away they are. An entity-based architecture with explicit relationship features would be both more effective and more parameter-efficient.


===

Notes

This is my first blog post. An interesting finding is that performance varies with the number of real actions taken in the environment:

If the agent spawns in and only takes NOOPs, throughput is around 27,000 samples per second (SPS). If it instead takes real actions such as building and combat, SPS drops to about 300. Self-play, on the other hand, runs at around 1,700 SPS, faster than a single-agent environment populated with bots, because the game engine no longer has to run the bot AI. The takeaway seems to be that the fewer FFI calls into V8, the better the performance, so the goal should be to shift as much work as possible to Rust, or at the very least to reduce computation inside the game engine.
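
To make comparisons like these repeatable, a small timing harness helps. The sketch below assumes the environment step can be wrapped in a zero-argument callable; `measure_sps` and `slowdown` are hypothetical helper names, not part of any existing tooling.

```python
import time
from typing import Callable


def measure_sps(step: Callable[[], None], n_steps: int = 10_000) -> float:
    """Call `step` n_steps times and return steps per second (wall clock)."""
    start = time.perf_counter()
    for _ in range(n_steps):
        step()
    elapsed = time.perf_counter() - start
    return n_steps / elapsed if elapsed > 0 else float("inf")


def slowdown(fast_sps: float, slow_sps: float) -> float:
    """How many times slower the second configuration is than the first."""
    return fast_sps / slow_sps


# Ratios implied by the numbers reported above.
noop_vs_real = slowdown(27_000, 300)       # NOOP-only vs real actions
selfplay_vs_real = slowdown(1_700, 300)    # self-play vs real actions with bots
```

With the figures above, real actions are about 90 times slower than NOOPs, and self-play recovers a bit under 6 times of that.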
