ARC, Solomon

The Origin

An AI-native fund architecture where LLMs generate conviction and math manages risk. Built from zero in Claude Code. This is the thesis.

01

The Thesis

Traditional quant excels at structured data — correlations, volatility, mean reversion. But markets increasingly move on narrative. A single policy shift. A change in AI sentiment. A geopolitical escalation.

That signal lives in language, not spreadsheets. LLMs can process it. Quant models structurally cannot.

AI generates the signals. Math manages the risk.

In a traditional quant fund, math generates the signals and humans manage the risk. Solomon flips the stack. The AI is the edge. The quant layer serves as guardrails, not signal generators.

This mirrors how the best discretionary macro funds operate: deep contextual reasoning about the world, constrained by systematic risk management. Solomon automates the reasoning layer while keeping the risk layer deterministic and auditable.

02

The Council

A single LLM prompt produces a single opinion. Solomon splits reasoning across multiple specialized agents with structural biases, creating adversarial tension that mirrors real fund governance.

A PM wants to buy. Risk wants to sell. The CIO decides within policy constraints. Except the seats are filled by AI, the constraints are enforced by deterministic code, and every decision is journaled.

Scout (Macro Oracle): Reads the economic landscape. Interprets macro data and regime state into structured signals.
Scout (Narrative Intelligence): Detects narrative shifts in real time. The AI-native edge, finding what quant cannot.
Advocate (Thesis Analyst): Synthesizes all signals. Structural bias: bullish. Finds reasons to deploy capital.
Adversary (Risk Sentinel): The last line of defense. Structural bias: bearish. Holds the veto seat and can strike down the entire session.

Models are tiered by function, not thrown at the problem uniformly. Scouts use fast, cheap models. Reasoning agents use mid-tier. The veto seat gets the flagship. The most consequential decision gets the best brain.
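The tiering above can be read as a simple routing table. A minimal sketch, assuming illustrative seat names and model identifiers (none of these strings are Solomon's actual configuration):

```python
# Hypothetical model-routing table: every identifier below is an
# assumption for illustration, not Solomon's real configuration.
MODEL_TIERS = {
    "fast": "fast-cheap-model",      # scouts: data transforms
    "mid": "mid-tier-model",         # reasoning agents
    "flagship": "flagship-model",    # the veto seat
}

SEAT_TIERS = {
    "macro_oracle": "fast",
    "narrative_intelligence": "fast",
    "thesis_analyst": "mid",
    "risk_sentinel": "flagship",     # most consequential decision, best brain
}

def model_for_seat(seat: str) -> str:
    """Resolve a council seat to the model that backs it."""
    return MODEL_TIERS[SEAT_TIERS[seat]]
```

Routing by seat rather than by call keeps cost proportional to consequence: the cheap models do volume work, the expensive one guards the exit.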

03

The Math

The quant layer doesn't generate alpha. It prevents ruin. Every conviction the AI council produces passes through deterministic risk infrastructure before a single dollar moves.

Regime Detection (HMM): A Hidden Markov Model classifies the market regime. Position sizing adjusts dynamically based on state.
Downside Risk (VaR): Value at Risk quantifies worst-case portfolio loss. Feeds the Risk Sentinel's veto calculus.
Position Sizing (Kelly): The Kelly criterion determines optimal allocation given edge and variance. Prevents over-concentration.
Risk-Adjusted Return (Sharpe): The Sharpe ratio measures return per unit of risk. Tracked per theme to identify which convictions pay off.

Hard guardrails on conviction changes, position sizing, turnover, and cash reserves. The Fund Manager and Execution Strategist are pure deterministic code — no LLM can override the risk constraints. The math doesn't negotiate.
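As a concrete sketch, the deterministic layer can be read as a pure function from a council conviction to a clamped position. The threshold values and the continuous-Kelly form below are assumptions for illustration; the text states that the guardrails exist, not what their numbers are:

```python
# Illustrative guardrail layer: all thresholds are assumed values,
# not Solomon's actual limits.
MAX_CONVICTION_STEP = 0.10   # assumed cap on per-session conviction change
MAX_POSITION = 0.15          # assumed cap on any single allocation

def kelly_fraction(edge: float, variance: float) -> float:
    """Continuous Kelly criterion: optimal fraction = edge / variance."""
    return max(0.0, edge / variance) if variance > 0 else 0.0

def clamp_allocation(prev_conviction: float, proposed: float,
                     edge: float, variance: float) -> tuple[float, float]:
    """Deterministic path from an LLM conviction to a sized position.

    No model output reaches execution without passing these checks.
    """
    # Hard cap on how fast conviction may move in one session
    step = min(MAX_CONVICTION_STEP, max(-MAX_CONVICTION_STEP, proposed - prev_conviction))
    conviction = prev_conviction + step
    # Size by Kelly, scaled by conviction, then cap the position
    target = min(kelly_fraction(edge, variance) * conviction, MAX_POSITION)
    return conviction, target
```

However excited the Advocate gets, the function's output is bounded; the clamps are arithmetic, so there is no prompt that can move them.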

04

8 Themes

Solomon invests thematically, not ticker-by-ticker. Each theme carries a conviction score that determines capital allocation. Why thematic? Because that's where LLM reasoning adds value. Asking AI to predict a stock price is pointless. Asking it whether the AI infrastructure narrative is strengthening — that's worth asking.

AI & Compute: Aggressive
Energy & Grid: Moderate
Defense & Aerospace: Moderate
Quantum Computing: Aggressive
Biotech & Health: Moderate
Real Estate: Moderate
Crypto & Digital: Aggressive
Defensive: Conservative

71 tickers across 8 themes. The AI council adjusts conviction each session. Higher conviction, more capital deployed.
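A minimal sketch of how conviction could map to capital, assuming scores in [0, 1] and a fixed cash reserve; the scores and the 5% reserve below are invented for the example:

```python
# Conviction scores drive proportional capital weights; all numbers
# here are invented for illustration, not Solomon's actual allocations.
def allocate(convictions: dict[str, float], cash_reserve: float = 0.05) -> dict[str, float]:
    """Split investable capital across themes in proportion to conviction."""
    total = sum(convictions.values())
    if total == 0:
        return {theme: 0.0 for theme in convictions}
    investable = 1.0 - cash_reserve
    return {theme: investable * score / total for theme, score in convictions.items()}

weights = allocate({"AI & Compute": 0.9, "Quantum Computing": 0.6, "Defensive": 0.3})
# Higher conviction, more capital: "AI & Compute" takes half of the deployed 95%.
```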

05

The Battle

Two identical instances. Same architecture. Same themes. Same data. Same guardrails. Same prompts. The only variable is the LLM brain.

Anthropic's Claude model family vs. OpenAI's GPT model family.

Models matched by function — flagship vs flagship on the veto seat, mid-tier vs mid-tier on reasoning, fast vs fast on data transforms. Each instance has its own paper trading account, its own decision journal, and its own rate limiter.

This isn't a benchmark on test questions. It's a benchmark on capital allocation. Separate accounts. Real market data. Side by side. The model family that demonstrates better reasoning over time earns the right to manage real capital.
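One way to read the controlled-variable claim is as a single shared config with exactly one free parameter. A sketch under that assumption; every field name and value below is illustrative:

```python
from copy import deepcopy

# Shared base: identical architecture, themes, guardrails, prompts.
# All field values are placeholders, not Solomon's real settings.
BASE_CONFIG = {
    "themes": 8,
    "tickers": 71,
    "guardrails": {"max_position": 0.15, "min_cash": 0.05},
    "prompt_pack": "shared-v1",
}

def make_instance(model_family: str) -> dict:
    """Clone the base config; only the brain, and the per-instance
    account, journal, and rate limiter, differ between the two."""
    cfg = deepcopy(BASE_CONFIG)
    cfg["model_family"] = model_family
    cfg["paper_account"] = f"acct-{model_family}"
    cfg["journal"] = f"journal-{model_family}.jsonl"
    cfg["rate_limiter"] = f"limiter-{model_family}"
    return cfg

claude = make_instance("claude")
gpt = make_instance("gpt")
```

Deep-copying the base config is what makes the comparison fair: any divergence in outcomes traces back to the one field that differs.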

06

The Loop

The weekly review is where it gets recursive. Inspired by Karpathy's autoresearch — an AI system analyzing its own performance to propose improvements.

Each council's flagship model reviews its own week of decisions. What worked. What didn't. Where the agents were systematically wrong. Not just outcome analysis — reasoning analysis.

Weeks 1–2: Clean baseline. Pure reasoning comparison. No self-modification. Establish how each model family thinks about risk, narrative, and conviction from zero.
Week 3+: Close the loop. Each council calibrates its own parameters based on realized performance. The system doesn't just decide; it learns to decide better.
Recursive self-calibration: Agent accuracy tracking. Systematic bias detection. The decision journal becomes the training data for the system's own improvement.

The decision journal is the dataset. It's not just about who makes more money; it's about understanding how each model thinks about risk.
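A sketch of what treating the journal as a dataset could mean in practice. The entry schema below is an assumption; the text specifies only that decisions and outcomes are recorded:

```python
from collections import defaultdict

def agent_accuracy(journal: list[dict]) -> dict[str, float]:
    """Fraction of each agent's directional calls that realized a gain.

    The "agent" / "direction" / "realized_return" fields are an assumed
    schema for illustration, not Solomon's actual journal format.
    """
    hits: dict[str, int] = defaultdict(int)
    calls: dict[str, int] = defaultdict(int)
    for entry in journal:
        calls[entry["agent"]] += 1
        bullish_call = entry["direction"] == "bullish"
        # A call counts as a hit when its direction matched the realized sign
        if bullish_call == (entry["realized_return"] > 0):
            hits[entry["agent"]] += 1
    return {agent: hits[agent] / calls[agent] for agent in calls}

journal = [
    {"agent": "thesis_analyst", "direction": "bullish", "realized_return": 0.02},
    {"agent": "thesis_analyst", "direction": "bullish", "realized_return": -0.01},
    {"agent": "risk_sentinel", "direction": "bearish", "realized_return": -0.03},
]
# A structurally bullish Advocate shows up directly in its hit rate.
```

Per-agent hit rates like these are the raw material for the week-3+ calibration step: a seat that is systematically wrong gets its influence dialed down.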
Built With
Co-authored with Claude Code
Every line of architecture, every agent prompt, every risk model, and this page — AI-driven development from thesis to deployment.