Running now
Kai
Autonomous AI Executive Agent · Mac Mini · 24/7
23 scheduled jobs
1.83B cache tokens / 2 mo
$30–60 current / month
gemini-2.5-flash · 1M ctx · fallback: qwen3:14b (local)
07:45 · morning briefing delivered
The Problem

I used to spend two hours every morning reading. Credit markets, macro, my portfolio, prediction market odds, research threads, Substack drops from the night before. Good information - but it was manual, repetitive, and happening in my head instead of somewhere persistent.

Two years in restructuring gives you a very specific kind of attention. You read for signal - bankruptcy filings, DIP amendments, credit spread moves, the one paragraph buried in a 10-Q that changes the whole picture. I wanted that same lens running 24/7, not just the two hours I had in the morning before work started.

At some point I realized: this is all just pattern matching and retrieval. Exactly what language models are for. The question wasn't whether to automate it - it was how far to push it, and how to make it actually know me rather than just know the internet.

So I built Kai: a locally hosted agent running on a Mac Mini under my desk, connected to my life. It holds 13,000+ files of personal context. It knows my portfolio, my thesis areas, my reading list, the Discord channels I care about, the Polymarket positions I'm tracking, the India angle I'm always thinking about. And every morning, before I pick up my phone, it has already done two hours of work and sent the summary to Telegram.

What it knows about me
Background
Wharton econ → restructuring analyst (TruBoard, TD Securities, GLC Advisors) → building AI systems. Two years reading credit agreements, DIP financing structures, and distressed healthcare. That lens doesn't turn off.
Thesis Areas
Agent payment rails, prediction markets, AI-native fintech, Indian MSME, usage-based billing, special situations. Kai monitors all of these without being asked to.
What it watches
Polymarket odds on positions I hold. Hyperliquid flows. Credit events in my coverage universe. Citrini Research drops. Selected Discord channels. Macro releases that actually matter.
Memory
13,000+ files, vector and full-text indexed. ~200 active sessions. It remembers what I told it three months ago. It knows which arguments I found compelling and which I dismissed.
A Day with Kai
07:45
Morning Briefing fires
Without prompting, a message arrives on my phone. Macro snapshot, credit events in my coverage universe, portfolio flags, and three things I should act on today. I didn't open an app. I didn't type anything. It just happened - the cron job fired, the agent pulled live context, and Telegram delivered the result.
research-agent · portfolio-agent · 2,847 tokens · $0.021
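A rough sketch of how the delivery step might look, not Kai's actual code. It builds a request for the Telegram Bot API `sendMessage` method using only the standard library. The function name `build_telegram_request` and the token and chat ID values are assumptions for illustration.

```python
import json
import urllib.request

TELEGRAM_API = "https://api.telegram.org"

def build_telegram_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a Bot API sendMessage request."""
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{TELEGRAM_API}/bot{token}/sendMessage",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# A cron-triggered script would then send it, for example:
#   req = build_telegram_request(TOKEN, CHAT_ID, briefing_text)
#   urllib.request.urlopen(req, timeout=30)
```

Keeping request construction separate from the network call makes the step easy to test without a live bot token.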
overnight
Research rabbit hole runs while I sleep
The overnight research session goes deep - Substack drops, credit filings, prediction market movements, macro data from Asian sessions. It doesn't just retrieve; it synthesizes. By morning, a curated reading queue is already in my Obsidian vault, annotated with why each item matters given what I've been thinking about.
research-agent · Obsidian vault
every 4h
Cross-reference sweep
A watchdog runs every four hours, silently. It cross-references signals - new credit filings, macro data releases, mentions in monitored Discord channels - and surfaces only what changed. No noise, no duplicate alerts. If nothing moved, nothing gets sent.
signal-agent · discord-monitor
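One way the "only what changed" behavior could be implemented is content fingerprinting: hash each signal and drop anything already seen in an earlier sweep. This is a minimal sketch under that assumption. The names `fingerprint` and `sweep` and the dict shape of a signal are hypothetical, and the real watchdog may work differently.

```python
import hashlib
import json

def fingerprint(signal: dict) -> str:
    """Stable hash of a signal's content, independent of key order."""
    canonical = json.dumps(signal, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def sweep(signals: list[dict], seen: set[str]) -> list[dict]:
    """Return only signals not seen in earlier sweeps; updates `seen` in place."""
    fresh = []
    for signal in signals:
        key = fingerprint(signal)
        if key not in seen:
            seen.add(key)
            fresh.append(signal)
    return fresh
```

If `sweep` returns an empty list, the job exits without sending anything, which is the "if nothing moved, nothing gets sent" rule.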
12:00
Portfolio P&L pulse
Midday, a quick Moomoo pull. Daily P&L, stop-loss proximity, any positions approaching a flag threshold. No financial advice - just a clean read of the numbers so I don't need to open the brokerage app to know if something needs attention.
portfolio-agent · weekdays only
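The stop-loss proximity check reduces to simple arithmetic. The sketch below assumes long positions and a hypothetical 5% flag threshold; the `Position` fields, symbols, and threshold are illustrative and do not come from the Moomoo API.

```python
from dataclasses import dataclass

@dataclass
class Position:
    symbol: str
    last_price: float
    stop_loss: float

def stop_distance_pct(p: Position) -> float:
    """Percent distance from the last price down to the stop (long positions)."""
    return (p.last_price - p.stop_loss) / p.last_price * 100.0

def near_stop(positions: list[Position], threshold_pct: float = 5.0) -> list[str]:
    """Symbols whose stop is within `threshold_pct` percent of the last price."""
    return [p.symbol for p in positions if stop_distance_pct(p) <= threshold_pct]
```

The output is a list of symbols to mention in the midday message, not a recommendation to act.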
18:30
Evening Briefing + research rabbit hole
End of day wrap - what happened that mattered, anything to carry forward tomorrow. Sometimes the research rabbit hole agent goes longer here, pulling threads from the morning briefing deeper into primary sources. The good stuff lands in my notes vault by the time I check my phone again.
research-agent · Obsidian sync
Architecture
Cron Scheduler
23 jobs · time-based triggers · no cloud
system crontab
Mac Mini · local
🔌
OpenClaw Gateway
Persistent daemon · loopback WebSocket · native MCP
ws://127.0.0.1:18789
13K+ memory files · 200 sessions
🔗
MCP Tool Layer
Native Model Context Protocol — tools available to every agent call
Telegram · Brave Search · Filesystem · Obsidian
Gemini 2.5 Flash
Primary workhorse. 1M context. No truncation risk.
~80% volume
Claude Sonnet
Precision tasks — trade signals, Polymarket, hard synthesis.
consequential calls
Qwen 3:14b
Local via Ollama. Free. Watchdogs + lightweight polling.
free fallback
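The three-model split above amounts to a routing rule keyed on the kind of task. This is a minimal sketch of such a rule, assuming hypothetical task labels. The Gemini and Qwen identifiers are the ones quoted elsewhere on this page, and the Claude Sonnet string is a placeholder because the exact model ID is not given.

```python
from enum import Enum

class Model(str, Enum):
    GEMINI_FLASH = "google/gemini-2.5-flash"
    CLAUDE_SONNET = "claude-sonnet"   # placeholder, exact model ID not stated
    QWEN_LOCAL = "ollama/qwen3:14b"

# Hypothetical task labels, chosen to mirror the descriptions above.
CONSEQUENTIAL = {"trade_signal", "polymarket", "hard_synthesis"}
LIGHTWEIGHT = {"watchdog", "polling"}

def route(task_kind: str) -> Model:
    """Pick a model: precision work to Claude, cheap polling to local Qwen, rest to Gemini."""
    if task_kind in CONSEQUENTIAL:
        return Model.CLAUDE_SONNET
    if task_kind in LIGHTWEIGHT:
        return Model.QWEN_LOCAL
    return Model.GEMINI_FLASH
```

Defaulting to Gemini keeps the bulk of the volume on the cheapest hosted model, and only explicitly tagged tasks go elsewhere.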
📲
Delivery Channels
Telegram · WhatsApp · Obsidian vault
push to phone
no app required
The 23 Jobs
Daily Briefings 2 jobs
Morning Briefing · 07:45
Evening Briefing · 18:30
Research 5 jobs
Morning / Midday / Evening
Overnight · Pre-market
Research Rabbit Hole
Portfolio 2 jobs
Moomoo Daily P&L · 12:00
Portfolio Summary · 16:30
Market Signals 4 jobs
Signal Deep Synthesis
Cross-reference Sweep · 4h
Weekly Sector Momentum
Weekly Earnings Scanner
Monitoring 4 jobs
Discord Monitor · 5 min
Notification Account
Checker Agent Self-Audit · 4h
Brent Price Monitor ⚠
System & Business 4 jobs
Process Health Watchdog · 5 min
Daily Spend Report
Offload Business Enforcer
Weekly Pattern Report
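For the jobs whose timing is stated above, the schedule could be written as standard five-field cron expressions. The sketch below lists those and adds a small matcher that checks whether a job is due. The job slugs are hypothetical and the matcher is a simplification: it supports only `*`, `*/n`, `a-b`, and plain integers, and it ANDs all five fields, which agrees with real cron only when day-of-month or day-of-week is `*`.

```python
from datetime import datetime

SCHEDULE = {
    "morning-briefing":      "45 7 * * *",
    "evening-briefing":      "30 18 * * *",
    "moomoo-daily-pnl":      "0 12 * * 1-5",   # weekdays only
    "portfolio-summary":     "30 16 * * *",    # days of week not stated; daily assumed
    "cross-reference-sweep": "0 */4 * * *",
    "discord-monitor":       "*/5 * * * *",
    "process-health":        "*/5 * * * *",
}

def _field_matches(field: str, value: int) -> bool:
    """Match one cron field against a value."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    if "-" in field:
        low, high = field.split("-")
        return int(low) <= value <= int(high)
    return value == int(field)

def due(expr: str, now: datetime) -> bool:
    """True if the cron expression fires at the minute given by `now`."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = now.isoweekday() % 7  # cron convention: 0 = Sunday ... 6 = Saturday
    return (
        _field_matches(minute, now.minute)
        and _field_matches(hour, now.hour)
        and _field_matches(dom, now.day)
        and _field_matches(month, now.month)
        and _field_matches(dow, cron_dow)
    )
```

In practice the system crontab does this matching itself; the matcher is only useful for checking a schedule table like this in tests.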
Token Economics — Three Phases

The system didn't start cheap. It started as a $200/month Claude Max subscription running as an unofficial API backend - which worked brilliantly until Anthropic changed the rules in April 2026. What followed was a forced migration that turned out to be the most educational part of the whole build.

ENDED APR 2026
Phase 1 — Claude Max OAuth
Feb – Apr 2026
claude-opus-4 · $200/mo subscription → API proxy
$1,763
API equivalent / month (peak)
107K
median session tokens
1.83B
cache read tokens (2 mo)
The 1.83B cache read number is what made this work. At the retail rate of $1.50 per million tokens, that's $2,745 in cache reads over two months, covered by $400 in subscription fees ($200/month for two months). The subscription-as-API hack made the economics absurd. Anthropic eventually closed it.
Monthly cost: $1,763 equiv.
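The cache-read arithmetic can be checked in a few lines. All inputs are the figures quoted in this phase, and the calculation covers cache reads only, not total usage.

```python
CACHE_READ_TOKENS = 1.83e9        # cache read tokens over two months
RETAIL_PER_MILLION = 1.50         # USD per million cache-read tokens
SUBSCRIPTION_PER_MONTH = 200.0    # USD, Claude Max subscription
MONTHS = 2

retail_cost = CACHE_READ_TOKENS / 1e6 * RETAIL_PER_MILLION
subscription_cost = SUBSCRIPTION_PER_MONTH * MONTHS
ratio = retail_cost / subscription_cost

print(f"retail equivalent: ${retail_cost:,.0f}")
print(f"subscription paid: ${subscription_cost:,.0f}")
print(f"ratio: {ratio:.1f}x")
```

On cache reads alone, the retail equivalent comes to a little under seven times what the subscription cost.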
ENDED JUN 2026
Phase 2 — DeepSeek V3
Apr – Jun 2026
deepseek/deepseek-chat · fallback: ollama/qwen3:14b (local)
~$80
monthly cost
128K
context limit
29%
calls that would exceed ctx
The context window was the killer. 29% of existing sessions exceeded DeepSeek's 128K limit, which meant either surgery on the prompt stack or silent truncation. Credits ran out in June 2026, forcing the next move.
Monthly cost: ~$80
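A pre-migration check like the one behind the 29% figure can be sketched as a single pass over per-session token counts. The function name and the sample numbers are made up for illustration; only the 128K limit comes from the text.

```python
def share_over_limit(session_tokens: list[int], limit: int = 128_000) -> float:
    """Fraction of sessions whose token count exceeds the context limit."""
    if not session_tokens:
        return 0.0
    over = sum(1 for n in session_tokens if n > limit)
    return over / len(session_tokens)

# Illustrative values only, not real session data.
sample = [100_000, 107_000, 150_000, 90_000]
print(f"{share_over_limit(sample):.0%} of sample sessions exceed the limit")
```

Running this against real session logs before switching models shows how much of the prompt stack would need trimming.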
CURRENT
Phase 3 — Gemini 2.5 Flash
Jun 2026 — now
google/gemini-2.5-flash · 1M context · fallback: qwen3:14b
$30–60
monthly cost
1M
context tokens
60×
cheaper cache reads vs Anthropic
A 1M context window means zero truncation risk across all 23 jobs, and cache reads are 60× cheaper than at Anthropic's retail rate. The tradeoff is weaker multi-step agentic reasoning. Consequential calls (trade signals, Polymarket) still route to Claude Sonnet directly via the Anthropic API.
Monthly cost: $30–60
What's Next
⚖️
Critic Layer
Gemini generates. Claude cross-examines. Model disagreement becomes a signal to escalate instead of a silent failure.
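The critic pattern could look roughly like the sketch below, which is one possible shape and not a description of a finished design. The generator and critic are passed in as plain callables so that any model client can be plugged in. The names `critic_check`, `generate`, and `critique` are hypothetical.

```python
from typing import Callable

def critic_check(
    prompt: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], bool],
) -> dict:
    """Run the generator, ask the critic whether it agrees, flag disagreement."""
    draft = generate(prompt)
    agrees = critique(prompt, draft)
    return {"draft": draft, "escalate": not agrees}

# Usage with stand-in functions in place of real model calls:
result = critic_check(
    "Is this DIP amendment material?",
    generate=lambda p: "yes",
    critique=lambda p, d: d == "no",
)
print(result)
```

The point of the `escalate` flag is that disagreement is surfaced to a human or a stronger model instead of being silently resolved.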
💰
Agent Payments
The next frontier is agents that transact. Kai monitors markets - the logical next step is Kai executing decisions with proper rails.
🌐
Open Architecture
The gateway pattern - local daemon, loopback WebSocket, model-agnostic routing - is worth documenting properly. Stay tuned.