
From Council to Production: Shipping 38 AI Projects in 60 Days

AI Agents, Council, Multi-Agent, Production, Engineering

Two months ago I committed to a small experiment: instead of brainstorming new project ideas the usual way (open a doc, stare at it, eventually drift into Twitter), I'd let a multi-agent system do the picking. Council — eight proposal agents, a fact-check pipeline, a structured debate, and a calibrating judge — would meet, vote, and hand me the top three ideas every day.

The deal was simple: I'd build the #1 idea most days. No second-guessing the agents. No re-running until I liked the answer.

Sixty days later, the portfolio had grown from 6 production projects to 44, a net gain of the 38 in the title. This post covers what I learned about letting an AI system drive the roadmap, and what it actually took to ship those projects.

How Council Decides

Eight specialized agents (Researcher, Creative, Analyst, Financial, Customer, Competitive, Physical, Trend Forecaster) each propose a project from a different perspective. The Customer agent reads Reddit and surfaces real pain. The Competitive agent maps the landscape and looks for unprotected gaps. The Trend Forecaster reads VC filings and patent activity to bias toward what will matter in 6–12 months.

The proposals then go through a 10-phase pipeline:

  1. Cross-pollination — agents read each other's drafts and revise.
  2. Fact check — every numerical claim hits live web search.
  3. Critique — Contrarian agent attacks every proposal.
  4. Defense — Advocate builds the strongest counter-case.
  5. Structured debate — pros and cons argued in writing.
  6. Calibration — scores normalized across agents.
  7. Judgment — impartial verdict with confidence scores.
  8. Synthesis — top 3 ranked with reasoning.
  9. Veto check — minority dissents flagged.
  10. Business plan — auto-generated for the #1 pick.
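The post doesn't say how the calibration phase normalizes scores, so here is one plausible sketch rather than Council's actual method: convert each agent's raw scores to z-scores, so a habitually harsh grader and a habitually lenient one carry equal weight, then average per proposal. The function name `calibrate`, the agent names used as graders, and the proposal labels A, B, and C are all assumptions for illustration.

```python
from statistics import mean, pstdev


def calibrate(raw_scores: dict[str, dict[str, float]]) -> dict[str, float]:
    """Map agent -> {proposal -> raw score} to proposal -> mean z-score."""
    per_proposal: dict[str, list[float]] = {}
    for scores in raw_scores.values():
        mu = mean(scores.values())
        sigma = pstdev(scores.values())
        for proposal, score in scores.items():
            # An agent that scores everything identically carries no signal.
            z = 0.0 if sigma == 0 else (score - mu) / sigma
            per_proposal.setdefault(proposal, []).append(z)
    return {proposal: mean(zs) for proposal, zs in per_proposal.items()}


# Hypothetical raw scores: one lenient grader, one harsh grader.
raw = {
    "Analyst": {"A": 9.0, "B": 8.0, "C": 7.0},
    "Contrarian": {"A": 4.0, "B": 1.0, "C": 1.0},
}
calibrated = calibrate(raw)
ranking = sorted(calibrated, key=calibrated.get, reverse=True)
```

Without normalization, the lenient grader's 7-to-9 range would swamp the harsh grader's 1-to-4 range in a plain average. After z-scoring, both agents contribute on the same scale.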

A full session runs in under 4 minutes and costs about $0.40 in Claude API calls. In --deep mode (Opus instead of Sonnet), a session takes about 12 minutes and costs about $2.
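To put those per-session figures in context, here is a rough back-of-the-envelope calculation, assuming the 31 sessions mentioned in the next section. The post doesn't say how many sessions used deep mode, so the two results below are only lower and upper bounds. The function name is hypothetical.

```python
# Approximate per-session costs from the text, in cents to avoid float drift.
STANDARD_CENTS = 40   # ~$0.40 per standard (Sonnet) session
DEEP_CENTS = 200      # ~$2 per --deep (Opus) session


def total_cost_cents(sessions: int, deep_sessions: int = 0) -> int:
    """Total API spend for a mix of standard and deep sessions."""
    if not 0 <= deep_sessions <= sessions:
        raise ValueError("deep_sessions must be between 0 and sessions")
    standard_sessions = sessions - deep_sessions
    return standard_sessions * STANDARD_CENTS + deep_sessions * DEEP_CENTS


all_standard = total_cost_cents(31)       # lower bound, in cents
all_deep = total_cost_cents(31, 31)       # upper bound, in cents
print(f"${all_standard / 100:.2f} to ${all_deep / 100:.2f}")
```

Either way, the decision layer costs tens of dollars over two months, which is small next to the time spent building.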

What Got Picked, and Why

Looking back across 31 sessions, I see three patterns:

1. Compliance and developer trust kept winning early

Sessions in March were dominated by compliance and AI-trust projects: DevTrust Shield (code verification for regulated industries), AI Compliance Navigator (multi-state AI regs), CodeTrust (multi-agent verification of AI-generated code), RegBot (compliance OS for tech startups), ComplianceBot (bilingual compliance for Spanish-speaking SMBs in the US), ComplianceAPI Hub (Mexico's NOM/CFDI/IMSS/COFEPRIS as one API).

This wasn't a theme I picked — the agents kept surfacing it because the Customer agent was finding the same complaints in fintech and healthtech subreddits week after week.

2. Cost intelligence became a category

As Claude API spend grew, the council started recommending cost-control infrastructure: CostGuard (real-time circuit breakers per agent), AgentSafe (runtime cost + safety monitoring), APIRouter (cost-quality routing across providers), CostIntel (DevOps cost intelligence with zombie detection), and GreenCompute (energy-optimized routing). These aren't separate ideas; together they form a stack.
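For readers unfamiliar with the idea of a per-agent cost circuit breaker, here is a minimal sketch of the pattern. It is not CostGuard's code: the class name `CostBreaker`, the method names, and the $0.10 cap are all hypothetical.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when an agent tries to run after exhausting its budget."""


@dataclass
class CostBreaker:
    limit_usd: float  # per-agent spending cap (hypothetical value below)
    spent: dict[str, float] = field(default_factory=dict)

    def record(self, agent: str, cost_usd: float) -> None:
        """Add the cost of a finished API call to the agent's running total."""
        self.spent[agent] = self.spent.get(agent, 0.0) + cost_usd

    def is_tripped(self, agent: str) -> bool:
        """The breaker is open once cumulative spend reaches the cap."""
        return self.spent.get(agent, 0.0) >= self.limit_usd

    def guard(self, agent: str) -> None:
        """Call before each API request; refuses to proceed if the breaker is open."""
        if self.is_tripped(agent):
            raise BudgetExceeded(f"{agent} has spent ${self.spent[agent]:.2f}")


breaker = CostBreaker(limit_usd=0.10)
breaker.guard("Researcher")           # passes: nothing spent yet
breaker.record("Researcher", 0.06)
breaker.guard("Researcher")           # passes: still under the cap
breaker.record("Researcher", 0.06)    # cumulative spend now exceeds the cap
```

Any later `breaker.guard("Researcher")` call raises `BudgetExceeded`, while other agents keep running. Tracking spend per agent means one runaway agent can't burn the whole session's budget.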

3. Mexico-focused vertical SaaS, once the council learned my context

Around April 4th the council started recommending LATAM-specific projects, almost certainly because the Analyst agent (which scans the local dev environment) noticed a cluster of Spanish-language work and the Customer agent surfaced Mexican SMB pain. From there came AgroFlow (Michoacán supply chain), FloraFlow (Estado de México floriculture), GuadalajIT (Guadalajara nearshoring intelligence), WaFlow (WhatsApp AI for Morelia service businesses), and a vertical health stack (EntrenadorIA, FisioFlow, TerapiaFlow) for clinics in Morelia.

What It Actually Took to Ship

The romantic version is "agents propose, human disposes." The reality was more of a grind.

Most projects share a skeleton. Python + Typer + Pydantic + Rich for the CLI, Claude API for the AI, FastAPI when a web layer is needed. Once that template was solid, new projects took 2–6 hours to scaffold to a working demo. The Council picks the destination; the template gets the car there.

Bilingual was non-negotiable. Every Mexico-focused project ships ES/EN from day one — error messages, prompts, generated content. Building this in from the start (rather than bolting it on later) saved hours per project.
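One way to build ES/EN in from the start is a small message catalog with a fallback language, plus a check that catches missing translations before release. This is a sketch under assumptions, not the projects' actual code: the message keys, the helper names `t` and `missing_translations`, and the choice of Spanish as the fallback are all hypothetical.

```python
SUPPORTED = ("es", "en")
DEFAULT_LANG = "es"  # assumed fallback for Mexico-focused projects

# Hypothetical catalog: every user-facing string lives here in both languages.
MESSAGES = {
    "invoice_missing": {
        "es": "No se encontró la factura {invoice_id}.",
        "en": "Invoice {invoice_id} was not found.",
    },
    "welcome": {
        "es": "Bienvenido, {name}.",
        "en": "Welcome, {name}.",
    },
}


def t(key: str, lang: str, **params: str) -> str:
    """Look up a message, falling back to DEFAULT_LANG for unknown languages."""
    variants = MESSAGES[key]
    template = variants.get(lang, variants[DEFAULT_LANG])
    return template.format(**params)


def missing_translations() -> list[tuple[str, str]]:
    """List (key, lang) pairs that have no translation yet."""
    return [
        (key, lang)
        for key, variants in MESSAGES.items()
        for lang in SUPPORTED
        if lang not in variants
    ]


print(t("invoice_missing", "en", invoice_id="A-17"))
print(t("welcome", "es", name="Ana"))
```

Running `missing_translations()` in a test suite turns "we forgot the Spanish error message" into a failing build instead of a support ticket.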

Compliance is a feature, not a chore. The physical-therapy (PT) stack for Mexican clinics (FisioFlow, TerapiaFlow) generates SOAP notes that pass NOM-004-SSA3 audits, and CFDI 4.0 invoicing is built into the billing layer. That is the difference between "interesting demo" and "clinics will actually pay for this."

Some council picks didn't survive contact with reality. The "Premium EV Charging Hub in Fremont" (rank 3, three sessions in a row) is a great idea — but it's a real-estate project, not a software one. I logged it and moved on. The Physical agent keeps proposing physical-business plays; I keep saving them for later.

What I'd Tell Someone Trying This

  • Trust the agents on direction; trust yourself on scope. The council is excellent at picking what to build. It's mediocre at sizing how much to build. Cap each project at what fits in a focused weekend.
  • Let the agents see your context. The proposals got dramatically better once the Analyst agent could read my dev directory and the Creative agent could query my Second Brain. Generic LLM brainstorming gives generic output.
  • Keep the deep-mode budget tight. Most sessions don't need Opus. Reserve --deep for ambiguous calls where the calibrator flags low confidence.
  • Build the template first. A great agentic decision system is worthless if every project takes you a week to scaffold.
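The deep-mode advice above can be expressed as a simple escalation rule. The post doesn't define "low confidence," so this sketch assumes two triggers: the top pick's confidence is below a threshold, or the top two picks are too close to call. The function name and both default thresholds are hypothetical.

```python
def pick_mode(
    confidences: list[float],
    min_confidence: float = 0.6,
    min_margin: float = 0.05,
) -> str:
    """Return "deep" when a standard run looks ambiguous, else "standard".

    confidences holds the judge's confidence for each ranked proposal (0 to 1).
    """
    if not confidences:
        raise ValueError("need at least one confidence score")
    ranked = sorted(confidences, reverse=True)
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    if top < min_confidence:
        return "deep"      # even the best pick is shaky
    if top - runner_up < min_margin:
        return "deep"      # top two are effectively tied
    return "standard"


print(pick_mode([0.90, 0.50, 0.40]))   # clear winner
print(pick_mode([0.55, 0.50]))         # weak winner
print(pick_mode([0.80, 0.78]))         # near tie
```

Running the cheap pass first and escalating only on ambiguity keeps most sessions on the cheaper model.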

What's Next

The next wave is consolidation. Many of the 38 projects deserve to merge — the cost-control stack (CostGuard / AgentSafe / APIRouter / CostIntel) is already feeling like one platform. The Morelia health stack (EntrenadorIA / FisioFlow / TerapiaFlow) wants to become a single clinic OS.

The council itself is also getting upgrades: a new Memory agent that tracks which past picks shipped vs. stalled, and a Portfolio agent that flags when a new proposal overlaps too much with existing work.
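As an illustration of how a Portfolio-style overlap check could work, here is a minimal sketch using Jaccard similarity over keyword tags. The planned agent may work quite differently. The tags, the 0.5 threshold, and the function names are assumptions for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of tags the two sets have in common (0 = disjoint, 1 = identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def overlapping(
    new_tags: set[str],
    portfolio: dict[str, set[str]],
    threshold: float = 0.5,
) -> list[str]:
    """Names of existing projects whose tags overlap the proposal too much."""
    return [
        name
        for name, tags in portfolio.items()
        if jaccard(new_tags, tags) >= threshold
    ]


# Hypothetical tags for two projects named in this post.
portfolio = {
    "CostGuard": {"cost", "circuit-breaker", "agents"},
    "AgroFlow": {"supply-chain", "agriculture", "mexico"},
}
proposal_tags = {"cost", "agents", "monitoring"}
flagged = overlapping(proposal_tags, portfolio)
print(flagged)
```

Here the proposal shares two of four distinct tags with CostGuard, so it gets flagged as a candidate for merging rather than a new build.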

If you want to see the full portfolio, it's at /projects. If you want to read about specific projects, the case studies over at The Brainy Guys go deeper into a few of them.
