Astro

RESEARCH

WE PUBLISH WHAT WE SHIP.

Research stays public when it settles an architecture decision we then deploy — a routing pattern, an evaluation method, a reliability tradeoff. Each piece carries a real result.

THE WORK

RESEARCH WITH A RESULT ATTACHED.

Two papers, two production decisions. Each one led to a routing or evaluation pattern we now run for clients — with the number it moved.

Multi-Agent AI

16 min · 2025

Multi-agent calendar intelligence: hybrid LLM + CP-SAT for executive scheduling

29.4%

cost reduction

Executive calendar management is a constraint satisfaction problem with high dimensionality, conflicting objectives, and dynamic updates. Standalone LLMs fail at complex scheduling (0.6% success on TravelPlanner). This research introduces the Cognitive Temporal Orchestration (CTO) framework, a hybrid architecture that combines heterogeneous LLM orchestration (GPT-5, Gemini 3 Pro, Claude Sonnet 4.5) with CP-SAT constraint programming. Across 81 test scenarios, we demonstrate 100% orchestration success, 100% high-value event identification, and a 29.4% cost reduction. Our analysis shows that 99% of latency originates from LLM inference, which tells us where optimization effort should go. We validate three cognitive modules and establish a methodology for evaluating the evolution from reactive assistants to proactive wealth management systems.

Why it matters: route reasoning across models and let a solver own the constraints — cheaper than one big model, and the schedule actually holds.
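The split described above, with a model layer scoring what matters and a solver owning the hard constraints, can be sketched in a few lines. This is a minimal illustration, not the CTO framework itself: the meetings, priority scores, slots, and preferences are all hypothetical, the priority scores merely stand in for what an LLM layer might assign, and a plain exhaustive search stands in for CP-SAT so the example runs without extra dependencies.

```python
from itertools import product

# Hypothetical meetings: name -> (duration in slots, priority score).
# The priority scores stand in for output from an LLM layer (assumption).
MEETINGS = {
    "board_prep": (2, 9),
    "investor_call": (1, 7),
    "team_sync": (1, 3),
}
SLOTS = range(5)   # five one-hour slots in a hypothetical day
BLOCKED = {1}      # hard constraint: slot 1 is unavailable
PREFERRED_START = {"board_prep": 2, "investor_call": 0}  # soft preferences


def feasible(assignment):
    """Hard constraints: fit in the day, avoid blocked slots, no overlaps."""
    used = set()
    for name, start in assignment.items():
        duration = MEETINGS[name][0]
        if start + duration > len(SLOTS):
            return False
        slots = set(range(start, start + duration))
        if slots & BLOCKED or slots & used:
            return False
        used |= slots
    return True


def score(assignment):
    """Soft objective: sum the priority of every meeting at its preferred start."""
    return sum(
        MEETINGS[name][1]
        for name, start in assignment.items()
        if PREFERRED_START.get(name) == start
    )


def solve():
    """Exhaustive search standing in for CP-SAT: best feasible schedule, or None."""
    names = list(MEETINGS)
    best = None
    for starts in product(SLOTS, repeat=len(names)):
        candidate = dict(zip(names, starts))
        if feasible(candidate) and (best is None or score(candidate) > score(best)):
            best = candidate
    return best


schedule = solve()
```

The design point is that the schedule is only ever chosen from assignments that pass `feasible`, so a bad priority score can make the result less useful but cannot make it invalid. A real constraint solver would replace the brute-force loop once the search space grows.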

Read the paper

Clinical AI

10 min · 2025

Comparative LLM analysis for clinical decision support: routing across Gemini-3-Pro and GPT-5.1

94.7%

system reliability

This evaluation of the Vitruviana Hybrid AI Architecture for clinical decision support analyzes model selection patterns, service integration, and clinical outcomes across 100+ automated tests. With intelligent task routing, the hybrid architecture achieved 94.7% system reliability and 100% optimal routing decisions. It directed complex clinical reasoning to Gemini 3 Pro (67% of tasks) and structured tasks to GPT-5.1 (33% of tasks).

Why it matters: pick the model per task instead of standardizing on one — you get higher reliability without paying for the frontier model on every call.
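Per-task model selection of the kind described above can be as simple as a lookup keyed on task type. The sketch below is an assumption-laden illustration, not the Vitruviana router: the task kinds, the `Task` type, the fallback choice, and the model label strings are hypothetical placeholders (they are not real API model identifiers), and no model is actually called.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical routing table: task kind -> model label (labels are placeholders).
ROUTES = {
    "clinical_reasoning": "gemini-3-pro",
    "structured": "gpt-5.1",
}
DEFAULT_MODEL = "gemini-3-pro"  # assumed fallback for unknown task kinds


@dataclass(frozen=True)
class Task:
    kind: str
    prompt: str


def route(task: Task) -> str:
    """Pick a model label for one task based on its kind."""
    return ROUTES.get(task.kind, DEFAULT_MODEL)


def routing_share(tasks: list[Task]) -> dict[str, float]:
    """Fraction of tasks sent to each model, for checking the routing mix."""
    counts = Counter(route(task) for task in tasks)
    total = sum(counts.values())
    return {model: count / total for model, count in counts.items()}


tasks = [
    Task("clinical_reasoning", "Assess differential diagnosis"),
    Task("clinical_reasoning", "Weigh treatment tradeoffs"),
    Task("structured", "Extract medication list as JSON"),
]
shares = routing_share(tasks)
```

Tracking the share per model, as `routing_share` does, is one way to compare the routing mix a system actually produces against the split it was designed for.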

Read the paper

WANT THIS RUNNING IN YOUR OPS? START WITH THE AUDIT.

THE PATTERN ON THIS PAGE, POINTED AT YOUR WORKFLOW.

These papers became routing and evaluation patterns we now run in production. Bring the workflow you'd want them applied to — we map which pattern fits, what it moves, and the costed plan to ship it. The audit credits toward the build.

Route
Evaluate
Ship
Research | Astro Intelligence Labs