
We Built OpenAstra to Solve Our Own Agent Infrastructure Problems

February 25, 2026 · 7 min read · Contra Collective


The Contra Collective team kept hitting the same wall on agentic projects: the infrastructure layer was consuming more engineering time than the actual agent logic. Memory management, multi-channel routing, tool execution, swarm coordination — each project meant rebuilding the same scaffolding from scratch, or contorting a framework that wasn't designed for production use.

So we built OpenAstra.

OpenAstra is a self-hosted, open-source agent runtime. It's MIT licensed, runs on Node 20+, and ships with the production infrastructure we engineered to solve these problems internally. We're releasing it because the problems it solves aren't unique to us — every engineering team building serious agentic systems runs into the same gaps.

The Problem We Were Actually Solving

The honest answer is that most agent tooling is built for demos. It works beautifully in a Jupyter notebook and falls apart under real operational conditions.

The specific failure modes we kept hitting:

Memory without structure. Every agent framework offers a vector store. None of them have a coherent answer for how to handle the difference between what an agent is reasoning about right now, what it did last week, and what it knows about your business domain. We were hand-rolling memory tiering on every project.

Channel sprawl. Our internal agents needed to operate across Slack, email, GitHub, and other surfaces. Building a separate integration per channel, per project, was unsustainable. We needed a runtime that treated multi-channel as a first-class concern, not an afterthought.

No permission model. In our own workflows, agents touching customer data should never have access to deployment infrastructure. Agents doing research shouldn't have payment system access. Most swarm implementations have no answer for this. We needed real permission sandboxing baked into the runtime, not bolted on.

Inference cost opacity. Running multiple agents across multiple LLM providers with no visibility into where spend was going was a recurring operational problem. We needed cost tracking built into the runtime, not an external spreadsheet.

OpenAstra was engineered to solve all four.

What We Shipped

A 5-Tier Memory Architecture

The most significant engineering investment in OpenAstra is the memory system. Instead of a single vector store, we built five distinct memory layers:

  • Working memory — current reasoning context
  • Episodic memory — timestamped record of past actions and outcomes
  • Semantic memory — domain knowledge in vector form
  • Procedural memory — learned tool usage patterns
  • Shared memory — cross-agent state for swarm coordination

Retrieval across these layers uses Reciprocal Rank Fusion (RRF) — merging results from pgvector (semantic search) and Typesense (keyword and structured search) into a single relevance-ranked output. This matters because real-world agent queries are never purely semantic or purely keyword — they're usually both. RRF handles the blend correctly.
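The fusion step itself is simple to state. A minimal sketch of RRF scoring, assuming each backend returns an ordered list of document IDs (the `k = 60` damping constant is the common default from the original RRF formulation, not necessarily what OpenAstra uses):

```typescript
// Reciprocal Rank Fusion: merge ranked result lists into one list.
// Each document scores sum(1 / (k + rank)) across the lists it appears in;
// k dampens the advantage of a single top rank.
function rrfMerge(lists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, i) => {
      const rank = i + 1; // ranks are 1-based in the RRF formula
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Example: a semantic (pgvector-style) list and a keyword (Typesense-style) list.
const semantic = ["doc-a", "doc-b", "doc-c"];
const keyword = ["doc-c", "doc-a", "doc-d"];
console.log(rrfMerge([semantic, keyword])); // doc-a and doc-c rise to the top
```

A document ranked first in one list and second in the other ("doc-a") beats one ranked first and third ("doc-c"), which is exactly the blend behavior you want when neither backend alone tells the whole story.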

Multi-Channel Integration as a First-Class Primitive

OpenAstra normalizes inputs from Slack, GitHub, X, email, and other channels into a consistent event format at the runtime level. The same agent handles a Slack message, a GitHub webhook, and an incoming email without channel-specific code paths.
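The shape of that normalization is easier to see in code. A hypothetical sketch, assuming a single event interface and one adapter per channel (the names `NormalizedEvent`, `fromSlack`, and `fromGitHub` are illustrative, not OpenAstra's actual API):

```typescript
// One event shape, regardless of where the input came from.
interface NormalizedEvent {
  channel: "slack" | "github" | "email";
  sender: string;
  body: string;
  receivedAt: Date;
}

// Adapter: Slack message payload -> normalized event.
function fromSlack(msg: { user: string; text: string; ts: string }): NormalizedEvent {
  return {
    channel: "slack",
    sender: msg.user,
    body: msg.text,
    receivedAt: new Date(Number(msg.ts) * 1000), // Slack timestamps are epoch seconds
  };
}

// Adapter: GitHub webhook payload -> normalized event.
function fromGitHub(hook: { sender: { login: string }; issue?: { title: string } }): NormalizedEvent {
  return {
    channel: "github",
    sender: hook.sender.login,
    body: hook.issue?.title ?? "",
    receivedAt: new Date(),
  };
}

// The agent sees only NormalizedEvent, never channel-specific payloads.
function handle(ev: NormalizedEvent): string {
  return `[${ev.channel}] ${ev.sender}: ${ev.body}`;
}
```

The payoff is that `handle` is the only code path the agent owns; adding a channel means adding an adapter, not touching agent logic.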

This was one of the first things we built, because it was one of the first things we needed. Our internal agents don't live in a single channel — they operate wherever work happens.

Hierarchical Swarms with Permission Sandboxing

The swarm model in OpenAstra allows a root agent to spawn sub-agents with explicitly scoped permissions. Sub-agents can only access the tools and data their parent explicitly grants them — nothing more.

We built this because we needed it ourselves. Running agents with access to everything is fine for a prototype. It's not acceptable in a production environment where agents are touching real business data.
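The invariant is that a child's grants must be a subset of its parent's, enforced at spawn time. A minimal sketch under that assumption (class and permission names are illustrative, not OpenAstra's actual API):

```typescript
// Capability-set sandboxing: an agent can only delegate what it holds.
class Agent {
  constructor(
    public readonly name: string,
    private readonly permissions: ReadonlySet<string>,
  ) {}

  can(perm: string): boolean {
    return this.permissions.has(perm);
  }

  // Spawning fails loudly if the child requests anything the parent lacks,
  // so privilege can only narrow as the hierarchy deepens.
  spawn(name: string, requested: string[]): Agent {
    const denied = requested.filter((p) => !this.permissions.has(p));
    if (denied.length > 0) {
      throw new Error(`${name} requested ungranted permissions: ${denied.join(", ")}`);
    }
    return new Agent(name, new Set(requested));
  }
}
```

A research sub-agent spawned with only `["web.search", "fs.read"]` can never hand a grandchild `"deploy"`, no matter what the root agent holds, because the check runs against the immediate parent at every level.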

106 Skills, 64 Tools, 10 LLM Providers

We packaged everything our team had built and reused across projects: 106 skills (higher-level capabilities like research, summarization, and content generation) and 64 core tools (web search, file operations, API calls, and more). These are the building blocks that let you deploy a capable agent without writing custom tool integrations from scratch.

LLM support covers 10 providers — OpenAI, Anthropic Claude, Google Gemini, local models via Ollama, and others. The provider abstraction means different agents can use different models based on what the task warrants, which is how you keep inference costs reasonable at scale.
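In practice that routing is a policy function from task to model. A hedged sketch of what per-task routing might look like; the task kinds, provider choices, and model names below are placeholder examples, not OpenAstra's actual routing policy:

```typescript
type TaskKind = "classify" | "summarize" | "plan";

interface ModelChoice {
  provider: string;
  model: string;
}

// Route cheap, high-volume work to a local model; reserve hosted frontier
// models for tasks that actually need the reasoning depth.
function pickModel(kind: TaskKind): ModelChoice {
  switch (kind) {
    case "classify":
      return { provider: "ollama", model: "llama3" }; // local, zero marginal cost
    case "summarize":
      return { provider: "openai", model: "gpt-4o-mini" }; // inexpensive hosted
    case "plan":
      return { provider: "anthropic", model: "claude-sonnet" }; // strongest reasoning
  }
}
```

The point of the abstraction is that this policy lives in one place: changing which model handles classification is a one-line edit, not a refactor of every agent.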

Dream Mode

One of our own workflow needs that became a feature: running agents unattended overnight for long-horizon tasks. Research pipelines, batch data processing, analysis jobs that don't need real-time supervision. We called it Dream Mode — agents working while you sleep. It's a first-class runtime configuration, not a hack.
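As a configuration surface, an unattended run mostly needs a schedule, a budget, and a failure policy. A hedged sketch of what such a config might contain; every field name here is illustrative, not OpenAstra's actual schema:

```typescript
// Illustrative unattended-run configuration (field names are assumptions).
const dreamConfig = {
  mode: "dream",
  schedule: "0 1 * * *",         // cron syntax: start at 01:00
  maxRuntimeHours: 6,            // hard stop before the workday begins
  tasks: ["research-pipeline", "batch-analysis"],
  onFailure: "log-and-continue", // nobody is watching; don't block the queue
} as const;
```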

Hot Reload and Cost Tracking

Two things that matter the moment you're running agents in production rather than development:

Hot reload — code changes apply without a runtime restart. We needed this for fast iteration on our own agents, and it stays in.

Cost tracking dashboards — every LLM call is tracked by agent, task, and provider over time. When you're running multiple agents across multiple models, this is the instrumentation that keeps the economics manageable.
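The accounting model behind such a dashboard is small: record every call with its dimensions, then aggregate along any of them. A sketch under that assumption (the per-1K-token rates below are made-up placeholders, not real provider pricing):

```typescript
interface LlmCall {
  agent: string;
  task: string;
  provider: string;
  inputTokens: number;
  outputTokens: number;
}

// Placeholder $ per 1K tokens; real rates vary by provider and model.
const RATES: Record<string, { in: number; out: number }> = {
  openai: { in: 0.005, out: 0.015 },
  anthropic: { in: 0.003, out: 0.015 },
};

function costOf(call: LlmCall): number {
  const r = RATES[call.provider] ?? { in: 0, out: 0 };
  return (call.inputTokens / 1000) * r.in + (call.outputTokens / 1000) * r.out;
}

// Group total spend by any recorded dimension: agent, task, or provider.
function totalBy(calls: LlmCall[], key: "agent" | "task" | "provider"): Map<string, number> {
  const totals = new Map<string, number>();
  for (const c of calls) {
    totals.set(c[key], (totals.get(c[key]) ?? 0) + costOf(c));
  }
  return totals;
}
```

Because every call carries all three dimensions, the same log answers "which agent is expensive?", "which task?", and "which provider?" without separate instrumentation.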

Why We Open-Sourced It

We build agentic systems for clients, and we built OpenAstra to power that work more efficiently. Open-sourcing it isn't charity — it's the right decision for a runtime that benefits from community tooling, external skill contributions, and broad adoption driving the ecosystem forward.

The MIT license means no strings. Self-hosted means your data stays yours. Fully auditable code means no black boxes in your production infrastructure.

If you're building agents that need to run in the real world, OpenAstra is where we'd start.
