Understanding Osprey#

The Osprey Framework is a production-ready architecture for deploying agentic AI in large-scale, safety-critical control system environments. Built on LangGraph’s StateGraph foundation, it centers on Classification and Orchestration: capability selection followed by plan-first execution planning. This yields transparent, auditable multi-step operations with mandatory safety checks for hardware-interacting workflows.

Figure: Osprey Framework Workflow Overview
Production Deployment Example: This diagram illustrates the framework architecture using capabilities from the ALS Accelerator Assistant, our production deployment at Lawrence Berkeley National Laboratory’s Advanced Light Source particle accelerator.

Processing Pipeline#

All user interactions—from CLI, web interfaces, or external integrations—flow through a unified Gateway that normalizes input and coordinates the processing pipeline. The framework converts natural-language requests into transparent, executable plans through four stages:

1. Task Extraction

Converts conversational context into structured, actionable objectives. Transforms arbitrarily long chat history and external data sources into a well-defined current task with explicit requirements and context. Integrates facility-specific data from channel databases, archiver systems, operational memory, and knowledge bases to enrich task understanding.

2. Classification

Dynamically selects relevant capabilities from your domain-specific toolkit. LLM-powered binary classification for each capability ensures only relevant prompts are used in downstream processes, preventing prompt explosion as facilities expand their capability inventories.

3. Orchestration

Generates complete execution plans with explicit dependencies and human oversight. Plans are created upfront before any capability execution begins, enabling operator review of all proposed control system operations and preventing capability hallucination.

4. Execution

Executes capabilities with checkpointing, artifact management, and comprehensive safety enforcement. Pattern detection and static analysis identify hardware writes, PV boundary checking verifies setpoints against facility-defined limits, and approval workflows ensure operator oversight before any control system interaction. Plans execute step-by-step with LangGraph interrupts for human approval and containerized isolation for generated code.
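
As a minimal sketch of the boundary-checking idea described above (the PV name, limit table, and check_setpoint function are hypothetical, not the framework’s actual API), a proposed setpoint write could be refused before it ever reaches the approval workflow:

# Hypothetical sketch of PV boundary checking; names are illustrative only.
PV_LIMITS = {
    # facility-defined operating envelope per process variable: (low, high)
    "SR:QUAD:01:CURRENT_SP": (0.0, 120.0),
}

def check_setpoint(pv: str, value: float) -> None:
    # Refuse writes to PVs with no defined limits, and writes outside the
    # envelope, before the operation is presented for operator approval.
    if pv not in PV_LIMITS:
        raise ValueError(f"No limits defined for {pv}; write refused")
    low, high = PV_LIMITS[pv]
    if not low <= value <= high:
        raise ValueError(f"{pv}={value} is outside the allowed range [{low}, {high}]")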

Framework Functions#

Task extraction converts conversational input into structured, actionable tasks:

# Chat history becomes focused task
current_task = await _extract_task(
    messages=state["messages"],
    retrieval_result=data_manager.retrieve_all_context(request),
)
  • Context Compression: Refines lengthy conversations into precise, actionable tasks

  • Datasource Integration: Enhances tasks with structured data from external sources

  • Self-Contained Output: Produces tasks that are executable without relying on prior conversation history (see the sketch below)
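
As a purely illustrative sketch (the ExtractedTask class and its fields are assumptions, not Osprey’s actual schema), a self-contained task carries everything downstream stages need without re-reading the chat history:

# Hypothetical shape of an extracted task; field names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class ExtractedTask:
    description: str                                   # one well-defined objective
    requirements: list = field(default_factory=list)   # explicit constraints
    context: dict = field(default_factory=dict)        # enriched facility data

task = ExtractedTask(
    description="Plot the last 24 hours of beam current",
    requirements=["use archiver data", "return a PNG artifact"],
    context={"pv": "SR:BEAM:CURRENT"},
)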

Task classification determines which capabilities are relevant:

# Each capability gets yes/no decision
active_capabilities = await classify_task(
    task=state.current_task,
    available_capabilities=registry.get_all_capabilities()
)
  • Binary Decisions: Yes/no for each capability

  • Prompt Efficiency: Only relevant capabilities loaded

  • Parallel Processing: Independent classification decisions (sketched below)
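
A rough sketch of that mechanism, assuming a hypothetical async LLM client whose ainvoke returns a string (none of these names are the framework’s actual API):

# Illustrative sketch of parallel yes/no classification; not the framework's code.
import asyncio

async def is_relevant(llm, task, capability) -> bool:
    # One small yes/no prompt per capability keeps prompts short as the
    # capability inventory grows.
    prompt = (
        f"Task: {task}\n"
        f"Capability: {capability.name} - {capability.description}\n"
        "Is this capability needed to complete the task? Answer yes or no."
    )
    answer = await llm.ainvoke(prompt)  # assumed async client returning a string
    return answer.strip().lower().startswith("yes")

async def select_capabilities(llm, task, available_capabilities):
    # Decisions are independent, so they run concurrently.
    decisions = await asyncio.gather(
        *(is_relevant(llm, task, cap) for cap in available_capabilities)
    )
    return [cap for cap, keep in zip(available_capabilities, decisions) if keep]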

Orchestration creates complete execution plans before any capability runs:

# Generate validated execution plan
execution_plan = await create_execution_plan(
    task=state.current_task,
    capabilities=state.active_capabilities
)
  • Upfront Planning: Complete plans before execution

  • Plan Validation: Prevents capability hallucination (illustrated below)

  • Deterministic Execution: Router follows predetermined steps
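
A minimal sketch of the validation idea (validate_plan and the step/capability attributes are assumptions, not Osprey’s API): every planned step must name a capability that classification actually activated.

# Hypothetical plan validation; step and capability attributes are illustrative.
def validate_plan(execution_plan, active_capabilities):
    # Any step naming a capability outside the active set is treated as a
    # hallucinated capability, and the plan is rejected before execution.
    allowed = {cap.name for cap in active_capabilities}
    for step in execution_plan.steps:
        if step.capability not in allowed:
            raise ValueError(
                f"Step '{step.name}' references unknown capability '{step.capability}'"
            )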

State management maintains conversation context and execution state:

# Persistent context across conversations
StateManager.store_context(
    state, "PV_ADDRESSES", context_key, pv_data
)
  • Conversation Persistence: Context survives restarts

  • Execution Tracking: Current step and progress

  • Context Isolation: Capability-specific data storage (see below)
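
Conceptually (a simplified sketch, not the real StateManager implementation), stored context is namespaced by context type and key so capabilities cannot overwrite each other’s data:

# Simplified sketch of namespaced context storage; not the real StateManager.
def store_context(store, context_type, context_key, value):
    # Each context type gets its own namespace inside the persisted state.
    store.setdefault(context_type, {})[context_key] = value

capability_context = {}
store_context(capability_context, "PV_ADDRESSES", "beam_current", ["SR:BEAM:CURRENT"])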

Approval workflows provide human oversight through LangGraph interrupts:

# Request human approval
if requires_approval:
    interrupt(approval_data)
    # Execution pauses for human decision
  • Planning Approval: Review execution plans before running

  • Code Approval: Human review of generated code

  • Native Integration: Built on LangGraph interrupts (see the pattern below)
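
The underlying LangGraph pattern looks roughly like this; the node, state fields, and approval payload are illustrative, while interrupt and Command are LangGraph’s public API:

# Sketch of the LangGraph interrupt/resume pattern; node and state names are illustrative.
from langgraph.types import Command, interrupt

def approval_node(state):
    # interrupt() pauses the run and surfaces the payload to the caller;
    # when the run is resumed, it returns the value supplied by the human.
    decision = interrupt({"plan": state["execution_plan"]})
    if not decision.get("approved"):
        raise RuntimeError("Plan rejected by operator")
    return state

# The caller later resumes the paused thread with the operator's decision:
# graph.invoke(Command(resume={"approved": True}), config=thread_config)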

🚀 Next Steps

Now that you understand the framework’s core concepts, explore the architectural principles that make it production-ready and scalable:

🏗️ Infrastructure Architecture

Gateway-driven pipeline, component coordination, and the three-pillar processing architecture

Infrastructure Architecture: Classification-Orchestration Pipeline

🔧 Convention over Configuration

Configuration-driven component loading, decorator-based registration, and eliminating boilerplate

Convention over Configuration: Configuration-Driven Registry Patterns

🔗 LangGraph Integration

StateGraph workflows, native checkpointing, interrupts, and streaming support

LangGraph Integration: Native StateGraph and Workflow Execution

🎯 Orchestrator-First Philosophy

Why upfront planning outperforms reactive tool calling and improves reliability

Orchestrator-First Architecture: Upfront Planning in Practice