Registry and Discovery

The Osprey Framework implements a centralized component registration and discovery system that enables clean separation between framework infrastructure and application-specific functionality.

System Overview

The registry system uses a two-tier architecture:

Framework Registry: Core infrastructure components (nodes, base capabilities, services)
Application Registries: Domain-specific components (capabilities, context classes, data sources)

Applications register components by implementing RegistryConfigProvider in their registry.py module. The framework loads registries from the path specified in configuration:

# config.yml
registry_path: ./src/my_app/registry.py
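As a rough illustration of what loading a registry from `registry_path` can involve, the sketch below imports a Python file by path and instantiates the single provider class it defines. The function name `load_registry_provider` is hypothetical, and the duck-typed check for a `get_registry_config` method is an assumption made so the example is self-contained; the framework itself works with `RegistryConfigProvider` subclasses and may load modules differently.

```python
import importlib.util
import inspect
from pathlib import Path


def load_registry_provider(registry_path: str):
    """Import the file at registry_path and return an instance of its provider class.

    Assumption: a "provider" is any class defined in that file that has a
    get_registry_config method. The real framework checks for
    RegistryConfigProvider subclasses instead.
    """
    path = Path(registry_path).resolve()
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load a registry module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Keep only classes defined in this file (not imported ones).
    providers = [
        cls
        for _, cls in inspect.getmembers(module, inspect.isclass)
        if cls.__module__ == module.__name__ and hasattr(cls, "get_registry_config")
    ]
    if len(providers) != 1:
        raise ValueError(
            f"Expected exactly one provider class in {path}, found {len(providers)}"
        )
    return providers[0]()
```

Rejecting files with zero or several provider classes mirrors the guideline of implementing exactly one provider per application.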

Registry Modes

The registry system supports two modes for application registries. The mode is detected automatically from the type of configuration object the registry provider returns. Most applications should use Extend Mode, the recommended default.

Extend Mode (Recommended)

Extend Mode allows applications to build on top of framework defaults by adding domain-specific components while automatically inheriting all framework infrastructure.

When to use:

Most applications (95% of use cases)
When you want automatic framework features
When you need to add domain-specific components
When you want easier framework upgrades

How it works:

Framework components load first, then application components merge in. Applications can add, exclude, or override specific framework components.
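The merge behaviour described here can be pictured with a small, framework-independent sketch. It assumes components are plain dicts keyed by a `name` field and that an application entry with the same name replaces the framework entry; the helper `merge_components` is hypothetical and is not the framework's actual merge code.

```python
def merge_components(framework, application, exclude=()):
    """Merge name-keyed component lists: framework first, then application.

    Assumptions for this sketch: each component is a dict with a "name" key,
    and an application component with an existing name overrides it.
    """
    excluded = set(exclude)
    # Framework components load first, minus anything explicitly excluded.
    merged = {c["name"]: c for c in framework if c["name"] not in excluded}
    # Application components are merged in: new names are added,
    # existing names are overridden in place.
    for component in application:
        merged[component["name"]] = component
    return list(merged.values())
```

For example, excluding `"python"` while registering an application component named `"memory"` would drop the first framework entry and replace the second.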

Type marker:

Returns ExtendedRegistryConfig (via extend_framework_registry() helper)

Example:

from osprey.registry import (
    RegistryConfigProvider,
    extend_framework_registry,
    CapabilityRegistration,
    ContextClassRegistration,
    ExtendedRegistryConfig
)

class MyAppRegistryProvider(RegistryConfigProvider):
    def get_registry_config(self) -> ExtendedRegistryConfig:
        return extend_framework_registry(
            capabilities=[
                CapabilityRegistration(
                    name="weather",
                    module_path="my_app.capabilities.weather",
                    class_name="WeatherCapability",
                    description="Get weather data",
                    provides=["WEATHER_DATA"],
                    requires=[]
                )
            ],
            context_classes=[
                ContextClassRegistration(
                    context_type="WEATHER_DATA",
                    module_path="my_app.context_classes",
                    class_name="WeatherContext"
                )
            ]
        )

Benefits:

Less boilerplate code (only specify your components)
Easier framework upgrades (new features automatically included)
Full framework capability access without explicit registration
Flexible (can exclude or override framework components as needed)

Standalone Mode (Advanced)

Standalone Mode gives applications complete control by requiring them to define ALL components, including framework infrastructure. The framework registry is NOT loaded.

When to use:

Special cases requiring full control over all components
Minimal deployments that don’t need the full framework
Custom framework variations

How it works:

The application provides a complete registry, including ALL framework components. The framework registry is skipped entirely.

Type marker:

Returns RegistryConfig directly (not via helper)

Example:

from osprey.registry import (
    RegistryConfigProvider,
    RegistryConfig,
    NodeRegistration,
    CapabilityRegistration,
    # ... all other registration types
)

class StandaloneRegistryProvider(RegistryConfigProvider):
    def get_registry_config(self) -> RegistryConfig:
        return RegistryConfig(
            # Must provide ALL framework nodes
            core_nodes=[
                NodeRegistration(
                    name="orchestrator",
                    module_path="osprey.langgraph.nodes.orchestrator",
                    function_name="orchestrator_node",
                    description="Core orchestration"
                ),
                # ... ALL other framework nodes required
            ],
            # Must provide ALL framework capabilities
            capabilities=[
                CapabilityRegistration(
                    name="memory",
                    module_path="osprey.capabilities.memory",
                    # ... complete framework memory capability
                ),
                # ... ALL other framework capabilities + app capabilities
            ],
            # ... ALL other component types
        )

Warning

Standalone mode requires maintaining a complete list of framework components. This adds significant maintenance overhead and is recommended only for advanced use cases. Framework upgrades may require registry updates.

Mode Detection:

The RegistryManager automatically detects the mode by checking the type returned by RegistryConfigProvider.get_registry_config():

isinstance(config, ExtendedRegistryConfig) → Extend Mode (load framework, then merge)
isinstance(config, RegistryConfig) but not ExtendedRegistryConfig → Standalone Mode (skip framework)
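The two checks above only work in that order if `ExtendedRegistryConfig` is a subclass of `RegistryConfig`, which the wording "but not ExtendedRegistryConfig" suggests. The sketch below uses stand-in classes under that assumption; the class bodies and the `detect_mode` function are illustrative, not the framework's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryConfig:
    """Stand-in for the framework's RegistryConfig (fields are illustrative)."""
    capabilities: list = field(default_factory=list)


@dataclass
class ExtendedRegistryConfig(RegistryConfig):
    """Stand-in for the marker type returned by extend_framework_registry()."""


def detect_mode(config) -> str:
    # Test the subclass first: an ExtendedRegistryConfig is also a RegistryConfig,
    # so reversing these checks would classify every config as standalone.
    if isinstance(config, ExtendedRegistryConfig):
        return "extend"      # load framework registry, then merge
    if isinstance(config, RegistryConfig):
        return "standalone"  # skip framework registry
    raise TypeError(f"Unsupported registry config type: {type(config).__name__}")
```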

RegistryManager

The RegistryManager provides centralized access to all framework components:

from osprey.registry import initialize_registry, get_registry

# Initialize the registry system
initialize_registry()

# Access the singleton registry instance
registry = get_registry()

# Access components
capability = registry.get_capability("weather_data_retrieval")
context_class = registry.get_context_class("WEATHER_DATA")
data_source = registry.get_data_source("knowledge_base")

Key Methods:

get_capability(name) - Get capability instance by name
get_context_class(context_type) - Get context class by type identifier
get_data_source(name) - Get data source provider instance
get_node(name) - Get LangGraph node function

Advanced Registry Patterns

This section covers advanced patterns for customizing your application registry beyond the basic examples shown in Registry Modes above.

Excluding and Overriding Framework Components

You can selectively exclude or replace framework components to customize behavior:

return extend_framework_registry(

    # Exclude generic Python capability (replaced with specialized turbine_analysis)
    exclude_capabilities=["python"],

    capabilities=[
        # Specialized analysis replaces generic Python capability
        CapabilityRegistration(
            name="turbine_analysis",
            module_path="my_app.capabilities.turbine_analysis",
            class_name="TurbineAnalysisCapability",
            description="Analyze turbine performance with domain-specific logic",
            provides=["ANALYSIS_RESULTS"],
            requires=["TURBINE_DATA", "WEATHER_DATA"]
        )
    ],
    context_classes=[...],

    # Add data sources
    data_sources=[
        DataSourceRegistration(
            name="knowledge_base",
            module_path="my_app.data_sources.knowledge",
            class_name="KnowledgeProvider",
            description="Domain knowledge retrieval"
        )
    ],

    # Add custom framework prompt providers
    framework_prompt_providers=[
        FrameworkPromptProviderRegistration(
            module_path="my_app.framework_prompts",
            prompt_builders={"response_generation": "CustomResponseBuilder"}
        )
    ]
)

Component Registration

Capability Registration:

# In registry.py
CapabilityRegistration(
    name="weather_data_retrieval",
    module_path="my_app.capabilities.weather_data_retrieval",
    class_name="WeatherDataRetrievalCapability",
    description="Retrieve weather data for analysis",
    provides=["WEATHER_DATA"],
    requires=["TIME_RANGE"]
)

# Implementation in src/my_app/capabilities/weather_data_retrieval.py
from typing import Any, Dict

from osprey.base import BaseCapability, capability_node

@capability_node
class WeatherDataRetrievalCapability(BaseCapability):
    name = "weather_data_retrieval"
    description = "Retrieve weather data for analysis"
    provides = ["WEATHER_DATA"]
    requires = ["TIME_RANGE"]

    async def execute(self) -> Dict[str, Any]:
        # Get required contexts automatically
        time_range, = self.get_required_contexts()
        # Implementation here: build weather_data (the WEATHER_DATA context) from time_range
        return self.store_output_context(weather_data)

Context Class Registration:

# In registry.py
ContextClassRegistration(
    context_type="WEATHER_DATA",
    module_path="my_app.context_classes",
    class_name="WeatherDataContext"
)

# Implementation in src/my_app/context_classes.py
from datetime import datetime
from typing import ClassVar

from osprey.context.base import CapabilityContext

class WeatherDataContext(CapabilityContext):
    CONTEXT_TYPE: ClassVar[str] = "WEATHER_DATA"
    CONTEXT_CATEGORY: ClassVar[str] = "LIVE_DATA"

    location: str
    temperature: float
    conditions: str
    timestamp: datetime

Data Source Registration:

# In registry.py
DataSourceRegistration(
    name="knowledge_base",
    module_path="my_app.data_sources.knowledge",
    class_name="KnowledgeProvider",
    description="Domain knowledge retrieval"
)

AI Provider Registration

Applications can register custom AI providers for institutional services or commercial providers not included in the framework.

Basic Provider Registration:

# In src/my_app/registry.py
from osprey.registry import RegistryConfigProvider, ProviderRegistration
from osprey.registry.helpers import extend_framework_registry

class MyAppRegistryProvider(RegistryConfigProvider):
    def get_registry_config(self):
        return extend_framework_registry(
            capabilities=[...],
            context_classes=[...],
            providers=[
                ProviderRegistration(
                    module_path="my_app.providers.azure",
                    class_name="AzureOpenAIProviderAdapter"
                ),
                ProviderRegistration(
                    module_path="my_app.providers.institutional",
                    class_name="InstitutionalAIProvider"
                )
            ]
        )

Excluding Framework Providers:

You can exclude framework providers if you want to use only custom providers:

return extend_framework_registry(
    capabilities=[...],
    providers=[
        ProviderRegistration(
            module_path="my_app.providers.custom",
            class_name="CustomProvider"
        )
    ],
    exclude_providers=["anthropic", "google"]  # Exclude specific framework providers
)

Replacing Framework Providers:

To replace a framework provider with a custom implementation:

return extend_framework_registry(
    capabilities=[...],
    override_providers=[
        ProviderRegistration(
            module_path="my_app.providers.custom_openai",
            class_name="CustomOpenAIProvider"
        )
    ],
    exclude_providers=["openai"]  # Remove framework version
)

Provider Implementation:

Custom providers are useful for integrating institutional AI services (e.g., Stanford AI Playground, LBNL CBorg), commercial providers not yet in the framework (e.g., Cohere, Mistral AI), or self-hosted models with custom endpoints.

All custom providers must inherit from BaseProvider and implement three core methods:

create_model() - Create PydanticAI model instances for agent workflows
execute_completion() - Execute direct API calls (used by infrastructure nodes)
check_health() - Test connectivity and authentication (used by osprey health CLI)

# Implementation in src/my_app/providers/institutional.py
from typing import Any
from osprey.models.providers.base import BaseProvider
import httpx
import openai
from pydantic_ai.models.openai import OpenAIModel

class InstitutionalAIProvider(BaseProvider):
    """Custom provider for institutional AI service."""

    # Provider metadata - displayed in CLI and used by framework
    name = "institutional_ai"
    description = "Institutional AI Service (Custom Models)"
    requires_api_key = True
    requires_base_url = True
    requires_model_id = True
    supports_proxy = True
    default_base_url = "https://ai.institution.edu/v1"
    default_model_id = "gpt-4"
    health_check_model_id = "gpt-3.5-turbo"  # Cheapest for health checks
    available_models = ["gpt-4", "gpt-3.5-turbo", "custom-model"]

    # API key acquisition info (shown in CLI help)
    api_key_url = "https://ai.institution.edu/api-keys"
    api_key_instructions = [
        "Log in with institutional credentials",
        "Navigate to API Keys section",
        "Generate new API key",
        "Copy and save the key securely"
    ]
    api_key_note = "Requires active institutional affiliation"

    def create_model(
        self,
        model_id: str,
        api_key: str | None,
        base_url: str | None,
        timeout: float | None,
        http_client: httpx.AsyncClient | None
    ) -> OpenAIModel:
        """Create model instance for PydanticAI agents."""
        from pydantic_ai.providers.openai import OpenAIProvider

        # For OpenAI-compatible APIs, use OpenAI client
        if http_client:
            client = openai.AsyncOpenAI(
                api_key=api_key,
                base_url=base_url,
                http_client=http_client
            )
        else:
            client = openai.AsyncOpenAI(
                api_key=api_key,
                base_url=base_url,
                timeout=timeout or 60.0
            )

        return OpenAIModel(
            model_name=model_id,
            provider=OpenAIProvider(openai_client=client)
        )

    def execute_completion(
        self,
        message: str,
        model_id: str,
        api_key: str | None,
        base_url: str | None,
        max_tokens: int = 1024,
        temperature: float = 0.0,
        thinking: dict | None = None,
        system_prompt: str | None = None,
        output_format: Any | None = None,
        **kwargs
    ) -> str | Any:
        """Execute direct chat completion."""
        client = openai.OpenAI(api_key=api_key, base_url=base_url)

        messages = [{"role": "user", "content": message}]
        if system_prompt:
            messages.insert(0, {"role": "system", "content": system_prompt})

        # Handle structured output if requested
        if output_format:
            response = client.beta.chat.completions.parse(
                model=model_id,
                messages=messages,
                response_format=output_format,
                max_tokens=max_tokens,
                temperature=temperature
            )
            return response.choices[0].message.parsed
        else:
            response = client.chat.completions.create(
                model=model_id,
                messages=messages,
                max_tokens=max_tokens,
                temperature=temperature
            )
            return response.choices[0].message.content

    def check_health(
        self,
        api_key: str | None,
        base_url: str | None,
        timeout: float = 5.0,
        model_id: str | None = None
    ) -> tuple[bool, str]:
        """Test provider connectivity and authentication."""
        if not api_key:
            return False, "API key not configured"

        try:
            client = openai.OpenAI(
                api_key=api_key,
                base_url=base_url,
                timeout=timeout
            )
            # Minimal test with cheapest model
            test_model = model_id or self.health_check_model_id
            response = client.chat.completions.create(
                model=test_model,
                messages=[{"role": "user", "content": "Hi"}],
                max_tokens=5
            )
            if response.choices:
                return True, "API accessible and authenticated"
            return False, "API responded but no completion"
        except Exception as e:
            error_msg = str(e).lower()
            if "authentication" in error_msg or "api key" in error_msg:
                return False, "Authentication failed (check API key)"
            elif "timeout" in error_msg:
                return False, "Request timeout"
            else:
                return False, f"API error: {str(e)[:50]}"

Once registered, custom providers integrate seamlessly with the framework:

Available in osprey init interactive provider selection
Accessible via osprey.models.get_model() and osprey.models.get_chat_completion()
Tested automatically by osprey health command
Configuration managed in config.yml like framework providers

📋 Complete Provider Metadata Reference

Required Metadata Attributes:

Attribute	Description
`name`	Provider identifier (e.g., `"azure_openai"`, `"institutional_ai"`)
`description`	User-friendly description shown in TUI (e.g., `"Azure OpenAI (Enterprise)"`)
`requires_api_key`	`True` if provider needs API key for authentication
`requires_base_url`	`True` if provider needs custom base URL
`requires_model_id`	`True` if provider requires model ID specification
`supports_proxy`	`True` if provider supports HTTP proxy configuration

Optional Metadata Attributes:

Attribute	Description
`default_base_url`	Default API endpoint (e.g., `"https://api.openai.com/v1"`)
`default_model_id`	Recommended model for general use (used in project templates)
`health_check_model_id`	Cheapest/fastest model for `osprey health` checks
`available_models`	List of model IDs for CLI selection (e.g., `["gpt-4", "gpt-3.5-turbo"]`)
`api_key_url`	URL where users obtain API keys (e.g., `"https://console.anthropic.com/"`)
`api_key_instructions`	Step-by-step list of strings for obtaining the key
`api_key_note`	Additional requirements (e.g., `"Requires institutional affiliation"`)

Implementation Notes:

For OpenAI-compatible APIs, reuse OpenAIModel from PydanticAI
For provider-specific SDKs (Google, Anthropic), use their PydanticAI model classes
check_health() should use minimal tokens (typically 5-10 tokens, ~$0.0001 cost)
The caller of create_model() owns the http_client lifecycle; do not close it in the provider
All metadata is introspected from class attributes (single source of truth)
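To make the last note concrete, here is a minimal sketch of reading provider metadata from class attributes without instantiating the provider. The attribute names come from the reference tables above; the helper `collect_provider_metadata` and the `DemoProvider` class are hypothetical, and the framework's own introspection may behave differently (for example in how it treats missing attributes).

```python
REQUIRED_ATTRS = (
    "name", "description", "requires_api_key",
    "requires_base_url", "requires_model_id", "supports_proxy",
)
OPTIONAL_ATTRS = (
    "default_base_url", "default_model_id", "health_check_model_id",
    "available_models", "api_key_url", "api_key_instructions", "api_key_note",
)


def collect_provider_metadata(provider_cls: type) -> dict:
    """Read provider metadata from class attributes (no instance needed)."""
    missing = [attr for attr in REQUIRED_ATTRS if not hasattr(provider_cls, attr)]
    if missing:
        raise AttributeError(
            f"{provider_cls.__name__} is missing required metadata: {', '.join(missing)}"
        )
    metadata = {attr: getattr(provider_cls, attr) for attr in REQUIRED_ATTRS}
    # Optional attributes fall back to None when the class does not define them
    # (an assumption of this sketch).
    metadata.update({attr: getattr(provider_cls, attr, None) for attr in OPTIONAL_ATTRS})
    return metadata


class DemoProvider:
    """Hypothetical provider used only to exercise the helper."""
    name = "institutional_ai"
    description = "Institutional AI Service (Custom Models)"
    requires_api_key = True
    requires_base_url = True
    requires_model_id = True
    supports_proxy = True
    default_model_id = "gpt-4"
```

Keeping the metadata on the class means a CLI can list and describe providers before any API key or client object exists.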

Registry Initialization and Usage

from osprey.registry import initialize_registry, get_registry

# Initialize registry (loads framework + application components)
initialize_registry()

# Access registry throughout application
registry = get_registry()

# Get capabilities
capability = registry.get_capability("weather_data_retrieval")
all_capabilities = registry.get_all_capabilities()

# Get context classes
weather_context_class = registry.get_context_class("WEATHER_DATA")

# Get data sources
knowledge_provider = registry.get_data_source("knowledge_base")

Registry Export for External Tools:

The registry system automatically exports metadata during initialization for use by external tools and debugging:

# Export happens automatically during initialization
initialize_registry(auto_export=True)  # Default behavior

# Manual export for debugging or integration
registry = get_registry()
export_data = registry.export_registry_to_json("/path/to/export")

# Export creates standardized JSON files:
# - registry_export.json (complete metadata)
# - capabilities.json (capability definitions)
# - context_types.json (context type definitions)

Export Configuration:

The default export directory is configured in config.yml:

file_paths:
  agent_data_dir: _agent_data
  registry_exports_dir: registry_exports

This creates exports in _agent_data/registry_exports/ relative to the project root. The path can be customized through the configuration system.
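The path composition described above can be sketched with `pathlib`. The function `resolve_export_dir` is hypothetical and simply joins the two configured values under a project root; how the framework actually resolves these settings is not shown in this document.

```python
from pathlib import Path


def resolve_export_dir(project_root: str, file_paths: dict) -> Path:
    """Join project root, agent data dir, and registry exports dir.

    file_paths is the parsed `file_paths` mapping from config.yml.
    """
    return (
        Path(project_root)
        / file_paths["agent_data_dir"]
        / file_paths["registry_exports_dir"]
    )


export_dir = resolve_export_dir(
    "/srv/my_app",  # illustrative project root
    {"agent_data_dir": "_agent_data", "registry_exports_dir": "registry_exports"},
)
```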

Component Loading Order

Components are loaded lazily during registry initialization:

1. Context classes - Required by capabilities
2. Data sources - Required by capabilities
3. Providers - AI model providers
4. Core nodes - Infrastructure components
5. Services - Internal LangGraph service graphs
6. Capabilities - Domain-specific functionality
7. Framework prompt providers - Application-specific prompts
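The ordering above can be sketched as a loop over component types, where each registration only stores a module path and attribute name and the import happens when its turn comes. The names `LOAD_ORDER` and `load_in_order`, and the tuple layout of a registration, are assumptions made for this illustration rather than the framework's internals.

```python
import importlib

# Component types in the order listed above.
LOAD_ORDER = [
    "context_classes",
    "data_sources",
    "providers",
    "core_nodes",
    "services",
    "capabilities",
    "framework_prompt_providers",
]


def load_in_order(registrations: dict) -> dict:
    """Import registered components type by type.

    registrations maps a component type to a list of
    (name, module_path, attribute_name) tuples (hypothetical layout).
    Modules are imported only here, not when the registration is declared.
    """
    loaded = {}
    for component_type in LOAD_ORDER:
        for name, module_path, attribute_name in registrations.get(component_type, []):
            module = importlib.import_module(module_path)
            loaded[(component_type, name)] = getattr(module, attribute_name)
    return loaded
```

Because context classes and data sources come first, a capability loaded later can rely on the types it requires already being available.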

Best Practices and Troubleshooting

Best Practices

Registry Configuration:

Keep registrations simple and focused
Use clear, descriptive names and descriptions
Define provides and requires accurately for dependency tracking
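One way to sanity-check `provides` and `requires` before starting the application is to look for required context types that no registered capability provides. The helper below is a hypothetical, standalone sketch that works on plain dicts; it is not the framework's dependency tracking.

```python
def find_unmet_requirements(capabilities):
    """Return {capability name: [required context types nobody provides]}.

    Each capability is a dict with "name", "provides", and "requires" keys
    (an illustrative stand-in for CapabilityRegistration).
    """
    provided = {ctx for cap in capabilities for ctx in cap["provides"]}
    unmet = {}
    for cap in capabilities:
        missing = [ctx for ctx in cap["requires"] if ctx not in provided]
        if missing:
            unmet[cap["name"]] = missing
    return unmet
```

In Extend Mode the input list would need to include the framework's capabilities as well, since a context type such as `TIME_RANGE` may be provided by the framework rather than by the application.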

Capability Implementation:

Always use @capability_node decorator
Implement required attributes: name, description
Make execute() an async method (the capability example above defines it as an instance method taking self)
Return dictionary of state updates

Application Structure:

Place registry in src/{app_name}/registry.py
Implement exactly one RegistryConfigProvider class per application
Organize capabilities in src/{app_name}/capabilities/ directory
Configure registry location in config.yml via registry_path

Common Issues

Component Not Found:

Verify component is registered in RegistryConfigProvider
Check module path and class name are correct
Ensure initialize_registry() was called
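When a component cannot be found, a quick check of the registered `module_path` and `class_name` often narrows the problem down. The sketch below uses only the standard library; `check_registration` is a hypothetical helper, not part of the framework.

```python
import importlib
from typing import Optional


def check_registration(module_path: str, class_name: str) -> Optional[str]:
    """Return None if module_path imports and defines class_name, else a message."""
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        # Covers ModuleNotFoundError (typo in the path) and import-time failures.
        return f"Cannot import module '{module_path}': {exc}"
    if not hasattr(module, class_name):
        return f"Module '{module_path}' has no attribute '{class_name}'"
    return None
```

Running it against each registration, for instance `check_registration("my_app.capabilities.weather", "WeatherCapability")`, distinguishes a wrong module path from a wrong class name.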

Missing @capability_node:

Ensure @capability_node decorator is applied
Verify name and description class attributes exist
Check that execute() is implemented as an async method with the same signature as in the capability example

Registry Export Issues:

Check that export directory is writable and accessible
Verify auto_export=True during initialization for automatic exports
Use manual export_registry_to_json() for debugging specific states