ARIEL Search Service

ARIEL (Agentic Retrieval Interface for Electronic Logbooks) is a logbook search service built on Osprey’s registry-based plugin architecture. It provides multiple search strategies — keyword, semantic, RAG, and agentic — through pluggable search modules that are discovered and composed at runtime. A capabilities-driven ingestion pipeline with registry-discovered adapters and enhancement modules keeps the search index up to date, while a deterministic execution pipeline coordinates retrieval, fusion, and answer generation for each query.

See also

  • Logbook Search Service (ARIEL) – Architecture overview, design rationale, and getting started

  • Search Modes – Search module implementation, pipeline registration, and mode selection

  • Data Ingestion – Ingestion adapters, enhancement modules, scheduling, and database schema

  • Osprey Integration – Capability, context flow, prompt builder, and error classification

  • Web Interface – Web interface architecture, REST API, and capabilities API

Data Models

class osprey.services.ariel_search.models.EnhancedLogbookEntry[source]

Bases: TypedDict

ARIEL’s enriched logbook entry - the core data model.

Core fields are always present. Enhancement fields are added by enabled enhancement modules during ingestion.

entry_id: str
source_system: str
timestamp: datetime
author: str
raw_text: str
attachments: list[AttachmentInfo]
metadata: dict[str, Any]
created_at: datetime
updated_at: datetime
summary: NotRequired[str | None]
keywords: NotRequired[list[str]]
enhancement_status: NotRequired[dict[str, Any]]
class osprey.services.ariel_search.models.ARIELSearchRequest(query, modes=<factory>, time_range=None, facility=None, max_results=10, include_images=False, capability_context_data=<factory>, advanced_params=<factory>)[source]

Bases: object

Request model for ARIEL search service.

Captures all information needed to execute a search workflow.

query

The search query text

Type:

str

modes

Search modes to use (default: [RAG])

Type:

list[osprey.services.ariel_search.models.SearchMode]

time_range

Default time range filter (see Time Range Semantics)

Type:

tuple[datetime.datetime, datetime.datetime] | None

facility

Facility filter

Type:

str | None

max_results

Maximum results to return (default: 10, range: 1-100)

Type:

int

include_images

Include image attachments (default: False)

Type:

bool

capability_context_data

Context from main graph state

Type:

dict[str, Any]

query: str
modes: list[SearchMode]
time_range: tuple[datetime, datetime] | None = None
facility: str | None = None
max_results: int = 10
include_images: bool = False
capability_context_data: dict[str, Any]
advanced_params: dict[str, Any]
__post_init__()[source]

Validate request fields.

__init__(query, modes=<factory>, time_range=None, facility=None, max_results=10, include_images=False, capability_context_data=<factory>, advanced_params=<factory>)
class osprey.services.ariel_search.models.ARIELSearchResult(entries, answer=None, sources=<factory>, search_modes_used=<factory>, reasoning='', diagnostics=<factory>, pipeline_details=None)[source]

Bases: object

Structured, type-safe result from ARIEL search service.

Frozen dataclass for immutability.

entries

Matching entries, ranked by relevance

Type:

tuple[dict[str, Any], ...]

answer

RAG-generated answer (optional)

Type:

str | None

sources

Entry IDs used as sources

Type:

tuple[str, ...]

search_modes_used

Modes that were executed

Type:

tuple[osprey.services.ariel_search.models.SearchMode, ...]

reasoning

Explanation of results

Type:

str

entries: tuple[dict[str, Any], ...]
answer: str | None = None
sources: tuple[str, ...]
search_modes_used: tuple[SearchMode, ...]
reasoning: str = ''
diagnostics: tuple[SearchDiagnostic, ...]
pipeline_details: PipelineDetails | None = None
__init__(entries, answer=None, sources=<factory>, search_modes_used=<factory>, reasoning='', diagnostics=<factory>, pipeline_details=None)
class osprey.services.ariel_search.models.SearchMode(value)[source]

Bases: Enum

Search mode enumeration.

KEYWORD

PostgreSQL full-text search (direct function call)

SEMANTIC

Embedding similarity search (direct function call)

RAG

Deterministic RAG pipeline with hybrid retrieval, RRF fusion, and LLM generation

AGENT

Agentic orchestration with ReAct agent (AgentExecutor)

KEYWORD = 'keyword'
SEMANTIC = 'semantic'
RAG = 'rag'
AGENT = 'agent'
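
Because each member carries a string value, mode names arriving from configuration or a web request can be converted by value lookup. The sketch below uses a local stand-in enum with the documented values; `parse_modes` is a hypothetical helper, not part of the ARIEL API.

```python
from enum import Enum


class SearchModeSketch(Enum):
    """Stand-in with the same member values as the documented SearchMode."""

    KEYWORD = "keyword"
    SEMANTIC = "semantic"
    RAG = "rag"
    AGENT = "agent"


def parse_modes(raw: list[str]) -> list[SearchModeSketch]:
    """Convert user-supplied strings to members; unknown names raise ValueError."""
    return [SearchModeSketch(name.strip().lower()) for name in raw]


modes = parse_modes(["Keyword", "rag"])
```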

Search Module Interface

class osprey.services.ariel_search.search.base.SearchToolDescriptor(name, description, search_mode, args_schema, execute, format_result, needs_embedder=False)[source]

Bases: object

Everything the agent executor needs to wrap a search module as a tool.

name

Tool name for LangChain (e.g. “keyword_search”)

Type:

str

description

Tool description shown to the LLM

Type:

str

search_mode

Corresponding SearchMode enum value

Type:

SearchMode

args_schema

Pydantic model for tool input validation

Type:

type[BaseModel]

execute

Async function that performs the search

Type:

Callable[..., Awaitable[Any]]

format_result

Formats raw search results for the agent

Type:

Callable[..., dict[str, Any]]

needs_embedder

Whether this tool requires an embedding provider

Type:

bool

name: str
description: str
search_mode: SearchMode
args_schema: type[BaseModel]
execute: Callable[..., Awaitable[Any]]
format_result: Callable[..., dict[str, Any]]
needs_embedder: bool = False
__init__(name, description, search_mode, args_schema, execute, format_result, needs_embedder=False)
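
To make the roles of `execute` and `format_result` concrete, here is a minimal sketch of how a caller could drive a descriptor. `ToolDescriptorSketch` is a local stand-in with the documented fields; the echo search, the plain `EchoArgs` class (standing in for a Pydantic model), and `run_tool` are hypothetical and say nothing about how the real agent executor wraps tools.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class ToolDescriptorSketch:
    """Stand-in with the documented SearchToolDescriptor fields."""

    name: str
    description: str
    search_mode: str  # the real field holds a SearchMode member
    args_schema: type  # the real field holds a Pydantic model class
    execute: Callable[..., Awaitable[Any]]
    format_result: Callable[..., dict[str, Any]]
    needs_embedder: bool = False


class EchoArgs:
    """Placeholder for a Pydantic input schema."""

    query: str


async def echo_search(query: str) -> list[tuple[str, float]]:
    # Hypothetical search that returns a single fake hit.
    return [(f"entry matching {query!r}", 1.0)]


def format_echo(text: str, score: float) -> dict[str, Any]:
    return {"text": text, "score": score}


descriptor = ToolDescriptorSketch(
    name="echo_search",
    description="Return a fake hit for any query",
    search_mode="keyword",
    args_schema=EchoArgs,
    execute=echo_search,
    format_result=format_echo,
)


async def run_tool(desc: ToolDescriptorSketch, **kwargs: Any) -> list[dict[str, Any]]:
    """Run the search, then format each raw hit for the agent."""
    raw = await desc.execute(**kwargs)
    return [desc.format_result(*hit) for hit in raw]


results = asyncio.run(run_tool(descriptor, query="vacuum"))
```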
class osprey.services.ariel_search.search.base.ParameterDescriptor(name, label, description, param_type, default, min_value=None, max_value=None, step=None, options=None, section='General', placeholder=None, options_endpoint=None)[source]

Bases: object

Describes a tunable parameter for the frontend capabilities API.

name

Parameter key (e.g. “similarity_threshold”)

Type:

str

label

Human-readable label (e.g. “Similarity Threshold”)

Type:

str

description

Help text for the parameter

Type:

str

param_type

One of “float”, “int”, “bool”, “select”, “date”, “text”, “dynamic_select”

Type:

str

default

Default value

Type:

Any

min_value

Minimum value (float/int types)

Type:

float | None

max_value

Maximum value (float/int types)

Type:

float | None

step

Step increment (float/int types)

Type:

float | None

options

Choices for select type, e.g. [{“value”: “rrf”, “label”: “RRF”}]

Type:

list[dict[str, str]] | None

section

Grouping label in the advanced panel (e.g. “Retrieval”)

Type:

str

placeholder

Placeholder text for text inputs

Type:

str | None

options_endpoint

API endpoint for dynamic_select to fetch options

Type:

str | None

name: str
label: str
description: str
param_type: str
default: Any
min_value: float | None = None
max_value: float | None = None
step: float | None = None
options: list[dict[str, str]] | None = None
section: str = 'General'
placeholder: str | None = None
options_endpoint: str | None = None
to_dict()[source]

Serialize to a JSON-friendly dict.

Return type:

dict[str, Any]

__init__(name, label, description, param_type, default, min_value=None, max_value=None, step=None, options=None, section='General', placeholder=None, options_endpoint=None)

Keyword Search Module

osprey.services.ariel_search.search.keyword.get_tool_descriptor()[source]

Return the descriptor for auto-discovery by the agent executor.

Return type:

SearchToolDescriptor

osprey.services.ariel_search.search.keyword.get_parameter_descriptors()[source]

Return tunable parameter descriptors for the capabilities API.

Return type:

list[ParameterDescriptor]

Execute keyword search against the logbook database.

Uses PostgreSQL full-text search with optional fuzzy matching fallback.

Parameters:
  • query (str) – Search query with optional operators (AND, OR, NOT) and field prefixes (author:, date:)

  • repository (ARIELRepository) – ARIEL database repository

  • config (ARIELConfig) – ARIEL configuration

  • max_results (int) – Maximum entries to return (default: 10)

  • start_date (datetime | None) – Filter entries after this time

  • end_date (datetime | None) – Filter entries before this time

  • author (str | None) – Filter by author name (ILIKE match)

  • source_system (str | None) – Filter by source system (exact match)

  • include_highlights (bool) – Include highlighted snippets (default: True)

  • fuzzy_fallback (bool) – Fall back to fuzzy search if no exact matches

Returns:

List of (entry, score, highlights) tuples sorted by relevance

Return type:

list[tuple[EnhancedLogbookEntry, float, list[str]]]

osprey.services.ariel_search.search.keyword.format_keyword_result(entry, score, highlights)[source]

Format a keyword search result for agent consumption.

Parameters:
  • entry (EnhancedLogbookEntry) – The matched logbook entry

  • score (float) – Relevance score

  • highlights (list[str]) – Highlighted snippets

Returns:

Formatted dict for agent

Return type:

dict[str, Any]

class osprey.services.ariel_search.search.keyword.KeywordSearchInput(*, query, max_results=10, start_date=None, end_date=None)[source]

Bases: BaseModel

Input schema for keyword search tool.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

query: str
max_results: int
start_date: datetime | None
end_date: datetime | None
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

Semantic Search Module

osprey.services.ariel_search.search.semantic.get_tool_descriptor()[source]

Return the descriptor for auto-discovery by the agent executor.

Return type:

SearchToolDescriptor

osprey.services.ariel_search.search.semantic.get_parameter_descriptors()[source]

Return tunable parameter descriptors for the capabilities API.

Return type:

list[ParameterDescriptor]

Execute semantic similarity search.

Generates an embedding for the query and finds similar entries using cosine similarity.

Parameters:
  • query (str) – Natural language query

  • repository (ARIELRepository) – ARIEL database repository

  • config (ARIELConfig) – ARIEL configuration

  • embedder (BaseEmbeddingProvider) – Embedding provider (Ollama or other)

  • max_results (int) – Maximum entries to return (default: 10)

  • similarity_threshold (float | None) – Minimum similarity score. Resolved in order: the per-query value if given, then the configured value, then the hardcoded default of 0.5.

  • start_date (datetime | None) – Filter entries after this time

  • end_date (datetime | None) – Filter entries before this time

  • author (str | None) – Filter by author name (ILIKE match)

  • source_system (str | None) – Filter by source system (exact match)

Returns:

List of (entry, similarity_score) tuples sorted by similarity

Return type:

list[tuple[EnhancedLogbookEntry, float]]

osprey.services.ariel_search.search.semantic.format_semantic_result(entry, similarity)[source]

Format a semantic search result for agent consumption.

Parameters:
  • entry (EnhancedLogbookEntry) – The matched logbook entry

  • similarity (float) – Cosine similarity score

Returns:

Formatted dict for agent

Return type:

dict[str, Any]

class osprey.services.ariel_search.search.semantic.SemanticSearchInput(*, query, max_results=10, similarity_threshold=0.5, start_date=None, end_date=None)[source]

Bases: BaseModel

Input schema for semantic search tool.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

query: str
max_results: int
similarity_threshold: float
start_date: datetime | None
end_date: datetime | None
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

Pipeline Interface

class osprey.services.ariel_search.pipelines.PipelineDescriptor(name, label, description, category, parameters=<factory>)[source]

Bases: object

Metadata for a pipeline execution strategy.

name

Pipeline key (e.g. “rag”)

Type:

str

label

Human-readable label (e.g. “RAG”)

Type:

str

description

What this pipeline does

Type:

str

category

“llm” or “direct”

Type:

str

parameters

Tunable parameters for this pipeline

Type:

list[osprey.services.ariel_search.search.base.ParameterDescriptor]

name: str
label: str
description: str
category: str
parameters: list[ParameterDescriptor]
__init__(name, label, description, category, parameters=<factory>)
osprey.services.ariel_search.pipelines.get_pipeline_descriptor(name)[source]

Return a single pipeline descriptor by name.

Used by the central Osprey registry to look up specific pipelines.

Parameters:

name (str) – Pipeline name (e.g. “rag”, “agent”)

Returns:

The matching PipelineDescriptor

Raises:

KeyError – If no pipeline with the given name exists

Return type:

PipelineDescriptor

osprey.services.ariel_search.pipelines.get_pipeline_descriptors()[source]

Return all pipeline descriptors.

Return type:

list[PipelineDescriptor]
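
The RAG mode is described in the SearchMode enumeration as combining hybrid retrieval with RRF fusion. For readers unfamiliar with the technique, the following is a generic sketch of reciprocal rank fusion over ranked lists of entry IDs. The function name `rrf_fuse`, the constant k=60 (a commonly used value), and the sample IDs are assumptions; ARIEL's own fusion code and parameters may differ.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists; each list contributes 1 / (k + rank) per ID."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, entry_id in enumerate(ranking, start=1):
            scores[entry_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


keyword_hits = ["e3", "e1", "e7"]  # hypothetical keyword ranking
semantic_hits = ["e1", "e9", "e3"]  # hypothetical semantic ranking
fused = rrf_fuse([keyword_hits, semantic_hits])
fused_ids = [entry_id for entry_id, _ in fused]
```

Entries that appear in both lists accumulate score from each, so they tend to rise above entries found by only one retrieval module. The method uses ranks only, which avoids having to reconcile full-text relevance scores with cosine similarities.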

Ingestion Interface

osprey.services.ariel_search.ingestion.base.BaseAdapter

alias of FacilityAdapter

Enhancement Interface

class osprey.services.ariel_search.enhancement.base.BaseEnhancementModule[source]

Bases: ABC

Abstract base class for enhancement modules.

Enhancement modules enrich logbook entries during ingestion. They run sequentially as a pipeline, each adding data to the entry.

abstract property name: str

Return module identifier.

Returns:

Module name (e.g., ‘text_embedding’, ‘semantic_processor’)

property migration: type[BaseMigration] | None

Return migration class for this module.

Override in subclasses that need database migrations.

Returns:

Migration class or None if no migration needed

configure(config)[source]

Configure the module with settings from config.yml.

Called by create_enhancers_from_config() after instantiation. Override in subclasses that accept configuration.

Parameters:

config (dict[str, Any]) – Module-specific configuration dict

abstractmethod async enhance(entry, conn)[source]

Enhance an entry and store results.

Parameters:
  • entry (EnhancedLogbookEntry) – The entry to enhance

  • conn (AsyncConnection) – Database connection from pool

The module should:

  1. Extract relevant data from entry
  2. Process using configured model/algorithm
  3. Store results to appropriate table/column

async health_check()[source]

Check if module is ready.

Returns:

Tuple of (healthy, message)

Return type:

tuple[bool, str]
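
A custom module would subclass the base class and supply `name` and `enhance()`. The sketch below uses a local stand-in ABC that mirrors the documented interface so it runs without Osprey installed. `WordCountModule`, its `min_words` setting, and the list that stands in for the database connection are all hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any


class EnhancementModuleSketch(ABC):
    """Stand-in mirroring the documented BaseEnhancementModule interface."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    def configure(self, config: dict[str, Any]) -> None:
        """Optional hook; the default accepts no settings."""

    @abstractmethod
    async def enhance(self, entry: dict[str, Any], conn: Any) -> None: ...

    async def health_check(self) -> tuple[bool, str]:
        return True, "ok"


class WordCountModule(EnhancementModuleSketch):
    """Hypothetical module that records a word count per entry."""

    def __init__(self) -> None:
        self.min_words = 0

    @property
    def name(self) -> str:
        return "word_count"

    def configure(self, config: dict[str, Any]) -> None:
        self.min_words = int(config.get("min_words", 0))

    async def enhance(self, entry: dict[str, Any], conn: Any) -> None:
        count = len(entry["raw_text"].split())  # 1. extract
        if count >= self.min_words:  # 2. process
            conn.append((entry["entry_id"], count))  # 3. store (list stands in for the DB)


async def run_pipeline(modules: list[EnhancementModuleSketch], entry: dict[str, Any], conn: Any) -> None:
    # Modules run one after another on the same entry.
    for module in modules:
        await module.enhance(entry, conn)


store: list[tuple[str, int]] = []
module = WordCountModule()
module.configure({"min_words": 2})
asyncio.run(run_pipeline([module], {"entry_id": "e1", "raw_text": "RF cavity tripped"}, store))
```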

Configuration

YAML Reference

ARIEL is configured under the ariel: key in config.yml. The configuration is parsed into the ARIELConfig dataclass hierarchy.

ariel:
  # --- Database (required) ---
  database:
    uri: postgresql://ariel:ariel@localhost:5432/ariel

  default_max_results: 10     # Default results per search (default: 10)
  cache_embeddings: true      # Cache embeddings in memory (default: true)

  # --- Ingestion ---
  ingestion:
    adapter: generic_json     # als_logbook | jlab_logbook | ornl_logbook | generic_json
    source_url: path/to/logbook.json
    poll_interval_seconds: 3600
    proxy_url: null           # SOCKS proxy (or ARIEL_SOCKS_PROXY env var)
    verify_ssl: false
    chunk_days: 365           # Days per API request window
    request_timeout_seconds: 60
    max_retries: 3
    retry_delay_seconds: 5

  # --- Search Modules (leaf-level search functions) ---
  # provider: references api.providers for credentials
  search_modules:
    keyword:
      enabled: true
    semantic:
      enabled: false          # Requires embedding model
      provider: ollama        # References api.providers.ollama
      model: nomic-embed-text # Which embedding model's table to query
      settings:
        similarity_threshold: 0.7
        embedding_dimension: 768

  # --- Pipelines (compose search modules) ---
  # retrieval_modules: which search modules each pipeline uses
  pipelines:
    rag:
      enabled: true
      retrieval_modules: [keyword]       # Add semantic when ready
    agent:
      enabled: true
      retrieval_modules: [keyword]       # Add semantic when ready

  # --- Enhancement Modules ---
  # Run during ingestion to enrich entries
  enhancement_modules:
    semantic_processor:
      enabled: false
      provider: cborg
      model:
        provider: cborg
        model_id: anthropic/claude-haiku
        max_tokens: 256
    text_embedding:
      enabled: false
      provider: ollama
      models:
        - name: nomic-embed-text
          dimension: 768

  # --- Embedding Provider (fallback) ---
  embedding:
    provider: ollama          # Default provider for modules without explicit provider

  # --- Reasoning (ReAct agent LLM) ---
  # provider: references api.providers for credentials
  reasoning:
    provider: cborg
    model_id: anthropic/claude-haiku
    max_iterations: 5         # Maximum ReAct cycles
    temperature: 0.1
    tool_timeout_seconds: 30  # Per-tool call timeout
    total_timeout_seconds: 120 # Total agent timeout

Config Dataclass Hierarchy

Dataclass                 Description
ARIELConfig               Root configuration; contains all sub-configs
DatabaseConfig            PostgreSQL connection URI
SearchModuleConfig        Per-module: enabled, provider, model, settings
PipelineModuleConfig      Per-pipeline: enabled, retrieval_modules, settings
EnhancementModuleConfig   Per-module: enabled, provider, models list, settings
IngestionConfig           Adapter type, source URL, polling, proxy, retry settings
ReasoningConfig           LLM provider, model, iteration limits, timeouts
EmbeddingConfig           Default embedding provider fallback
ModelConfig               Individual model name, dimension, max_input_tokens

Dataclass API

class osprey.services.ariel_search.config.ARIELConfig(database, search_modules=<factory>, pipelines=<factory>, enhancement_modules=<factory>, ingestion=None, reasoning=<factory>, embedding=<factory>, default_max_results=10, cache_embeddings=True)[source]

Bases: object

Root configuration for ARIEL service.

database

Database connection configuration

Type:

osprey.services.ariel_search.config.DatabaseConfig

search_modules

Search module configurations by name

Type:

dict[str, osprey.services.ariel_search.config.SearchModuleConfig]

enhancement_modules

Enhancement module configurations by name

Type:

dict[str, osprey.services.ariel_search.config.EnhancementModuleConfig]

ingestion

Ingestion configuration

Type:

osprey.services.ariel_search.config.IngestionConfig | None

reasoning

Agentic reasoning configuration

Type:

osprey.services.ariel_search.config.ReasoningConfig

embedding

Embedding provider configuration

Type:

osprey.services.ariel_search.config.EmbeddingConfig

default_max_results

Default maximum results to return

Type:

int

cache_embeddings

Whether to cache embeddings

Type:

bool

database: DatabaseConfig
search_modules: dict[str, SearchModuleConfig]
pipelines: dict[str, PipelineModuleConfig]
enhancement_modules: dict[str, EnhancementModuleConfig]
ingestion: IngestionConfig | None = None
reasoning: ReasoningConfig
embedding: EmbeddingConfig
default_max_results: int = 10
cache_embeddings: bool = True
classmethod from_dict(config_dict)[source]

Create ARIELConfig from config.yml dictionary.

Parameters:

config_dict (dict[str, Any]) – The ‘ariel’ section from config.yml

Returns:

ARIELConfig instance

Return type:

ARIELConfig

is_search_module_enabled(name)[source]

Check if a search module is enabled.

Parameters:

name (str) – Module name (keyword, semantic)

Returns:

True if the module is enabled

Return type:

bool

get_enabled_search_modules()[source]

Get list of enabled search module names.

Returns:

List of enabled module names

Return type:

list[str]

is_pipeline_enabled(name)[source]

Check if a pipeline is enabled.

If no pipeline config exists for the name, defaults to True (pipelines are always-on unless explicitly disabled).

Parameters:

name (str) – Pipeline name (rag, agent)

Returns:

True if the pipeline is enabled

Return type:

bool

get_enabled_pipelines()[source]

Get list of enabled pipeline names.

Returns the configured pipelines that are enabled, plus the default pipelines (rag, agent) when they are not explicitly configured.

Returns:

List of enabled pipeline names

Return type:

list[str]

get_pipeline_retrieval_modules(name)[source]

Get retrieval modules configured for a pipeline.

If the pipeline has no explicit config, falls back to the list of enabled search modules.

Parameters:

name (str) – Pipeline name (rag, agent)

Returns:

List of search module names to use for retrieval

Return type:

list[str]
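
The defaulting rules for pipelines can be sketched over plain dicts shaped like the YAML reference. Both helper functions are hypothetical stand-ins for the documented methods. Treating a configured pipeline that omits retrieval_modules the same as an unconfigured one is an assumption, since the reference does not cover that case.

```python
from typing import Any

ConfigMap = dict[str, dict[str, Any]]


def pipeline_enabled(pipelines: ConfigMap, name: str) -> bool:
    """Unconfigured pipelines count as enabled, as documented."""
    cfg = pipelines.get(name)
    return True if cfg is None else bool(cfg.get("enabled", True))


def retrieval_modules_for(pipelines: ConfigMap, search_modules: ConfigMap, name: str) -> list[str]:
    """Use the pipeline's own list, else fall back to enabled search modules."""
    cfg = pipelines.get(name)
    # Assumption: a missing retrieval_modules key also triggers the fallback.
    if cfg is not None and "retrieval_modules" in cfg:
        return list(cfg["retrieval_modules"])
    return [module for module, mcfg in search_modules.items() if mcfg.get("enabled")]


search_modules = {"keyword": {"enabled": True}, "semantic": {"enabled": True}}
pipelines = {"rag": {"enabled": True, "retrieval_modules": ["keyword"]}}

rag_modules = retrieval_modules_for(pipelines, search_modules, "rag")
agent_modules = retrieval_modules_for(pipelines, search_modules, "agent")
```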

is_enhancement_module_enabled(name)[source]

Check if an enhancement module is enabled.

Parameters:

name (str) – Module name (text_embedding, semantic_processor, figure_embedding)

Returns:

True if the module is enabled

Return type:

bool

get_enabled_enhancement_modules()[source]

Get list of enabled enhancement module names.

Returns:

List of enabled module names

Return type:

list[str]

validate()[source]

Validate configuration and return list of errors.

Returns:

List of validation error messages (empty if valid)

Return type:

list[str]

get_search_model()[source]

Get the configured search model name.

Returns the model configured for semantic search, which is also used for RAG search when enabled.

Returns:

Model name or None if semantic search is not enabled

Return type:

str | None

get_enhancement_module_config(name)[source]

Get configuration dictionary for an enhancement module.

Returns the raw configuration that can be passed to module.configure().

Parameters:

name (str) – Module name (text_embedding, semantic_processor)

Returns:

Configuration dictionary or None if module not configured

Return type:

dict[str, Any] | None

__init__(database, search_modules=<factory>, pipelines=<factory>, enhancement_modules=<factory>, ingestion=None, reasoning=<factory>, embedding=<factory>, default_max_results=10, cache_embeddings=True)
class osprey.services.ariel_search.config.SearchModuleConfig(enabled, provider=None, model=None, settings=<factory>)[source]

Bases: object

Configuration for a single search module (keyword, semantic).

enabled

Whether module is active

Type:

bool

provider

Provider name for embeddings (references api.providers section)

Type:

str | None

model

Model identifier for semantic modules - which model’s table to query

Type:

str | None

settings

Module-specific settings

Type:

dict[str, Any]

enabled: bool
provider: str | None = None
model: str | None = None
settings: dict[str, Any]
classmethod from_dict(data)[source]

Create SearchModuleConfig from dictionary.

Return type:

SearchModuleConfig

__init__(enabled, provider=None, model=None, settings=<factory>)
class osprey.services.ariel_search.config.PipelineModuleConfig(enabled=True, retrieval_modules=<factory>, settings=<factory>)[source]

Bases: object

Configuration for a pipeline (rag, agent).

Pipelines compose search modules into higher-level execution strategies.

enabled

Whether pipeline is active

Type:

bool

retrieval_modules

Which search modules this pipeline uses

Type:

list[str]

settings

Pipeline-specific settings

Type:

dict[str, Any]

enabled: bool = True
retrieval_modules: list[str]
settings: dict[str, Any]
classmethod from_dict(data)[source]

Create PipelineModuleConfig from dictionary.

Return type:

PipelineModuleConfig

__init__(enabled=True, retrieval_modules=<factory>, settings=<factory>)
class osprey.services.ariel_search.config.EnhancementModuleConfig(enabled, provider=None, models=None, settings=<factory>)[source]

Bases: object

Configuration for a single enhancement module (text_embedding, semantic_processor).

enabled

Whether module is active

Type:

bool

provider

Provider name for embeddings (references api.providers section)

Type:

str | None

models

List of model configurations (for text_embedding)

Type:

list[osprey.services.ariel_search.config.ModelConfig] | None

settings

Module-specific settings

Type:

dict[str, Any]

enabled: bool
provider: str | None = None
models: list[ModelConfig] | None = None
settings: dict[str, Any]
classmethod from_dict(data)[source]

Create EnhancementModuleConfig from dictionary.

Return type:

EnhancementModuleConfig

__init__(enabled, provider=None, models=None, settings=<factory>)
class osprey.services.ariel_search.config.IngestionConfig(adapter, source_url=None, poll_interval_seconds=3600, proxy_url=None, verify_ssl=False, chunk_days=365, request_timeout_seconds=60, max_retries=3, retry_delay_seconds=5, watch=<factory>, write=<factory>)[source]

Bases: object

Configuration for logbook ingestion.

adapter

Adapter name (e.g., “als_logbook”, “generic_json”)

Type:

str

source_url

URL for source system API (optional)

Type:

str | None

poll_interval_seconds

Polling interval for incremental ingestion

Type:

int

proxy_url

SOCKS proxy URL (e.g., “socks5://localhost:9095”)

Type:

str | None

verify_ssl

Whether to verify SSL certificates (default: False for internal servers)

Type:

bool

chunk_days

Days per API request for time windowing (default: 365)

Type:

int

request_timeout_seconds

Timeout for HTTP requests (default: 60)

Type:

int

max_retries

Maximum retry attempts for failed requests (default: 3)

Type:

int

retry_delay_seconds

Base delay between retries (default: 5)

Type:

int

watch

Watch mode configuration

Type:

osprey.services.ariel_search.config.WatchConfig

write

Write operation configuration

Type:

osprey.services.ariel_search.config.WriteConfig

adapter: str
source_url: str | None = None
poll_interval_seconds: int = 3600
proxy_url: str | None = None
verify_ssl: bool = False
chunk_days: int = 365
request_timeout_seconds: int = 60
max_retries: int = 3
retry_delay_seconds: int = 5
watch: WatchConfig
write: WriteConfig
classmethod from_dict(data)[source]

Create IngestionConfig from dictionary.

Return type:

IngestionConfig

__init__(adapter, source_url=None, poll_interval_seconds=3600, proxy_url=None, verify_ssl=False, chunk_days=365, request_timeout_seconds=60, max_retries=3, retry_delay_seconds=5, watch=<factory>, write=<factory>)
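
As an illustration of what time windowing with chunk_days could look like, the sketch below splits a date range into consecutive request windows. `chunk_windows` is a hypothetical helper; how the real adapters divide the range, and whether window bounds are inclusive, is not stated in this reference.

```python
from datetime import datetime, timedelta


def chunk_windows(
    start: datetime, end: datetime, chunk_days: int = 365
) -> list[tuple[datetime, datetime]]:
    """Split [start, end) into consecutive windows of at most chunk_days days."""
    if chunk_days < 1:
        raise ValueError("chunk_days must be at least 1")
    step = timedelta(days=chunk_days)
    windows: list[tuple[datetime, datetime]] = []
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)  # clamp the final window to the range end
        windows.append((cursor, upper))
        cursor = upper
    return windows


windows = chunk_windows(datetime(2023, 1, 1), datetime(2025, 1, 1), chunk_days=365)
```

Smaller values of chunk_days produce more requests with less data each, which can help when a source API times out on large ranges.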
class osprey.services.ariel_search.config.ReasoningConfig(provider='openai', model_id='gpt-4o-mini', max_iterations=5, temperature=0.1, tool_timeout_seconds=30, total_timeout_seconds=120)[source]

Bases: object

Configuration for agentic reasoning behavior.

Uses Osprey’s provider configuration system for credentials. The provider field references api.providers for api_key and base_url.

provider

Provider name (references api.providers section)

Type:

str

model_id

LLM model identifier (default: “gpt-4o-mini”)

Type:

str

max_iterations

Maximum ReAct cycles (default: 5)

Type:

int

temperature

LLM temperature (default: 0.1)

Type:

float

tool_timeout_seconds

Per-tool call timeout (default: 30)

Type:

int

total_timeout_seconds

Total agent execution timeout (default: 120)

Type:

int

provider: str = 'openai'
model_id: str = 'gpt-4o-mini'
max_iterations: int = 5
temperature: float = 0.1
tool_timeout_seconds: int = 30
total_timeout_seconds: int = 120
classmethod from_dict(data)[source]

Create ReasoningConfig from dictionary.

Return type:

ReasoningConfig

__init__(provider='openai', model_id='gpt-4o-mini', max_iterations=5, temperature=0.1, tool_timeout_seconds=30, total_timeout_seconds=120)
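
The two timeout settings bound different things: one applies to each tool call, the other to the whole agent run. The sketch below shows one way such limits can be layered with `asyncio.wait_for`. It is a simplified stand-in with no LLM involved, and `call_tool`, `run_agent`, and the fake tools are hypothetical rather than ARIEL's executor.

```python
import asyncio
from typing import Awaitable, Callable

Tool = Callable[[], Awaitable[str]]


async def call_tool(tool: Tool, tool_timeout: float) -> str:
    """Run one tool call, replacing a timeout with an observation string."""
    try:
        return await asyncio.wait_for(tool(), timeout=tool_timeout)
    except asyncio.TimeoutError:
        return "tool timed out"


async def run_agent(
    tools: list[Tool],
    max_iterations: int = 5,
    tool_timeout: float = 30.0,
    total_timeout: float = 120.0,
) -> list[str]:
    async def loop() -> list[str]:
        observations = []
        # One tool call per cycle, capped at max_iterations cycles.
        for tool in tools[:max_iterations]:
            observations.append(await call_tool(tool, tool_timeout))
        return observations

    # The outer limit covers every cycle together.
    return await asyncio.wait_for(loop(), timeout=total_timeout)


async def fast_tool() -> str:
    return "3 hits"


async def slow_tool() -> str:
    await asyncio.sleep(0.2)
    return "never returned"


observations = asyncio.run(
    run_agent([fast_tool, slow_tool], tool_timeout=0.05, total_timeout=1.0)
)
```

In this arrangement a slow tool costs at most tool_timeout seconds and the agent can continue, while exceeding total_timeout aborts the run as a whole.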
class osprey.services.ariel_search.config.EmbeddingConfig(provider='ollama')[source]

Bases: object

Configuration for embedding generation.

provider

Provider name (uses central Osprey config)

Type:

str

provider: str = 'ollama'
classmethod from_dict(data)[source]

Create EmbeddingConfig from dictionary.

Return type:

EmbeddingConfig

__init__(provider='ollama')
class osprey.services.ariel_search.config.DatabaseConfig(uri)[source]

Bases: object

Configuration for ARIEL database connection.

uri

PostgreSQL connection URI (e.g., “postgresql://localhost:5432/ariel”)

Type:

str

uri: str
classmethod from_dict(data)[source]

Create DatabaseConfig from dictionary.

Return type:

DatabaseConfig

__init__(uri)
class osprey.services.ariel_search.config.ModelConfig(name, dimension, max_input_tokens=None)[source]

Bases: object

Configuration for a single embedding model.

Used in text_embedding enhancement to specify which models to embed with during ingestion.

name

Model name (e.g., “nomic-embed-text”)

Type:

str

dimension

Embedding dimension (must match model output)

Type:

int

max_input_tokens

Maximum input tokens for the model (optional)

Type:

int | None

name: str
dimension: int
max_input_tokens: int | None = None
classmethod from_dict(data)[source]

Create ModelConfig from dictionary.

Return type:

ModelConfig

__init__(name, dimension, max_input_tokens=None)

See also

  • Search Modes – Search modules, pipelines, and registration guide

  • Data Ingestion – Ingestion adapters, enhancement modules, and database schema