ARIEL Search Service
ARIEL (Agentic Retrieval Interface for Electronic Logbooks) is a logbook search service built on Osprey’s registry-based plugin architecture. It provides four search strategies (keyword, semantic, RAG, and agentic) through pluggable search modules that are discovered and composed at runtime. A capabilities-driven ingestion pipeline with registry-discovered adapters and enhancement modules keeps the search index up to date, while a deterministic execution pipeline coordinates retrieval, fusion, and answer generation for each query.
See also
- Logbook Search Service (ARIEL): architecture overview, design rationale, and getting started
- Search Modes: search module implementation, pipeline registration, and mode selection
- Data Ingestion: ingestion adapters, enhancement modules, scheduling, and database schema
- Osprey Integration: capability, context flow, prompt builder, and error classification
- Web Interface: web interface architecture, REST API, and capabilities API
Data Models
- class osprey.services.ariel_search.models.EnhancedLogbookEntry
Bases: TypedDict
ARIEL’s enriched logbook entry, the core data model.
Core fields are always present. Enhancement fields are added by enabled enhancement modules during ingestion.
- entry_id: str
- source_system: str
- timestamp: datetime
- author: str
- raw_text: str
- attachments: list[AttachmentInfo]
- metadata: dict[str, Any]
- created_at: datetime
- updated_at: datetime
- summary: NotRequired[str | None]
- keywords: NotRequired[list[str]]
- enhancement_status: NotRequired[dict[str, Any]]
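Because the enhancement fields are `NotRequired`, consumers should not index them directly. The sketch below uses a plain dict with the documented keys as a stand-in for `EnhancedLogbookEntry` (it does not import Osprey); the `describe_entry` helper and the sample values are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any


def describe_entry(entry: dict[str, Any]) -> str:
    """Build a one-line description, tolerating missing enhancement fields."""
    # Core fields are always present, so direct indexing is safe.
    header = f"[{entry['source_system']}] {entry['entry_id']} by {entry['author']}"
    # Enhancement fields may be absent, and `summary` may also be None.
    summary = entry.get("summary") or entry["raw_text"][:40]
    keywords = ", ".join(entry.get("keywords", []))
    return f"{header}: {summary}" + (f" ({keywords})" if keywords else "")


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
entry = {
    "entry_id": "demo-1",
    "source_system": "generic_json",
    "timestamp": now,
    "author": "operator",
    "raw_text": "Beam current dropped after RF trip.",
    "attachments": [],
    "metadata": {},
    "created_at": now,
    "updated_at": now,
}
print(describe_entry(entry))
```

Using `.get()` with a fallback keeps the same code path working whether or not the corresponding enhancement module ran during ingestion.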
- class osprey.services.ariel_search.models.ARIELSearchRequest(query, modes=<factory>, time_range=None, facility=None, max_results=10, include_images=False, capability_context_data=<factory>, advanced_params=<factory>)
Bases: object
Request model for ARIEL search service.
Captures all information needed to execute a search workflow.
- query
The search query text
- Type:
str
- modes
Search modes to use (default: [RAG])
- Type:
list[osprey.services.ariel_search.models.SearchMode]
- time_range
Default time range filter (see Time Range Semantics)
- Type:
tuple[datetime.datetime, datetime.datetime] | None
- facility
Facility filter
- Type:
str | None
- max_results
Maximum results to return (default: 10, range: 1-100)
- Type:
int
- include_images
Include image attachments (default: False)
- Type:
bool
- capability_context_data
Context from main graph state
- Type:
dict[str, Any]
- query: str
- modes: list[SearchMode]
- time_range: tuple[datetime, datetime] | None = None
- facility: str | None = None
- max_results: int = 10
- include_images: bool = False
- capability_context_data: dict[str, Any]
- advanced_params: dict[str, Any]
- __post_init__()
Validate request fields.
- __init__(query, modes=<factory>, time_range=None, facility=None, max_results=10, include_images=False, capability_context_data=<factory>, advanced_params=<factory>)
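As a rough illustration of how a request with factory defaults and `__post_init__` validation behaves, here is a minimal stand-in dataclass. It is not the real `ARIELSearchRequest`: the class name `SearchRequestSketch`, the reduced field set, and the specific checks (non-empty query, `max_results` within the documented 1-100 range, `ValueError` on failure) are assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Any


class SearchMode(Enum):  # stand-in mirroring the documented values
    KEYWORD = "keyword"
    SEMANTIC = "semantic"
    RAG = "rag"
    AGENT = "agent"


@dataclass
class SearchRequestSketch:  # hypothetical stand-in for ARIELSearchRequest
    query: str
    modes: list[SearchMode] = field(default_factory=lambda: [SearchMode.RAG])
    time_range: tuple[datetime, datetime] | None = None
    max_results: int = 10
    advanced_params: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumed checks; the real validation rules are not documented here.
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if not 1 <= self.max_results <= 100:
            raise ValueError("max_results must be between 1 and 100")


# Modes can be built from their string values, e.g. when parsing user input.
request = SearchRequestSketch("RF trip", modes=[SearchMode("keyword")], max_results=25)
```

Validating in `__post_init__` means an out-of-range request fails at construction time rather than midway through a search.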
- class osprey.services.ariel_search.models.ARIELSearchResult(entries, answer=None, sources=<factory>, search_modes_used=<factory>, reasoning='', diagnostics=<factory>, pipeline_details=None)
Bases: object
Structured, type-safe result from ARIEL search service.
Frozen dataclass for immutability.
- entries
Matching entries, ranked by relevance
- Type:
tuple[dict[str, Any], …]
- answer
RAG-generated answer (optional)
- Type:
str | None
- sources
Entry IDs used as sources
- Type:
tuple[str, …]
- search_modes_used
Modes that were executed
- Type:
tuple[osprey.services.ariel_search.models.SearchMode, …]
- reasoning
Explanation of results
- Type:
str
- entries: tuple[dict[str, Any], ...]
- answer: str | None = None
- sources: tuple[str, ...]
- search_modes_used: tuple[SearchMode, ...]
- reasoning: str = ''
- diagnostics: tuple[SearchDiagnostic, ...]
- pipeline_details: PipelineDetails | None = None
- __init__(entries, answer=None, sources=<factory>, search_modes_used=<factory>, reasoning='', diagnostics=<factory>, pipeline_details=None)
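The practical effect of a frozen dataclass can be shown with a small stand-in. `SearchResultSketch` below is hypothetical and carries only a few of the documented fields; it illustrates standard `dataclasses` behavior, not ARIEL internals.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass, field, replace
from typing import Any


@dataclass(frozen=True)
class SearchResultSketch:  # hypothetical stand-in for ARIELSearchResult
    entries: tuple[dict[str, Any], ...]
    answer: str | None = None
    sources: tuple[str, ...] = field(default_factory=tuple)
    reasoning: str = ""


result = SearchResultSketch(entries=({"entry_id": "demo-1"},), sources=("demo-1",))

try:
    result.answer = "edited"  # assignment on a frozen dataclass fails
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# Derive a modified copy instead of mutating in place.
answered = replace(result, answer="example answer")
```

Note that freezing only blocks attribute assignment; the dicts inside `entries` remain mutable objects.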
- class osprey.services.ariel_search.models.SearchMode(value)
Bases: Enum
Search mode enumeration.
- KEYWORD
PostgreSQL full-text search (direct function call)
- SEMANTIC
Embedding similarity search (direct function call)
- RAG
Deterministic RAG pipeline with hybrid retrieval, RRF fusion, and LLM generation
- AGENT
Agentic orchestration with ReAct agent (AgentExecutor)
- KEYWORD = 'keyword'
- SEMANTIC = 'semantic'
- RAG = 'rag'
- AGENT = 'agent'
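The RAG mode's fusion step is described as RRF (reciprocal rank fusion). A minimal sketch of the general technique follows: each entry scores the sum of `1 / (k + rank)` over the ranked lists it appears in. The function name `rrf_fuse`, the sample IDs, and `k = 60` (a common default in the literature) are assumptions; the constant ARIEL uses is not stated here.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, entry_id in enumerate(ranking, start=1):
            scores[entry_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by entry ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


keyword_hits = ["e1", "e2", "e3"]
semantic_hits = ["e3", "e1", "e4"]
fused = rrf_fuse([keyword_hits, semantic_hits])
print([entry_id for entry_id, _ in fused])
```

RRF uses only ranks, so it can combine keyword relevance scores and cosine similarities without normalizing them onto a common scale.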
Search Module Interface
- class osprey.services.ariel_search.search.base.SearchToolDescriptor(name, description, search_mode, args_schema, execute, format_result, needs_embedder=False)
Bases: object
Everything the agent executor needs to wrap a search module as a tool.
- name
Tool name for LangChain (e.g. “keyword_search”)
- Type:
str
- description
Tool description shown to the LLM
- Type:
str
- search_mode
Corresponding SearchMode enum value
- Type:
SearchMode
- args_schema
Pydantic model for tool input validation
- Type:
type[BaseModel]
- execute
Async function that performs the search
- Type:
Callable[…, Awaitable[Any]]
- format_result
Formats raw search results for the agent
- Type:
Callable[…, dict[str, Any]]
- needs_embedder
Whether this tool requires an embedding provider
- Type:
bool
- name: str
- description: str
- search_mode: SearchMode
- args_schema: type[BaseModel]
- execute: Callable[..., Awaitable[Any]]
- format_result: Callable[..., dict[str, Any]]
- needs_embedder: bool = False
- __init__(name, description, search_mode, args_schema, execute, format_result, needs_embedder=False)
- class osprey.services.ariel_search.search.base.ParameterDescriptor(name, label, description, param_type, default, min_value=None, max_value=None, step=None, options=None, section='General', placeholder=None, options_endpoint=None)
Bases: object
Describes a tunable parameter for the frontend capabilities API.
- name
Parameter key (e.g. “similarity_threshold”)
- Type:
str
- label
Human-readable label (e.g. “Similarity Threshold”)
- Type:
str
- description
Help text for the parameter
- Type:
str
- param_type
One of “float”, “int”, “bool”, “select”, “date”, “text”, “dynamic_select”
- Type:
str
- default
Default value
- Type:
Any
- min_value
Minimum value (float/int types)
- Type:
float | None
- max_value
Maximum value (float/int types)
- Type:
float | None
- step
Step increment (float/int types)
- Type:
float | None
- options
Choices for select type, e.g. [{“value”: “rrf”, “label”: “RRF”}]
- Type:
list[dict[str, str]] | None
- section
Grouping label in the advanced panel (e.g. “Retrieval”)
- Type:
str
- placeholder
Placeholder text for text inputs
- Type:
str | None
- options_endpoint
API endpoint for dynamic_select to fetch options
- Type:
str | None
- name: str
- label: str
- description: str
- param_type: str
- default: Any
- min_value: float | None = None
- max_value: float | None = None
- step: float | None = None
- options: list[dict[str, str]] | None = None
- section: str = 'General'
- placeholder: str | None = None
- options_endpoint: str | None = None
- to_dict()
Serialize to a JSON-friendly dict.
- Return type:
dict[str, Any]
- __init__(name, label, description, param_type, default, min_value=None, max_value=None, step=None, options=None, section='General', placeholder=None, options_endpoint=None)
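To make the shape of a parameter descriptor concrete, the sketch below declares a float parameter and serializes it. `ParameterSketch` is a hypothetical stand-in with a subset of the documented fields, and its `to_dict` is assumed to be a plain field-by-field dump; the real method may omit or rename keys.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class ParameterSketch:  # hypothetical stand-in for ParameterDescriptor
    name: str
    label: str
    description: str
    param_type: str
    default: Any
    min_value: float | None = None
    max_value: float | None = None
    step: float | None = None
    section: str = "General"

    def to_dict(self) -> dict[str, Any]:
        # Assumption: a plain field-by-field dump; the real method may differ.
        return asdict(self)


threshold = ParameterSketch(
    name="similarity_threshold",
    label="Similarity Threshold",
    description="Minimum cosine similarity for a match",
    param_type="float",
    default=0.5,
    min_value=0.0,
    max_value=1.0,
    step=0.05,
    section="Retrieval",
)
payload = json.dumps(threshold.to_dict())
```

A frontend could render `payload` as a slider in the "Retrieval" section of the advanced panel.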
Keyword Search Module
- osprey.services.ariel_search.search.keyword.get_tool_descriptor()
Return the descriptor for auto-discovery by the agent executor.
- Return type:
SearchToolDescriptor
- osprey.services.ariel_search.search.keyword.get_parameter_descriptors()
Return tunable parameter descriptors for the capabilities API.
- Return type:
list[ParameterDescriptor]
- async osprey.services.ariel_search.search.keyword.keyword_search(query, repository, config, *, max_results=10, start_date=None, end_date=None, author=None, source_system=None, include_highlights=True, fuzzy_fallback=True, **kwargs)
Execute keyword search against the logbook database.
Uses PostgreSQL full-text search with optional fuzzy matching fallback.
- Parameters:
query (str) – Search query with optional operators (AND, OR, NOT) and field prefixes (author:, date:)
repository (ARIELRepository) – ARIEL database repository
config (ARIELConfig) – ARIEL configuration
max_results (int) – Maximum entries to return (default: 10)
start_date (datetime | None) – Filter entries after this time
end_date (datetime | None) – Filter entries before this time
author (str | None) – Filter by author name (ILIKE match)
source_system (str | None) – Filter by source system (exact match)
include_highlights (bool) – Include highlighted snippets (default: True)
fuzzy_fallback (bool) – Fall back to fuzzy search if no exact matches
- Returns:
List of (entry, score, highlights) tuples sorted by relevance
- Return type:
list[tuple[EnhancedLogbookEntry, float, list[str]]]
- osprey.services.ariel_search.search.keyword.format_keyword_result(entry, score, highlights)
Format a keyword search result for agent consumption.
- Parameters:
entry (EnhancedLogbookEntry) – EnhancedLogbookEntry
score (float) – Relevance score
highlights (list[str]) – Highlighted snippets
- Returns:
Formatted dict for agent
- Return type:
dict[str, Any]
- class osprey.services.ariel_search.search.keyword.KeywordSearchInput(*, query, max_results=10, start_date=None, end_date=None)
Bases: BaseModel
Input schema for keyword search tool.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- query: str
- max_results: int
- start_date: datetime | None
- end_date: datetime | None
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
Semantic Search Module
- osprey.services.ariel_search.search.semantic.get_tool_descriptor()
Return the descriptor for auto-discovery by the agent executor.
- Return type:
SearchToolDescriptor
- osprey.services.ariel_search.search.semantic.get_parameter_descriptors()
Return tunable parameter descriptors for the capabilities API.
- Return type:
list[ParameterDescriptor]
- async osprey.services.ariel_search.search.semantic.semantic_search(query, repository, config, embedder, *, max_results=10, similarity_threshold=None, start_date=None, end_date=None, author=None, source_system=None, **kwargs)
Execute semantic similarity search.
Generates an embedding for the query and finds similar entries using cosine similarity.
- Parameters:
query (str) – Natural language query
repository (ARIELRepository) – ARIEL database repository
config (ARIELConfig) – ARIEL configuration
embedder (BaseEmbeddingProvider) – Embedding provider (Ollama or other)
max_results (int) – Maximum entries to return (default: 10)
similarity_threshold (float | None) – Minimum similarity score. A per-query value takes precedence; if it is None, the configured value is used, and if none is configured, the hardcoded default of 0.5 applies.
start_date (datetime | None) – Filter entries after this time
end_date (datetime | None) – Filter entries before this time
author (str | None) – Filter by author name (ILIKE match)
source_system (str | None) – Filter by source system (exact match)
- Returns:
List of (entry, similarity_score) tuples sorted by similarity
- Return type:
list[tuple[EnhancedLogbookEntry, float]]
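The fallback order described for `similarity_threshold` (per-query value, then config, then the hardcoded 0.5) can be sketched as below. Both helper names are hypothetical, the settings dict stands in for the module's `settings`, and treating the threshold as inclusive is an assumption.

```python
from __future__ import annotations

from typing import Any

DEFAULT_SIMILARITY_THRESHOLD = 0.5  # hardcoded default stated in the docs


def resolve_threshold(per_query: float | None, module_settings: dict[str, Any]) -> float:
    """Pick the threshold: per-query value, then config, then the default."""
    if per_query is not None:
        return per_query
    configured = module_settings.get("similarity_threshold")
    if configured is not None:
        return float(configured)
    return DEFAULT_SIMILARITY_THRESHOLD


def filter_by_similarity(
    scored: list[tuple[str, float]], threshold: float
) -> list[tuple[str, float]]:
    """Keep results at or above the threshold (assumed inclusive), best first."""
    kept = [(entry_id, score) for entry_id, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


settings = {"similarity_threshold": 0.7}
scored = [("e1", 0.65), ("e2", 0.91), ("e3", 0.72)]
from_config = filter_by_similarity(scored, resolve_threshold(None, settings))
from_query = filter_by_similarity(scored, resolve_threshold(0.6, settings))
```

Checking `is not None` rather than truthiness matters here: an explicit per-query threshold of `0.0` should be honored, not skipped.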
- osprey.services.ariel_search.search.semantic.format_semantic_result(entry, similarity)
Format a semantic search result for agent consumption.
- Parameters:
entry (EnhancedLogbookEntry) – EnhancedLogbookEntry
similarity (float) – Cosine similarity score
- Returns:
Formatted dict for agent
- Return type:
dict[str, Any]
- class osprey.services.ariel_search.search.semantic.SemanticSearchInput(*, query, max_results=10, similarity_threshold=0.5, start_date=None, end_date=None)
Bases: BaseModel
Input schema for semantic search tool.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- query: str
- max_results: int
- similarity_threshold: float
- start_date: datetime | None
- end_date: datetime | None
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
Pipeline Interface
- class osprey.services.ariel_search.pipelines.PipelineDescriptor(name, label, description, category, parameters=<factory>)
Bases: object
Metadata for a pipeline execution strategy.
- name
Pipeline key (e.g. “rag”)
- Type:
str
- label
Human-readable label (e.g. “RAG”)
- Type:
str
- description
What this pipeline does
- Type:
str
- category
“llm” or “direct”
- Type:
str
- parameters
Tunable parameters for this pipeline
- Type:
list[osprey.services.ariel_search.search.base.ParameterDescriptor]
- name: str
- label: str
- description: str
- category: str
- parameters: list[ParameterDescriptor]
- __init__(name, label, description, category, parameters=<factory>)
- osprey.services.ariel_search.pipelines.get_pipeline_descriptor(name)
Return a single pipeline descriptor by name.
Used by the central Osprey registry to look up specific pipelines.
- Parameters:
name (str) – Pipeline name (e.g. “rag”, “agent”)
- Returns:
The matching PipelineDescriptor
- Raises:
KeyError – If no pipeline with the given name exists
- Return type:
PipelineDescriptor
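Since the lookup raises `KeyError` for unknown names, callers that accept user-supplied pipeline names need to handle that case. The sketch below mimics the documented contract with a local dict registry; `PipelineSketch`, `get_pipeline_sketch`, `describe`, and the registry contents are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineSketch:  # hypothetical stand-in for PipelineDescriptor
    name: str
    label: str
    description: str
    category: str  # "llm" or "direct"


# Illustrative registry contents; the labels and descriptions are made up.
_PIPELINES = {
    "rag": PipelineSketch("rag", "RAG", "Retrieve, fuse, then generate an answer", "llm"),
    "agent": PipelineSketch("agent", "Agent", "ReAct agent over search tools", "llm"),
}


def get_pipeline_sketch(name: str) -> PipelineSketch:
    """Return the descriptor for `name`, raising KeyError if it is unknown."""
    return _PIPELINES[name]


def describe(name: str) -> str:
    """Resolve a label for display, degrading gracefully on unknown names."""
    try:
        return get_pipeline_sketch(name).label
    except KeyError:
        return f"unknown pipeline: {name}"


print(describe("rag"), "|", describe("hybrid"))
```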
Ingestion Interface
- osprey.services.ariel_search.ingestion.base.BaseAdapter
alias of FacilityAdapter
Enhancement Interface
- class osprey.services.ariel_search.enhancement.base.BaseEnhancementModule
Bases: ABC
Abstract base class for enhancement modules.
Enhancement modules enrich logbook entries during ingestion. They run sequentially as a pipeline, each adding data to the entry.
- abstract property name: str
Return module identifier.
- Returns:
Module name (e.g., ‘text_embedding’, ‘semantic_processor’)
- property migration: type[BaseMigration] | None
Return migration class for this module.
Override in subclasses that need database migrations.
- Returns:
Migration class or None if no migration needed
- configure(config)
Configure the module with settings from config.yml.
Called by create_enhancers_from_config() after instantiation. Override in subclasses that accept configuration.
- Parameters:
config (dict[str, Any]) – Module-specific configuration dict
- abstractmethod async enhance(entry, conn)
Enhance an entry and store results.
- Parameters:
entry (EnhancedLogbookEntry) – The entry to enhance
conn (AsyncConnection) – Database connection from pool
The module should:
1. Extract relevant data from the entry
2. Process it using the configured model/algorithm
3. Store results to the appropriate table/column
- async health_check()
Check if module is ready.
- Returns:
Tuple of (healthy, message)
- Return type:
tuple[bool, str]
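The interface above can be illustrated with a toy subclass. To stay self-contained, the sketch defines its own abstract base (`EnhancementBaseSketch`) with the documented method names instead of importing `BaseEnhancementModule`; `WordCountModule`, the `min_words` setting, `run_pipeline`, and the dict used in place of a database connection are all hypothetical.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from typing import Any


class EnhancementBaseSketch(ABC):  # hypothetical stand-in for BaseEnhancementModule
    @property
    @abstractmethod
    def name(self) -> str: ...

    def configure(self, config: dict[str, Any]) -> None:
        pass  # optional hook; modules without settings can skip it

    @abstractmethod
    async def enhance(self, entry: dict[str, Any], conn: Any) -> None: ...

    async def health_check(self) -> tuple[bool, str]:
        return True, "ok"


class WordCountModule(EnhancementBaseSketch):
    """Toy module: counts words and 'stores' them on a fake connection."""

    def __init__(self) -> None:
        self.min_words = 0

    @property
    def name(self) -> str:
        return "word_count"

    def configure(self, config: dict[str, Any]) -> None:
        self.min_words = int(config.get("min_words", 0))

    async def enhance(self, entry: dict[str, Any], conn: Any) -> None:
        count = len(entry["raw_text"].split())  # extract and process
        if count >= self.min_words:
            conn[entry["entry_id"]] = count  # store (a dict stands in for the DB)


async def run_pipeline(modules, entry, conn) -> None:
    for module in modules:  # modules run sequentially, in order
        await module.enhance(entry, conn)


fake_db: dict[str, int] = {}
module = WordCountModule()
module.configure({"min_words": 2})
asyncio.run(run_pipeline([module], {"entry_id": "demo-1", "raw_text": "RF trip at dawn"}, fake_db))
```

A real module would write through the `AsyncConnection` it receives and would typically pair new columns or tables with a `migration` class.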
Configuration
YAML Reference
ARIEL is configured under the ariel: key in config.yml. The configuration is parsed into the ARIELConfig dataclass hierarchy.
ariel:
  # --- Database (required) ---
  database:
    uri: postgresql://ariel:ariel@localhost:5432/ariel
  default_max_results: 10 # Default results per search (default: 10)
  cache_embeddings: true # Cache embeddings in memory (default: true)
  # --- Ingestion ---
  ingestion:
    adapter: generic_json # als_logbook | jlab_logbook | ornl_logbook | generic_json
    source_url: path/to/logbook.json
    poll_interval_seconds: 3600
    proxy_url: null # SOCKS proxy (or ARIEL_SOCKS_PROXY env var)
    verify_ssl: false
    chunk_days: 365 # Days per API request window
    request_timeout_seconds: 60
    max_retries: 3
    retry_delay_seconds: 5
  # --- Search Modules (leaf-level search functions) ---
  # provider: references api.providers for credentials
  search_modules:
    keyword:
      enabled: true
    semantic:
      enabled: false # Requires embedding model
      provider: ollama # References api.providers.ollama
      model: nomic-embed-text # Which embedding model's table to query
      settings:
        similarity_threshold: 0.7
        embedding_dimension: 768
  # --- Pipelines (compose search modules) ---
  # retrieval_modules: which search modules each pipeline uses
  pipelines:
    rag:
      enabled: true
      retrieval_modules: [keyword] # Add semantic when ready
    agent:
      enabled: true
      retrieval_modules: [keyword] # Add semantic when ready
  # --- Enhancement Modules ---
  # Run during ingestion to enrich entries
  enhancement_modules:
    semantic_processor:
      enabled: false
      provider: cborg
      model:
        provider: cborg
        model_id: anthropic/claude-haiku
        max_tokens: 256
    text_embedding:
      enabled: false
      provider: ollama
      models:
        - name: nomic-embed-text
          dimension: 768
  # --- Embedding Provider (fallback) ---
  embedding:
    provider: ollama # Default provider for modules without explicit provider
  # --- Reasoning (ReAct agent LLM) ---
  # provider: references api.providers for credentials
  reasoning:
    provider: cborg
    model_id: anthropic/claude-haiku
    max_iterations: 5 # Maximum ReAct cycles
    temperature: 0.1
    tool_timeout_seconds: 30 # Per-tool call timeout
    total_timeout_seconds: 120 # Total agent timeout
Config Dataclass Hierarchy
| Dataclass | Description |
|---|---|
| ARIELConfig | Root configuration; contains all sub-configs |
| DatabaseConfig | PostgreSQL connection URI |
| SearchModuleConfig | Per-module: enabled, provider, model, settings |
| PipelineModuleConfig | Per-pipeline: enabled, retrieval_modules, settings |
| EnhancementModuleConfig | Per-module: enabled, provider, models list, settings |
| IngestionConfig | Adapter type, source URL, polling, proxy, retry settings |
| ReasoningConfig | LLM provider, model, iteration limits, timeouts |
| EmbeddingConfig | Default embedding provider fallback |
| ModelConfig | Individual model name, dimension, max_input_tokens |
Dataclass API
- class osprey.services.ariel_search.config.ARIELConfig(database, search_modules=<factory>, pipelines=<factory>, enhancement_modules=<factory>, ingestion=None, reasoning=<factory>, embedding=<factory>, default_max_results=10, cache_embeddings=True)
Bases: object
Root configuration for ARIEL service.
- database
Database connection configuration
- Type:
osprey.services.ariel_search.config.DatabaseConfig
- search_modules
Search module configurations by name
- Type:
dict[str, osprey.services.ariel_search.config.SearchModuleConfig]
- enhancement_modules
Enhancement module configurations by name
- Type:
dict[str, osprey.services.ariel_search.config.EnhancementModuleConfig]
- ingestion
Ingestion configuration
- Type:
osprey.services.ariel_search.config.IngestionConfig | None
- reasoning
Agentic reasoning configuration
- Type:
osprey.services.ariel_search.config.ReasoningConfig
- embedding
Embedding provider configuration
- Type:
osprey.services.ariel_search.config.EmbeddingConfig
- default_max_results
Default maximum results to return
- Type:
int
- cache_embeddings
Whether to cache embeddings
- Type:
bool
- database: DatabaseConfig
- search_modules: dict[str, SearchModuleConfig]
- pipelines: dict[str, PipelineModuleConfig]
- enhancement_modules: dict[str, EnhancementModuleConfig]
- ingestion: IngestionConfig | None = None
- reasoning: ReasoningConfig
- embedding: EmbeddingConfig
- default_max_results: int = 10
- cache_embeddings: bool = True
- classmethod from_dict(config_dict)
Create ARIELConfig from config.yml dictionary.
- Parameters:
config_dict (dict[str, Any]) – The ‘ariel’ section from config.yml
- Returns:
ARIELConfig instance
- Return type:
ARIELConfig
- is_search_module_enabled(name)
Check if a search module is enabled.
- Parameters:
name (str) – Module name (keyword, semantic)
- Returns:
True if the module is enabled
- Return type:
bool
- get_enabled_search_modules()
Get list of enabled search module names.
- Returns:
List of enabled module names
- Return type:
list[str]
- is_pipeline_enabled(name)
Check if a pipeline is enabled.
If no pipeline config exists for the name, defaults to True (pipelines are always-on unless explicitly disabled).
- Parameters:
name (str) – Pipeline name (rag, agent)
- Returns:
True if the pipeline is enabled
- Return type:
bool
- get_enabled_pipelines()
Get list of enabled pipeline names.
Returns configured pipelines that are enabled, plus default pipelines (rag, agent) if not explicitly configured.
- Returns:
List of enabled pipeline names
- Return type:
list[str]
- get_pipeline_retrieval_modules(name)
Get retrieval modules configured for a pipeline.
If the pipeline has no explicit config, falls back to the list of enabled search modules.
- Parameters:
name (str) – Pipeline name (rag, agent)
- Returns:
List of search module names to use for retrieval
- Return type:
list[str]
- is_enhancement_module_enabled(name)
Check if an enhancement module is enabled.
- Parameters:
name (str) – Module name (text_embedding, semantic_processor, figure_embedding)
- Returns:
True if the module is enabled
- Return type:
bool
- get_enabled_enhancement_modules()
Get list of enabled enhancement module names.
- Returns:
List of enabled module names
- Return type:
list[str]
- validate()
Validate configuration and return list of errors.
- Returns:
List of validation error messages (empty if valid)
- Return type:
list[str]
- get_search_model()
Get the configured search model name.
Returns the model configured for semantic search, which is also used for RAG search when enabled.
- Returns:
Model name or None if semantic search is not enabled
- Return type:
str | None
- get_enhancement_module_config(name)
Get configuration dictionary for an enhancement module.
Returns the raw configuration that can be passed to module.configure().
- Parameters:
name (str) – Module name (text_embedding, semantic_processor)
- Returns:
Configuration dictionary or None if module not configured
- Return type:
dict[str, Any] | None
- __init__(database, search_modules=<factory>, pipelines=<factory>, enhancement_modules=<factory>, ingestion=None, reasoning=<factory>, embedding=<factory>, default_max_results=10, cache_embeddings=True)
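The pipeline defaults documented for `is_pipeline_enabled`, `get_enabled_pipelines`, and `get_pipeline_retrieval_modules` can be sketched with plain dicts shaped like the `pipelines` and `search_modules` sections of the YAML. The functions below are illustrative re-implementations, not the methods themselves; in particular, returning an empty list for a configured pipeline that lacks `retrieval_modules` is an assumption.

```python
from __future__ import annotations

DEFAULT_PIPELINES = ("rag", "agent")


def is_pipeline_enabled(pipelines: dict[str, dict], name: str) -> bool:
    # Unconfigured pipelines default to enabled.
    return pipelines.get(name, {}).get("enabled", True)


def get_enabled_pipelines(pipelines: dict[str, dict]) -> list[str]:
    # Configured pipelines first, then any default pipeline not mentioned.
    names = list(pipelines) + [n for n in DEFAULT_PIPELINES if n not in pipelines]
    return [n for n in names if is_pipeline_enabled(pipelines, n)]


def get_retrieval_modules(
    pipelines: dict[str, dict], search_modules: dict[str, dict], name: str
) -> list[str]:
    if name in pipelines:
        # Assumption: a configured pipeline without the key gets an empty list.
        return list(pipelines[name].get("retrieval_modules", []))
    # No explicit pipeline config: fall back to the enabled search modules.
    return [m for m, cfg in search_modules.items() if cfg.get("enabled")]


pipelines = {"rag": {"enabled": True, "retrieval_modules": ["keyword", "semantic"]}}
search_modules = {"keyword": {"enabled": True}, "semantic": {"enabled": False}}

enabled = get_enabled_pipelines(pipelines)
rag_modules = get_retrieval_modules(pipelines, search_modules, "rag")
agent_modules = get_retrieval_modules(pipelines, search_modules, "agent")
```

In this example the unconfigured `agent` pipeline is still enabled and inherits only the enabled `keyword` module, while `rag` uses exactly the list it was given.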
- class osprey.services.ariel_search.config.SearchModuleConfig(enabled, provider=None, model=None, settings=<factory>)
Bases: object
Configuration for a single search module (keyword, semantic).
- enabled
Whether module is active
- Type:
bool
- provider
Provider name for embeddings (references api.providers section)
- Type:
str | None
- model
Model identifier for semantic modules - which model’s table to query
- Type:
str | None
- settings
Module-specific settings
- Type:
dict[str, Any]
- enabled: bool
- provider: str | None = None
- model: str | None = None
- settings: dict[str, Any]
- classmethod from_dict(data)
Create SearchModuleConfig from dictionary.
- Return type:
SearchModuleConfig
- __init__(enabled, provider=None, model=None, settings=<factory>)
- class osprey.services.ariel_search.config.PipelineModuleConfig(enabled=True, retrieval_modules=<factory>, settings=<factory>)
Bases: object
Configuration for a pipeline (rag, agent).
Pipelines compose search modules into higher-level execution strategies.
- enabled
Whether pipeline is active
- Type:
bool
- retrieval_modules
Which search modules this pipeline uses
- Type:
list[str]
- settings
Pipeline-specific settings
- Type:
dict[str, Any]
- enabled: bool = True
- retrieval_modules: list[str]
- settings: dict[str, Any]
- classmethod from_dict(data)
Create PipelineModuleConfig from dictionary.
- Return type:
PipelineModuleConfig
- __init__(enabled=True, retrieval_modules=<factory>, settings=<factory>)
- class osprey.services.ariel_search.config.EnhancementModuleConfig(enabled, provider=None, models=None, settings=<factory>)
Bases: object
Configuration for a single enhancement module (text_embedding, semantic_processor).
- enabled
Whether module is active
- Type:
bool
- provider
Provider name for embeddings (references api.providers section)
- Type:
str | None
- models
List of model configurations (for text_embedding)
- Type:
list[osprey.services.ariel_search.config.ModelConfig] | None
- settings
Module-specific settings
- Type:
dict[str, Any]
- enabled: bool
- provider: str | None = None
- models: list[ModelConfig] | None = None
- settings: dict[str, Any]
- classmethod from_dict(data)
Create EnhancementModuleConfig from dictionary.
- Return type:
EnhancementModuleConfig
- __init__(enabled, provider=None, models=None, settings=<factory>)
- class osprey.services.ariel_search.config.IngestionConfig(adapter, source_url=None, poll_interval_seconds=3600, proxy_url=None, verify_ssl=False, chunk_days=365, request_timeout_seconds=60, max_retries=3, retry_delay_seconds=5, watch=<factory>, write=<factory>)
Bases: object
Configuration for logbook ingestion.
- adapter
Adapter name (e.g., “als_logbook”, “generic_json”)
- Type:
str
- source_url
URL for source system API (optional)
- Type:
str | None
- poll_interval_seconds
Polling interval for incremental ingestion
- Type:
int
- proxy_url
SOCKS proxy URL (e.g., “socks5://localhost:9095”)
- Type:
str | None
- verify_ssl
Whether to verify SSL certificates (default: False for internal servers)
- Type:
bool
- chunk_days
Days per API request for time windowing (default: 365)
- Type:
int
- request_timeout_seconds
Timeout for HTTP requests (default: 60)
- Type:
int
- max_retries
Maximum retry attempts for failed requests (default: 3)
- Type:
int
- retry_delay_seconds
Base delay between retries (default: 5)
- Type:
int
- watch
Watch mode configuration
- Type:
osprey.services.ariel_search.config.WatchConfig
- write
Write operation configuration
- Type:
osprey.services.ariel_search.config.WriteConfig
- adapter: str
- source_url: str | None = None
- poll_interval_seconds: int = 3600
- proxy_url: str | None = None
- verify_ssl: bool = False
- chunk_days: int = 365
- request_timeout_seconds: int = 60
- max_retries: int = 3
- retry_delay_seconds: int = 5
- watch: WatchConfig
- write: WriteConfig
- classmethod from_dict(data)
Create IngestionConfig from dictionary.
- Return type:
IngestionConfig
- __init__(adapter, source_url=None, poll_interval_seconds=3600, proxy_url=None, verify_ssl=False, chunk_days=365, request_timeout_seconds=60, max_retries=3, retry_delay_seconds=5, watch=<factory>, write=<factory>)
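The `chunk_days` setting implies that a long ingestion range is split into fixed-size request windows. One plausible way to do that is sketched below; the `time_windows` generator is hypothetical, and the half-open window convention is an assumption rather than documented adapter behavior.

```python
from datetime import datetime, timedelta


def time_windows(start: datetime, end: datetime, chunk_days: int = 365):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    if chunk_days < 1:
        raise ValueError("chunk_days must be at least 1")
    step = timedelta(days=chunk_days)
    cursor = start
    while cursor < end:
        # Clamp the final window so it never runs past `end`.
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


windows = list(time_windows(datetime(2022, 1, 1), datetime(2024, 1, 1)))
for window_start, window_end in windows:
    print(window_start.date(), "->", window_end.date())
```

Smaller values of `chunk_days` mean more requests with smaller responses, which may help when a source API times out on large ranges.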
- class osprey.services.ariel_search.config.ReasoningConfig(provider='openai', model_id='gpt-4o-mini', max_iterations=5, temperature=0.1, tool_timeout_seconds=30, total_timeout_seconds=120)
Bases: object
Configuration for agentic reasoning behavior.
Uses Osprey’s provider configuration system for credentials. The provider field references api.providers for api_key and base_url.
- provider
Provider name (references api.providers section)
- Type:
str
- model_id
LLM model identifier (default: “gpt-4o-mini”)
- Type:
str
- max_iterations
Maximum ReAct cycles (default: 5)
- Type:
int
- temperature
LLM temperature (default: 0.1)
- Type:
float
- tool_timeout_seconds
Per-tool call timeout (default: 30)
- Type:
int
- total_timeout_seconds
Total agent execution timeout (default: 120)
- Type:
int
- provider: str = 'openai'
- model_id: str = 'gpt-4o-mini'
- max_iterations: int = 5
- temperature: float = 0.1
- tool_timeout_seconds: int = 30
- total_timeout_seconds: int = 120
- classmethod from_dict(data)
Create ReasoningConfig from dictionary.
- Return type:
ReasoningConfig
- __init__(provider='openai', model_id='gpt-4o-mini', max_iterations=5, temperature=0.1, tool_timeout_seconds=30, total_timeout_seconds=120)
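How the two timeouts and the iteration cap might interact is sketched below with `asyncio.wait_for`. This is a toy loop, not the ARIEL agent: `call_tool`, `run_agent`, and the `fast`/`slow` tools are hypothetical, and how the service enforces these limits is not documented here.

```python
import asyncio


async def call_tool(tool, tool_timeout: float):
    """Run one tool call, returning an error string if it exceeds its timeout."""
    try:
        return await asyncio.wait_for(tool(), timeout=tool_timeout)
    except asyncio.TimeoutError:
        return "tool timed out"


async def run_agent(tools, max_iterations=5, tool_timeout=30.0, total_timeout=120.0):
    async def loop():
        observations = []
        for tool in tools[:max_iterations]:  # one tool call per cycle in this toy loop
            observations.append(await call_tool(tool, tool_timeout))
        return observations

    # The outer limit bounds the whole run, regardless of per-tool outcomes.
    return await asyncio.wait_for(loop(), timeout=total_timeout)


async def fast():
    return "3 entries"


async def slow():
    await asyncio.sleep(1)
    return "never returned"


observations = asyncio.run(run_agent([fast, slow], tool_timeout=0.05, total_timeout=2.0))
print(observations)
```

Turning a per-tool timeout into an observation lets the loop continue with other tools, whereas exceeding the total timeout aborts the run.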
- class osprey.services.ariel_search.config.EmbeddingConfig(provider='ollama')
Bases: object
Configuration for embedding generation.
- provider
Provider name (uses central Osprey config)
- Type:
str
- provider: str = 'ollama'
- classmethod from_dict(data)
Create EmbeddingConfig from dictionary.
- Return type:
EmbeddingConfig
- __init__(provider='ollama')
- class osprey.services.ariel_search.config.DatabaseConfig(uri)
Bases: object
Configuration for ARIEL database connection.
- uri
PostgreSQL connection URI (e.g., “postgresql://localhost:5432/ariel”)
- Type:
str
- uri: str
- classmethod from_dict(data)
Create DatabaseConfig from dictionary.
- Return type:
DatabaseConfig
- __init__(uri)
- class osprey.services.ariel_search.config.ModelConfig(name, dimension, max_input_tokens=None)
Bases: object
Configuration for a single embedding model.
Used in text_embedding enhancement to specify which models to embed with during ingestion.
- name
Model name (e.g., “nomic-embed-text”)
- Type:
str
- dimension
Embedding dimension (must match model output)
- Type:
int
- max_input_tokens
Maximum input tokens for the model (optional)
- Type:
int | None
- name: str
- dimension: int
- max_input_tokens: int | None = None
- classmethod from_dict(data)
Create ModelConfig from dictionary.
- Return type:
ModelConfig
- __init__(name, dimension, max_input_tokens=None)
See also
- Search Modes: search modules, pipelines, and registration guide
- Data Ingestion: ingestion adapters, enhancement modules, and database schema