The
search_context_size parameter allows you to control how much search context is retrieved from the web during query resolution, letting you balance cost and comprehensiveness.- Default
search_context_sizeislow - Selecting
"high"increases search costs due to more extensive web retrieval. Use"low"when cost efficiency is critical.
Overview
Thesearch_context_size field—passed via the web_search_options object—determines how much search context is retrieved by the Sonar models. This setting can help you optimize for either:
- Cost savings with minimal search input (
low) - Comprehensive answers by maximizing retrieved information (
high) - A balance of both (
medium)
Best Practices
Choosing the Right Context Sizelow: Best for short factual queries or when operating under strict token cost constraints.medium: The default and best suited for general use cases.high: Use for deep research, exploratory questions, or when citations and evidence coverage are critical.
- Selecting
lowormediumcan significantly reduce overall token usage, especially at scale. - Consider defaulting to
lowfor high-volume endpoints and selectively upgrading tohighfor complex user prompts.
- You can use
search_context_sizealongside other features likesearch_domain_filterto further control the scope of search. - Combining
mediumwith a focused domain filter often gives a good tradeoff between quality and cost.
- Larger context sizes may slightly increase response latency due to more extensive search and reranking.
- If you’re batching queries or supporting real-time interfaces, test with different settings to balance user experience and runtime.
Examples
1. Minimal Search Context (“low”) This option limits the search context retrieved for the model, reducing cost per request while still producing useful responses for simpler questions.Request
low when cost optimization is more important than answer completeness.
2. Comprehensive Search Context (“high”)
This option maximizes the amount of search context used to answer the question, resulting in more thorough and nuanced responses.
Request
high for research-heavy or nuanced queries where coverage matters more than cost.
⸻