Overview
The Sonar API provides powerful features for building production-ready applications. This guide covers three core capabilities: streaming responses for real-time experiences, structured outputs for consistent data formats, and effective prompting strategies for web search models.Streaming Responses
Streaming allows you to receive partial responses from the Sonar API as they are generated, rather than waiting for the complete response. This is particularly useful for real-time user experiences, long responses, and interactive applications.Streaming is supported across all Sonar models.
How Streaming Works
When streaming, you receive:- Content chunks which arrive progressively in real-time
- Search results (delivered in the final chunk(s))
- Usage stats and other metadata
Example
Structured Outputs
Structured outputs enable you to enforce specific response formats from Perplexity’s models, ensuring consistent, machine-readable data that can be directly integrated into your applications without manual parsing. We support JSON Schema structured outputs. To enable structured outputs, add aresponse_format field to your request with the following structure:
The first request with a new JSON Schema may incur a delay on the first token (typically 10-30 seconds) as the schema is prepared. Subsequent requests will not see this delay.
Example: Financial Analysis
Prompting Best Practices
Sonar models combine the capabilities of LLMs with real-time web searches. Understanding how they differ from traditional LLMs will help you craft more effective prompts.System and User Prompts
System Prompt: Use the system prompt (role: "system") to provide instructions related to style, tone, and language of the response.
User Prompt: Use the user prompt (role: "user") to pass in the actual query. The user prompt will be used to kick off a real-time web search to ensure the answer has the latest and most relevant information.
Best Practices
Be Specific and Contextual
Sonar models require specificity to retrieve relevant search results. Adding just 2-3 extra words of context can dramatically improve performance.Good: “Explain recent advances in climate prediction models for urban planning”Poor: “Tell me about climate models”
Avoid Few-Shot Prompting
Few-shot prompting confuses web search models by triggering searches for your examples rather than your actual query.Good: “Summarize the current research on mRNA vaccine technology”Poor: “Here’s an example of a good summary: [example]. Now summarize mRNA vaccines.”
Use Built-in Search Parameters
Always use Perplexity’s built-in search parameters (like
search_domain_filter) instead of trying to control search behavior through prompts. API parameters are guaranteed to work and are much more effective.