
Streaming Responses

Streaming allows you to receive partial responses from the Perplexity API as they are generated, rather than waiting for the complete response. This is particularly useful for real-time user experiences, long responses, and interactive applications.
Streaming is supported across all models available through the Agentic Research API.
To enable streaming, set stream=True (Python) or stream: true (TypeScript) when creating responses:
from perplexity import Perplexity

client = Perplexity()

# Create streaming response
stream = client.responses.create(
    preset="fast-search",
    input="What is the latest in AI research?",
    stream=True
)

# Process streaming response
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="")
    elif event.type == "response.completed":
        print(f"\n\nCompleted: {event.response.usage}")

Error Handling

Handle errors gracefully during streaming:
import perplexity
from perplexity import Perplexity

client = Perplexity()

try:
    stream = client.responses.create(
        preset="fast-search",
        input="Explain machine learning concepts",
        stream=True
    )
    
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="")
        elif event.type == "response.completed":
            print(f"\n\nCompleted: {event.response.usage}")
            
except perplexity.APIConnectionError as e:
    print(f"Network connection failed: {e}")
except perplexity.RateLimitError as e:
    print(f"Rate limit exceeded, please retry later: {e}")
except perplexity.APIStatusError as e:
    print(f"API error {e.status_code}: {e.response}")
If displaying search results immediately is critical to your real-time user experience, consider using a non-streaming request instead.
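
A minimal non-streaming sketch, assuming the response object exposes the search_results field referenced in the structured outputs section below (check the exact field name in your SDK version):

from perplexity import Perplexity

client = Perplexity()

# Non-streaming request: the full response, including any search results, arrives at once
response = client.responses.create(
    preset="fast-search",
    input="What is the latest in AI research?"
)

print(response.output_text)

# Assumption: search results are exposed on the response as `search_results`
for result in getattr(response, "search_results", None) or []:
    print(result)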

Structured Outputs

Structured outputs enable you to enforce specific response formats from Perplexity’s models, ensuring consistent, machine-readable data that can be directly integrated into your applications without manual parsing. We currently support JSON Schema structured outputs. To enable structured outputs, add a response_format field to your request:
{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "your_schema_name",
      "schema": { /* your JSON schema object */ }
    }
  }
}
The name field is required and must be 1-64 alphanumeric characters. The schema should be a valid JSON schema object. LLM responses will match the specified format unless the output exceeds max_tokens.
Improve Schema Compliance: Give the LLM some hints about the output format in your prompts to improve adherence to the structured format. For example, include phrases like “Please return the data as a JSON object with the following structure…” or “Extract the information and format it as specified in the schema.”
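
For instance, a request can combine such a prompt hint with the response_format field (a sketch only; the mountain schema below is an illustrative example, not part of the API):

from perplexity import Perplexity

client = Perplexity()

# The prompt hint mirrors the structure requested in the schema below
response = client.responses.create(
    preset="pro-search",
    input=(
        "What is the tallest mountain on each continent? "
        "Please return the data as a JSON object with a 'mountains' list, "
        "where each entry has 'continent', 'mountain', and 'height_meters'."
    ),
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "tallest_mountains",
            "schema": {
                "type": "object",
                "properties": {
                    "mountains": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "continent": {"type": "string"},
                                "mountain": {"type": "string"},
                                "height_meters": {"type": "number"}
                            },
                            "required": ["continent", "mountain", "height_meters"]
                        }
                    }
                },
                "required": ["mountains"]
            }
        }
    }
)

print(response.output_text)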
The first request with a new JSON Schema can incur a delay on the first token: preparing the schema typically takes 10 to 30 seconds and may result in timeout errors. Once the schema has been prepared, subsequent requests will not see this delay.
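
If that first request does time out, a simple retry after a short wait is usually enough. A minimal sketch, assuming the SDK raises perplexity.APITimeoutError alongside the exception classes shown in the streaming section (verify the exact exception name for your SDK version):

import time

import perplexity
from perplexity import Perplexity

client = Perplexity()

def create_with_retry(request_kwargs, max_attempts=2, wait_seconds=15):
    """Retry once if the first request with a new schema times out."""
    for attempt in range(max_attempts):
        try:
            return client.responses.create(**request_kwargs)
        except perplexity.APITimeoutError:  # assumption: timeout exception name
            if attempt == max_attempts - 1:
                raise
            time.sleep(wait_seconds)  # allow schema preparation to finish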

Example

from perplexity import Perplexity
from typing import List, Optional
from pydantic import BaseModel

class FinancialMetrics(BaseModel):
    company: str
    quarter: str
    revenue: float
    net_income: float
    eps: float
    revenue_growth_yoy: Optional[float] = None
    key_highlights: Optional[List[str]] = None

client = Perplexity()

response = client.responses.create(
    preset="pro-search",
    input="Analyze the latest quarterly earnings report for Apple Inc. Extract key financial metrics.",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "financial_metrics",
            "schema": FinancialMetrics.model_json_schema()
        }
    }
)

metrics = FinancialMetrics.model_validate_json(response.output_text)
print(f"Revenue: ${metrics.revenue}B")
Links in JSON Responses: Requesting links as part of a JSON response may not work reliably and can result in hallucinated or broken links, because models may generate invalid URLs when forced to include links directly in structured outputs. To ensure all links are valid, use the links returned in the citations or search_results fields of the API response; do not rely on the model to return valid links as part of the JSON response content.
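
Continuing the financial metrics example above, a sketch of this pattern, assuming each entry in response.search_results exposes its link as a url attribute (adjust to match your SDK version):

# Parse the structured output, but take links from search_results instead of the JSON
metrics = FinancialMetrics.model_validate_json(response.output_text)

# Assumption: each search result exposes its link as `url`
source_links = [result.url for result in (response.search_results or [])]

print(f"{metrics.company} {metrics.quarter}: revenue ${metrics.revenue}B")
for link in source_links:
    print(f"Source: {link}")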

Next Steps