Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
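A minimal sketch of building that header in Python. The helper name `build_auth_headers` is hypothetical, and the `Content-Type` entry is an assumption (the page does not say how the body is encoded).

```python
def build_auth_headers(token: str) -> dict[str, str]:
    """Return HTTP headers carrying the bearer token."""
    if not token or token != token.strip():
        raise ValueError("token must be non-empty and have no surrounding whitespace")
    return {
        # Form taken from the text above: "Bearer <token>"
        "Authorization": f"Bearer {token}",
        # Assumption: the request body is sent as JSON
        "Content-Type": "application/json",
    }
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client you use.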
Body
The name of the model that will complete your prompt. Choose from our available Sonar models: sonar (lightweight search), sonar-pro (advanced search), sonar-deep-research (exhaustive research), sonar-reasoning (fast reasoning), or sonar-reasoning-pro (premier reasoning).
sonar, sonar-pro, sonar-deep-research, sonar-reasoning, sonar-reasoning-pro
"sonar"
A list of messages comprising the conversation so far.
[
{
"role": "system",
"content": "Be precise and concise."
},
{
"role": "user",
"content": "How many stars are there in our galaxy?"
}
]
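As a sketch, the model name and message list could be assembled into a request body like this. The field names `model` and `messages` are assumptions (the page describes the fields but the names did not survive here), and `build_body` is a hypothetical helper.

```python
import json

# Model names as listed in the description above
SONAR_MODELS = {
    "sonar",
    "sonar-pro",
    "sonar-deep-research",
    "sonar-reasoning",
    "sonar-reasoning-pro",
}


def build_body(model: str, system_prompt: str, user_prompt: str) -> dict:
    """Build a request body with a system message followed by a user message."""
    if model not in SONAR_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return {
        "model": model,  # assumed field name
        "messages": [    # assumed field name
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }


body = build_body(
    "sonar",
    "Be precise and concise.",
    "How many stars are there in our galaxy?",
)
print(json.dumps(body, indent=2))
```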
Perplexity-Specific: Controls how much computational effort the AI dedicates to each query for deep research models. 'low' provides faster, simpler answers with reduced token usage, 'medium' offers a balanced approach, and 'high' delivers deeper, more thorough responses with increased token usage. This parameter directly impacts the amount of reasoning tokens consumed. WARNING: This parameter is ONLY applicable for sonar-deep-research. Defaults to 'medium' when used with sonar-deep-research.
low, medium, high
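Because this setting only applies to sonar-deep-research, a client may want to leave it out for every other model. The sketch below does that; the field name `reasoning_effort` is an assumption and `reasoning_effort_fields` is a hypothetical helper.

```python
EFFORT_LEVELS = ("low", "medium", "high")


def reasoning_effort_fields(model: str, effort: str = "medium") -> dict[str, str]:
    """Return the effort setting for deep research, or nothing for other models."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    if model != "sonar-deep-research":
        # The parameter is documented as applicable to sonar-deep-research only
        return {}
    return {"reasoning_effort": effort}  # assumed field name
```

The returned dictionary can be merged into the request body, for example with `{**body, **reasoning_effort_fields(model, "high")}`.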
OpenAI Compatible: The maximum number of completion tokens returned by the API. Controls the length of the model's response. If the response would exceed this limit, it will be truncated. Higher values allow for longer responses but may increase processing time and costs.
The amount of randomness in the response, valued from 0 up to, but not including, 2. Lower values (e.g., 0.1) make the output more focused, deterministic, and less creative. Higher values (e.g., 1.5) make the output more random and creative. Use lower values for factual/information retrieval tasks and higher values for creative applications.
0 <= x < 2
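To illustrate why lower values give more focused output, here is a sketch of standard temperature scaling applied to a softmax. Whether the API implements temperature exactly this way is an assumption, as is treating 0 as greedy selection; the logits in the usage note are illustrative only.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after dividing them by the temperature."""
    if not 0 <= temperature < 2:
        raise ValueError("temperature must satisfy 0 <= x < 2")
    if temperature == 0:
        # Assumption: zero temperature means all mass on the most likely token
        best = max(range(len(logits)), key=logits.__getitem__)
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

With logits such as `[2.0, 1.0, 0.5]`, a temperature of 0.1 concentrates almost all probability on the first token, while 1.5 spreads it more evenly across the three.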
OpenAI Compatible: The nucleus sampling threshold, valued between 0 and 1. Controls the diversity of generated text by considering only the smallest set of most likely tokens whose cumulative probability reaches the top_p value. Lower values (e.g., 0.5) make the output more focused and deterministic, while higher values (e.g., 0.95) allow for more diverse outputs. Often used as an alternative to temperature.
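The following sketch shows nucleus sampling's filtering step as it is commonly defined: sort tokens by probability, keep them until the running total reaches top_p, then renormalize. It illustrates the general technique, not the service's internal implementation; `nucleus_filter` and the example distribution are hypothetical.

```python
def nucleus_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    kept: dict[str, float] = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    # Renormalize so the surviving probabilities sum to 1
    return {token: p / total for token, p in kept.items()}
```

For a distribution like `{"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}`, a top_p of 0.5 keeps only `a`, while 0.9 keeps `a`, `b`, and `c`.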
Perplexity-Specific: Determines whether search results should include images.
Perplexity-Specific: Determines whether related questions should be returned.
Perplexity-Specific: Filters search results based on time (e.g., 'week', 'day').
Perplexity-Specific: Filters search results to only include content published after this date. Format should be %m/%d/%Y (e.g., 3/1/2025).
Perplexity-Specific: Filters search results to only include content published before this date. Format should be %m/%d/%Y (e.g., 3/1/2025).
Perplexity-Specific: Filters search results to only include content last updated after this date. Format should be %m/%d/%Y (e.g., 3/1/2025).
Perplexity-Specific: Filters search results to only include content last updated before this date. Format should be %m/%d/%Y (e.g., 3/1/2025).
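A small sketch for producing and checking date strings in this format. The documented example (3/1/2025) has no zero padding, so the formatter below omits it; whether zero-padded values such as 03/01/2025 are also accepted is not stated on this page. Both helper names are hypothetical.

```python
from datetime import date, datetime


def to_filter_date(d: date) -> str:
    """Format a date as month/day/year without zero padding, e.g. 3/1/2025."""
    return f"{d.month}/{d.day}/{d.year}"


def parse_filter_date(text: str) -> date:
    """Parse a %m/%d/%Y string; raises ValueError if it does not match."""
    return datetime.strptime(text, "%m/%d/%Y").date()
```

Round-tripping a value through `parse_filter_date(to_filter_date(d))` is a cheap way to validate a date before sending it.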
OpenAI Compatible: The number of tokens to keep for top-k filtering. Limits the model to consider only the k most likely next tokens at each step. Lower values (e.g., 20) make the output more focused and deterministic, while higher values allow for more diverse outputs. A value of 0 disables this filter. Often used in conjunction with top_p to control output randomness.
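Top-k filtering can be sketched as follows: keep the k most likely tokens and renormalize, with 0 meaning no filtering as described above. This shows the general technique under that reading; `top_k_filter` is a hypothetical helper, not part of the API.

```python
def top_k_filter(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep only the k most likely tokens; k == 0 disables the filter."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return dict(probs)
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    # Renormalize so the surviving probabilities sum to 1
    return {token: p / total for token, p in top}
```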
OpenAI Compatible: Determines whether to stream the response incrementally.
OpenAI Compatible: Positive values increase the likelihood of discussing new topics. Applies a penalty to tokens that have already appeared in the text, encouraging the model to talk about new concepts. Values typically range from 0 (no penalty) to 2.0 (strong penalty). Higher values reduce repetition but may lead to more off-topic text.
OpenAI Compatible: Decreases likelihood of repetition based on prior frequency. Applies a penalty to tokens based on how frequently they've appeared in the text so far. Values typically range from 0 (no penalty) to 2.0 (strong penalty). Higher values (e.g., 1.5) reduce repetition of the same words and phrases. Useful for preventing the model from getting stuck in loops.
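The two penalties differ in what they count. One common formulation subtracts a flat amount from any token that has already appeared (presence) plus an amount proportional to how often it appeared (frequency). The sketch below assumes that formulation; the page does not state the exact formula the service uses, and `apply_penalties` is a hypothetical helper.

```python
from collections import Counter


def apply_penalties(
    logits: dict[str, float],
    generated: list[str],
    presence_penalty: float = 0.0,
    frequency_penalty: float = 0.0,
) -> dict[str, float]:
    """Lower the logits of tokens that already appear in the generated text."""
    counts = Counter(generated)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            # Presence: flat penalty once. Frequency: scaled by occurrence count.
            adjusted[token] -= presence_penalty + frequency_penalty * count
    return adjusted
```

Tokens that have not appeared yet keep their original logits, which is why higher penalties push the model toward new words and topics.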
Enables structured JSON output formatting.
Perplexity-Specific: Configuration for using web search in model responses.
{ "search_context_size": "high" }
Response
OK
A unique identifier for the chat completion.
The model that generated the response.
The Unix timestamp (in seconds) of when the chat completion was created.
The type of object, which is always chat.completion.
A list of chat completion choices. Can be more than one if n is greater than 1.
A list of search results related to the response.
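A sketch of reading these response fields from a decoded JSON body. The key names `id`, `model`, `created`, `object`, `choices`, and `search_results` are assumptions inferred from the descriptions above, and `CompletionSummary` and `summarize` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CompletionSummary:
    id: str
    model: str
    created: datetime
    choice_count: int
    search_result_count: int


def summarize(response: dict) -> CompletionSummary:
    """Check the object type and pull out the top-level response fields."""
    if response.get("object") != "chat.completion":
        raise ValueError(f"unexpected object type: {response.get('object')!r}")
    return CompletionSummary(
        id=response["id"],
        model=response["model"],
        # "created" is a Unix timestamp in seconds
        created=datetime.fromtimestamp(response["created"], tz=timezone.utc),
        choice_count=len(response["choices"]),
        # Assumption: the search results list may be absent
        search_result_count=len(response.get("search_results", [])),
    )
```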