LLM Parameters Guide
Learn how to control and shape language model behavior with core LLM parameters
Structured Outputs
Return model responses in predictable, machine-readable formats like JSON.
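A minimal sketch of the consumer side: once the model is asked for a structured response, the application can parse it and check the expected shape. The `raw` string and field names here are hypothetical stand-ins for a real model reply.

```python
import json

# Hypothetical raw model response, assumed to follow a requested schema.
raw = '{"title": "Dune", "year": 1965, "tags": ["sci-fi", "classic"]}'

def parse_book(text):
    """Parse a structured response and verify its shape (illustrative only)."""
    data = json.loads(text)
    assert isinstance(data["title"], str)
    assert isinstance(data["year"], int)
    assert all(isinstance(t, str) for t in data["tags"])
    return data

book = parse_book(raw)
```

Because the output is machine-readable, downstream code can use fields like `book["year"]` directly instead of scraping free text.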
Top K
Control randomness by limiting token selection to the top K most likely options.
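The mechanics can be sketched in plain Python: mask every logit outside the top K before applying softmax, so only the K most likely tokens can be sampled.

```python
import math

def top_k_filter(logits, k):
    """Keep only the k highest logits; mask the rest with -inf.
    (Ties at the threshold may keep slightly more than k tokens.)"""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else float("-inf") for x in logits]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # exp(-inf) -> 0.0
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(top_k_filter(logits, k=2))
# Only the two most likely tokens keep nonzero probability.
```

With `k=1` this degenerates to greedy decoding; larger `k` admits more randomness.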
Top P (Nucleus Sampling)
Limit the token selection to a dynamically chosen set based on cumulative probability.
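A minimal sketch of the nucleus: sort tokens by probability, keep the smallest prefix whose cumulative mass reaches p, and renormalize over that set.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    kept = [probs[i] if i in keep else 0.0 for i in range(len(probs))]
    total = sum(kept)
    return [x / total for x in kept]  # renormalize over the nucleus

probs = softmax([3.0, 1.5, 1.0, 0.2])
filtered = top_p_filter(probs, p=0.9)
```

Unlike Top K, the number of candidate tokens adapts: a confident distribution yields a small nucleus, a flat one a large nucleus.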
Tool Calling
Let the model return structured tool calls that can trigger backend actions.
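A minimal dispatch sketch: the tool name, the `get_weather` function, and the call payload below are all hypothetical, but they show the common pattern of routing a model-emitted tool call to backend code.

```python
import json

def get_weather(city):
    """Placeholder backend action (hypothetical)."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call):
    """Route a model tool call {"name": ..., "arguments": "<json>"} to a function."""
    fn = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return fn(**args)

result = dispatch({"name": "get_weather", "arguments": '{"city": "Paris"}'})
```

In a real loop, `result` would be sent back to the model so it can compose a final answer.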
JSON Mode
Force the model to return only valid JSON without extra text or formatting.
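When JSON mode is on, the body is assumed to be bare JSON and `json.loads` suffices; without it, models often wrap JSON in a markdown code fence. A defensive parser covering both cases (the fence string is built with `"`" * 3` only to avoid nesting fences in this document):

```python
import json

def parse_json_reply(text):
    """Parse a reply that should be JSON, tolerating a markdown code fence."""
    fence = "`" * 3
    body = text.strip()
    if body.startswith(fence):
        lines = body.splitlines()
        body = "\n".join(lines[1:-1])  # drop opening/closing fence lines
    return json.loads(body)

data = parse_json_reply('{"sentiment": "positive", "score": 0.92}')
```

With JSON mode enforced server-side, the fence-stripping branch should never trigger.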
Streaming
Stream model output token-by-token in real time to improve responsiveness.
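The consumption pattern can be sketched with a generator standing in for a streaming API: the client handles each chunk as it arrives instead of waiting for the full reply.

```python
def fake_stream(text):
    """Stand-in for a streaming API: yields one token (word) at a time."""
    for token in text.split(" "):
        yield token + " "

pieces = []
for chunk in fake_stream("Streaming shows partial output immediately"):
    pieces.append(chunk)  # in a real app: render the chunk to the UI here
reply = "".join(pieces).rstrip()
```

The total generation time is unchanged; what improves is perceived latency, since the first tokens appear almost immediately.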
Max Tokens
Limit how many tokens the model can generate in its response.
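Conceptually, the decode loop simply halts once the budget is spent, which is why a too-low limit can cut a response mid-sentence. A toy sketch:

```python
def generate(token_stream, max_tokens):
    """Emit tokens until the model stops or max_tokens is reached (sketch)."""
    out = []
    for token in token_stream:
        if len(out) >= max_tokens:
            break  # budget exhausted: output may end mid-sentence
        out.append(token)
    return out

truncated = generate(["The", "answer", "is", "forty", "two", "."], max_tokens=4)
```

`max_tokens` caps cost and latency; it does not make the model write more concisely.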
Seed
Get repeatable results from the model by fixing the random generation seed.
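The idea can be illustrated with a local random generator playing the role of the API's `seed` parameter: the same seed yields the same sampled sequence.

```python
import random

def sample_tokens(vocab, n, seed):
    """Sample n tokens reproducibly by fixing the RNG seed (sketch)."""
    rng = random.Random(seed)  # local RNG, analogous to an API `seed` param
    return [rng.choice(vocab) for _ in range(n)]

a = sample_tokens(["the", "a", "an"], n=5, seed=42)
b = sample_tokens(["the", "a", "an"], n=5, seed=42)
# a == b: same seed, same sequence
```

Note that hosted APIs typically offer best-effort determinism; backend changes can still alter results across deployments.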
Stop Sequence
Instruct the model to stop generating once a specific string is produced.
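Client-side, the effect is equivalent to truncating at the first occurrence of the stop string; the example text below is hypothetical.

```python
def apply_stop(generated, stop):
    """Truncate output at the first occurrence of the stop sequence."""
    idx = generated.find(stop)
    return generated[:idx] if idx != -1 else generated

text = apply_stop("Answer: 42\nQuestion: what next?", stop="\nQuestion:")
```

This is useful for few-shot prompts, where the model would otherwise keep generating the next example.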
Presence Penalty
Discourage the model from repeating ideas or concepts it has already mentioned.
Frequency Penalty
Reduce how often the same token appears in a single response.
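Both penalties can be sketched together. Following the commonly documented additive form, the presence penalty subtracts a fixed amount from any token that has appeared at all, while the frequency penalty subtracts proportionally to how often it has appeared.

```python
from collections import Counter

def penalize(logits, generated_ids, presence_penalty, frequency_penalty):
    """Additive penalties (sketch):
    logit[t] -= presence_penalty * (t seen at all)
              + frequency_penalty * (times t was seen)."""
    counts = Counter(generated_ids)
    adjusted = list(logits)
    for tok, n in counts.items():
        adjusted[tok] -= presence_penalty + frequency_penalty * n
    return adjusted

logits = [2.0, 2.0, 2.0]
out = penalize(logits, generated_ids=[0, 0, 1],
               presence_penalty=0.5, frequency_penalty=0.3)
# token 0 (seen twice) is penalized more than token 1 (seen once);
# token 2 (unseen) is untouched.
```

Presence penalty nudges the model toward new topics; frequency penalty specifically suppresses verbatim repetition.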
Logprobs
Inspect the model's confidence in each token it generates by returning probabilities.
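The values returned are log-probabilities, i.e. the log-softmax of the logits; a value closer to 0 means higher confidence in that token.

```python
import math

def log_softmax(logits):
    """Convert logits to log-probabilities (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

logprobs = log_softmax([2.0, 1.0, 0.1])
# each value is log P(token); exp(logprob) recovers the probability
```

Applications include confidence scoring, filtering low-certainty answers, and computing perplexity over a generation.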
Temperature
Control the creativity of the model's output by adjusting randomness.
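Mechanically, temperature divides the logits before softmax: T < 1 sharpens the distribution toward the top token, T > 1 flattens it toward uniform. A minimal sketch:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, temperature=0.2)  # near-greedy
hot = softmax_with_temperature(logits, temperature=2.0)   # closer to uniform
```

As temperature approaches 0 the sampling becomes effectively greedy; very high values make low-probability tokens much more likely to be chosen.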