Presence Penalty

Presence penalty is a parameter that reduces the likelihood of the model repeating tokens it has already mentioned. It encourages the model to introduce new topics or ideas, rather than sticking to the same concepts.

How does it work

During generation, the model keeps track of which tokens have already appeared. When a presence penalty is applied, the probability of repeating those tokens is lowered. This nudges the model to explore other directions or vocabulary.

When to use presence penalty

When the model repeats phrases or facts too often
When you want more diverse or creative outputs
When generating brainstorms, ideas, or lists
When encouraging the model to move forward instead of circling back

How to use presence penalty

Set a value between -2.0 and 2.0. Most commonly between 0.5 and 1.5
Add presence_penalty to your API call
Test with different types of prompts. Especially open-ended or creative ones
Adjust based on how much repetition you're seeing

Tips

Higher values result in more novelty, but can reduce coherence
Combine with frequency penalty for finer control over repetition (presence looks at whether something has appeared; frequency looks at how often)
Use with temperature > 0 to let the model explore a wider range of outputs
Not ideal for factual or structured tasks, may push the model off-topic

LLM Parameters Guide

Presence Penalty

How does it work

When to use presence penalty

How to use presence penalty

Tips