Presence Penalty

Presence penalty is a parameter that reduces the likelihood of the model repeating tokens it has already mentioned. It encourages the model to introduce new topics or ideas, rather than sticking to the same concepts.

How does it work

During generation, the model keeps track of which tokens have already appeared. When a presence penalty is applied, the probability of repeating those tokens is lowered. This nudges the model to explore other directions or vocabulary.

When to use presence penalty

  • When the model repeats phrases or facts too often
  • When you want more diverse or creative outputs
  • When generating brainstorms, ideas, or lists
  • When encouraging the model to move forward instead of circling back

How to use presence penalty

  1. Set a value between -2.0 and 2.0. Most commonly between 0.5 and 1.5
  2. Add presence_penalty to your API call
  3. Test with different types of prompts. Especially open-ended or creative ones
  4. Adjust based on how much repetition you're seeing

Tips

  • Higher values result in more novelty, but can reduce coherence
  • Combine with frequency penalty for finer control over repetition (presence looks at whether something has appeared; frequency looks at how often)
  • Use with temperature > 0 to let the model explore a wider range of outputs
  • Not ideal for factual or structured tasks, may push the model off-topic