Frequency Penalty

Frequency penalty reduces the chance that the model will repeat the same tokens multiple times. It helps discourage overused words or phrases in the generated output.

How does it work

As the model generates tokens, it tracks how many times each token has already appeared. The more frequently a token appears, the more its probability is penalized. This encourages the model to vary its wording rather than reuse the same terms repeatedly.

When to use frequency penalty

When the model repeats words or phrases too often
When you want cleaner, less redundant output
When generating lists, paragraphs, or creative writing
When improving the readability of longer completions

How to use frequency penalty

Set a value between -2.0 and 2.0. Most common values range from 0.5 to 1.5
Add frequency_penalty to your API call
Test the output. Increase the penalty if repetition remains
Adjust in combination with other parameters like temperature or presence_penalty

Tips

Higher values reduce repetition but may impact coherence if set too high
Presence penalty penalizes whether a token has appeared at all; frequency penalty is based on how many times it appears
Use in chatbots, content generation, and summaries where repeated language is common
Can be especially helpful when the model gets stuck repeating similar phrases

LLM Parameters Guide

Frequency Penalty

How does it work

When to use frequency penalty

How to use frequency penalty

Tips