How to Count Tokens Effectively

Tue, 01 Apr 2025 17:52:00 +0100

I. What Is a Token?

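A token is the unit of text that a language model reads and generates. It can be a whole word, part of a word, a punctuation mark, or a special character. The counts below are typical for GPT-style tokenizers; exact values depend on the tokenizer.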

Text	Number of Tokens
Hello	1
I am a developer	4
Artificial intelligence is fascinating!	5
GPT is a powerful model.	6

In English, short words are often 1 token (e.g., "Hello" = 1 token).
In French and other languages, longer or accented words are more likely to be split into multiple tokens (e.g., "développeur" = 2 or more tokens, depending on the tokenizer).
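Punctuation marks also count as tokens (e.g., the final "!" or "." in the examples above).
A space is usually attached to the word that follows it rather than counted as a separate token.
Common acronyms (e.g., "GPT") are usually treated as 1 token; rare ones may be split into several.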
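
The rules above can be sketched as a small counter. This is only a rough approximation built on a regular expression, not a real subword (BPE) tokenizer: the names naive_tokenize and naive_token_count are made up for this illustration, and the sketch never splits a word, so it undercounts long or accented words such as "développeur". It does reproduce the counts in the table for the four English examples.

```python
import re

# Naive pattern, NOT a real BPE tokenizer (illustrative assumption):
#   " ?\w+"      a word, with one leading space attached to it if present
#   " ?[^\w\s]"  a single punctuation mark or symbol
#   "\s+"        any leftover whitespace (e.g., a run of several spaces)
_NAIVE_TOKEN = re.compile(r" ?\w+| ?[^\w\s]|\s+")


def naive_tokenize(text: str) -> list[str]:
    """Split text into approximate tokens using the rules of thumb above."""
    return _NAIVE_TOKEN.findall(text)


def naive_token_count(text: str) -> int:
    """Return the approximate number of tokens in text."""
    return len(naive_tokenize(text))


if __name__ == "__main__":
    samples = [
        "Hello",
        "I am a developer",
        "Artificial intelligence is fascinating!",
        "GPT is a powerful model.",
    ]
    for sample in samples:
        # Show each piece so the attached leading spaces are visible.
        print(naive_token_count(sample), naive_tokenize(sample))
```

For "I am a developer" the sketch yields the pieces "I", " am", " a", " developer", which shows how each space travels with the word that follows it. For exact counts, use the tokenizer that ships with the model you are targeting.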

In English, 100 tokens correspond to roughly 75 words on average (about 4 tokens for every 3 words), though the ratio varies with the language and writing style.
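
The 100-tokens-per-75-words rule of thumb can be turned into a quick estimator. The sketch below is a back-of-the-envelope calculation under that single assumption; the function names and the constant are hypothetical, and the result is an estimate, not what a real tokenizer would report.

```python
# Rule of thumb from the text: about 100 tokens per 75 words.
# This ratio is an average, so treat every result as an estimate.
TOKENS_PER_WORD = 100 / 75


def estimate_tokens_from_words(word_count: int) -> int:
    """Estimate a token count from a word count using the 100:75 ratio."""
    return round(word_count * TOKENS_PER_WORD)


def estimate_tokens_from_text(text: str) -> int:
    """Count whitespace-separated words, then apply the 100:75 ratio."""
    return estimate_tokens_from_words(len(text.split()))


if __name__ == "__main__":
    print(estimate_tokens_from_words(75))    # 100
    print(estimate_tokens_from_words(1500))  # 2000
```

An estimate like this is handy for budgeting a prompt before sending it, but it drifts for text with a lot of punctuation, code, or non-English words.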