Most devs don’t understand how LLM tokens work



Most devs use LLMs daily but don’t have a clue about some of the fundamentals. Understanding tokens is crucial because tokens determine how you’re billed, and why billing differs from provider to provider.

Become an AI Hero with the AI Hero newsletter:

https://www.aihero.dev/newsletter

Code: https://github.com/mattpocock/ai-sdk-tips/tree/main/exercises/03-tokens

00:00 Intro
00:33 Input & Output Tokens
01:11 Monitoring Token Usage
02:16 What are tokens?
02:50 Tiktoken
03:47 Full LLM Process
04:49 Building Token Vocabularies
05:25 Character-Level Tokenizer
06:37 Vocabulary Size
07:20 Subword-Level Tokenizer
08:33 Building Longer Subwords
08:58 Unusual Words
09:45 Summary
