Most devs use LLMs daily but are missing some of the fundamentals. Understanding tokens is crucial: they are what you’re billed for, and they explain why billing differs between providers.
Become an AI Hero with the AI Hero newsletter:
https://www.aihero.dev/newsletter
Code: https://github.com/mattpocock/ai-sdk-tips/tree/main/exercises/03-tokens
00:00 Intro
00:33 Input & Output Tokens
01:11 Monitoring Token Usage
02:16 What are tokens?
02:50 Tiktoken
03:47 Full LLM Process
04:49 Building Token Vocabularies
05:25 Character-Level Tokenizer
06:37 Vocabulary Size
07:20 Subword-Level Tokenizer
08:33 Building Longer Subwords
08:58 Unusual Words
09:45 Summary
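
For a taste of the character-level tokenizer chapter, here is a minimal sketch in TypeScript. It is not the code from the video or the linked repo; the names `buildVocab`, `encode`, and `decode` are made up for illustration. It assumes a vocabulary built from a single tiny corpus, where every distinct character gets the next free integer ID.

```typescript
// Hypothetical helper names; a sketch of a character-level tokenizer,
// not the implementation used in the video or by any real provider.
type Vocab = Record<string, number>;

// Assign each distinct character in the corpus the next free integer ID.
function buildVocab(corpus: string): Vocab {
  const vocab: Vocab = {};
  let nextId = 0;
  corpus.split("").forEach((ch) => {
    if (vocab[ch] === undefined) {
      vocab[ch] = nextId;
      nextId += 1;
    }
  });
  return vocab;
}

// Turn text into token IDs, one per character.
function encode(text: string, vocab: Vocab): number[] {
  return text.split("").map((ch) => {
    const id = vocab[ch];
    if (id === undefined) {
      throw new Error(`Character not in vocabulary: ${ch}`);
    }
    return id;
  });
}

// Turn token IDs back into text by reversing the vocabulary lookup.
function decode(ids: number[], vocab: Vocab): string {
  const chars: string[] = [];
  Object.keys(vocab).forEach((ch) => {
    chars[vocab[ch]] = ch;
  });
  return ids.map((id) => chars[id]).join("");
}

const vocab = buildVocab("hello world");
const ids = encode("hello", vocab);
console.log(ids); // [0, 1, 2, 2, 3]
console.log(decode(ids, vocab)); // "hello"
```

Every character costs one token here, so the vocabulary stays tiny but sequences get long. Subword tokenizers trade a larger vocabulary for shorter sequences.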
