How to Build an LLM from Scratch | An Overview



💡 Get 30 (free) AI project ideas: https://30aiprojects.com/

This is the 6th video in a series on using large language models (LLMs) in practice. Here, I review the key aspects of building a foundation LLM, drawing on how models such as GPT-3, Llama, Falcon, and others were developed.

More Resources:
▶️ Series Playlist: https://www.youtube.com/playlist?list=PLz-ep5RbHosU2hnz5ejezwaYpdMutMVB0
📰 Read more: https://medium.com/towards-data-science/how-to-build-an-llm-from-scratch-8c477768f1f9?sk=18c351c5cae9ac89df682dd14736a9f3

[1] BloombergGPT: https://arxiv.org/pdf/2303.17564.pdf
[2] Llama 2: https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/
[3] LLM Energy Costs: https://www.statista.com/statistics/1384401/energy-use-when-training-llm-models/
[4] arXiv:2005.14165 [cs.CL]
[5] Falcon 180B Blog: https://huggingface.co/blog/falcon-180b
[6] arXiv:2101.00027 [cs.CL]
[7] Alpaca Repo: https://github.com/gururise/AlpacaDataCleaned
[8] arXiv:2303.18223 [cs.CL]
[9] arXiv:2112.11446 [cs.CL]
[10] arXiv:1508.07909 [cs.CL]
[11] SentencePiece: https://github.com/google/sentencepiece/tree/master
[12] Tokenizers Doc: https://huggingface.co/docs/tokenizers/quicktour
[13] arXiv:1706.03762 [cs.CL]
[14] Andrej Karpathy Lecture: https://www.youtube.com/watch?v=kCc8FmEb1nY&t=5307s
[15] Hugging Face NLP Course: https://huggingface.co/learn/nlp-course/chapter1/7?fw=pt
[16] arXiv:1810.04805 [cs.CL]
[17] arXiv:1910.13461 [cs.CL]
[18] arXiv:1603.05027 [cs.CV]
[19] arXiv:1607.06450 [stat.ML]
[20] arXiv:1803.02155 [cs.CL]
[21] arXiv:2203.15556 [cs.CL]
[22] Train with Mixed Precision (NVIDIA): https://docs.nvidia.com/deeplearning/performance/mixed-precision-training/index.html
[23] DeepSpeed Doc: https://www.deepspeed.ai/training/
[24] https://paperswithcode.com/method/weight-decay
[25] https://towardsdatascience.com/what-is-gradient-clipping-b8e815cdfb48
[26] arXiv:2001.08361 [cs.LG]
[27] arXiv:1803.05457 [cs.AI]
[28] arXiv:1905.07830 [cs.CL]
[29] arXiv:2009.03300 [cs.CY]
[30] arXiv:2109.07958 [cs.CL]
[31] https://huggingface.co/blog/evaluating-mmlu-leaderboard
[32] https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf


Homepage: https://shawhintalebi.com/
Book a call: https://calendly.com/shawhintalebi

Intro – 0:00
How much does it cost? – 1:30
4 Key Steps – 3:55
Step 1: Data Curation – 4:19
1.1: Data Sources – 5:31
1.2: Data Diversity – 7:45
1.3: Data Preparation – 9:06
Step 2: Model Architecture (Transformers) – 13:17
2.1: 3 Types of Transformers – 15:13
2.2: Other Design Choices – 18:27
2.3: How big do I make it? – 22:45
Step 3: Training at Scale – 24:20
3.1: Training Stability – 26:52
3.2: Hyperparameters – 28:06
Step 4: Evaluation – 29:14
4.1: Multiple-choice Tasks – 30:22
4.2: Open-ended Tasks – 32:59
What’s next? – 34:31
