Learn to align LLMs through post-training in this new course with AMD!
Learn more: https://bit.ly/47ict9O
Learn to align and optimize LLMs for real-world applications through post-training. In this course, created in partnership with AMD, you’ll learn how to apply fine-tuning and reinforcement learning techniques to shape model behavior, improve reasoning, and make LLMs safer and more reliable.
Large language models are powerful, but raw pretrained models aren’t ready for production applications. Post-training is what adapts an LLM to follow instructions, show reasoning, and behave more safely.
Many developers still assume that "LLMs inherently hallucinate" or that "only experts can tune models." Recent advances have changed what's feasible. If you ship LLM features (e.g., developer copilots, customer support agents, internal assistants) or work on ML/AI platform teams, understanding post-training is becoming a must-have skill.
This course, consisting of 5 modules and taught by Sharon Zhou (VP of AI at AMD and instructor of popular DeepLearning.AI courses), will guide you through the key aspects of post-training:
– Post-training in the LLM lifecycle: Learn where post-training fits, key ideas in fine-tuning and RL, how models gain reasoning, and how these methods power products.
– Core techniques: Understand fine-tuning, RLHF, reward modeling, and RL algorithms (PPO, GRPO). Use LoRA for efficient fine-tuning (see the sketch after this list).
– Evaluation and error analysis: Design evals, detect reward hacking, diagnose failures, and red team to test model robustness.
– Data for post-training: Prepare fine-tuning/LoRA datasets, combine fine-tuning + RLHF, create synthetic data, and balance data and rewards.
– From post-training to production: Study industry-leading production pipelines, set go/no-go rules, and run data feedback loops from your logs.
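To give a feel for the efficient fine-tuning mentioned above, here is a minimal sketch of LoRA using the Hugging Face PEFT library. The base model, rank, and target modules are illustrative assumptions, not the course's actual setup:

```python
# Minimal LoRA fine-tuning sketch (assumes the "transformers" and "peft"
# libraries; model choice and hyperparameters are placeholders).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "gpt2"  # placeholder base checkpoint; swap in your own model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA injects small trainable low-rank adapter matrices into selected
# layers, so only a tiny fraction of parameters is updated during training.
config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor applied to the adapter output
    target_modules=["c_attn"],  # GPT-2's fused attention projection (model-specific)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all weights

# From here, train with your usual Trainer or custom loop on
# instruction-style (prompt, response) pairs.
```

Because only the adapter weights are trained, this kind of setup fits on a single GPU for many models, which is what makes post-training accessible beyond large research teams.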
Enroll now: https://bit.ly/47ict9O
