Nested Learning: The Illusion of Deep Learning Architectures – Paper Walkthrough
This video explains the core ideas behind Nested Learning (NL), a machine learning paradigm that reframes deep learning architectures as multi-level optimization systems with memory operating at multiple time scales. We cover how NL reinterprets optimizers as associative memory, introduces Deep Optimizers such as DMGD, and extends memory through Continuum Memory Systems, and we look at the paper's reported results showing the HOPE architecture outperforming Transformers and modern recurrent models on language modeling and reasoning tasks.
*Related Videos*
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
DeepSeek-R1: https://youtu.be/RmTyzx-17Do
The Illusion of Thinking: https://youtu.be/VBP3FojvpJw
SAM2: https://youtu.be/DCtEpYFOtiQ
Google’s AlphaEvolve: https://youtu.be/d1_9p498jbg
Chain-of-Verification (COVE) Reduces Hallucination in Large Language Models: https://youtu.be/Lar3K2gN454
Why Language Models Hallucinate: https://youtu.be/R5YRdJGeZTM
Transformer Self-Attention Mechanism Explained: https://youtu.be/u8pSGp__0Xk
Jailbroken: How Does LLM Safety Training Fail? – Paper Explained: https://youtu.be/sKEZChVe6AQ
How to Fine-tune Large Language Models Like ChatGPT with Low-Rank Adaptation (LoRA): https://youtu.be/CNmsM6JGJz0
Multi-Head Attention (MHA), Multi-Query Attention (MQA), Grouped Query Attention (GQA) Explained: https://youtu.be/o68RRGxAtDo
LLM Prompt Engineering with Random Sampling: Temperature, Top-k, Top-p: https://youtu.be/-BBulGM6xF0
*Follow Me*
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
🐦 X: @datamlistic https://x.com/datamlistic
📸 Instagram: @datamlistic https://www.instagram.com/datamlistic
📱 TikTok: @datamlistic https://www.tiktok.com/@datamlistic
*Channel Support*
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
The best way to support the channel is to share the content. 😉
If you’d also like to support the channel financially, donating the price of a coffee is always warmly welcomed! (completely optional and voluntary)
► Patreon: https://www.patreon.com/datamlistic
► Bitcoin (BTC): 3C6Pkzyb5CjAUYrJxmpCaaNPVRgRVxxyTq
► Ethereum (ETH): 0x9Ac4eB94386C3e02b96599C05B7a8C71773c9281
► Cardano (ADA): addr1v95rfxlslfzkvd8sr3exkh7st4qmgj4ywf5zcaxgqgdyunsj5juw5
► Tether (USDT): 0xeC261d9b2EE4B6997a6a424067af165BAA4afE1a
#qwen3i #llm #airesearch #machinelearning
