Nested Learning: The Illusion of Deep Learning Architectures
The paper “Nested Learning: The Illusion of Deep Learning Architectures” introduces Nested Learning (NL), a paradigm that models a machine learning system as a set of nested, multi-level optimization problems, each with its own “context flow.” The authors argue that existing deep learning methods, including Transformers, learn by compressing their context flow, which explains phenomena such as in-context learning. NL is presented as a path toward more expressive learning algorithms and yields three core contributions: Deep Optimizers, which reformulate gradient-based optimizers as associative memory modules; Self-Modifying Titans, a sequence model that learns its own update algorithm; and a Continuum Memory System, which generalizes long- and short-term memory concepts. The paper culminates in the HOPE architecture, a self-referential learning module built on the continuum memory system that shows strong performance against several baselines on language modeling and common-sense reasoning tasks. Finally, the source includes a NeurIPS checklist addressing the paper’s claims, limitations, reproducibility, and ethical considerations.
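
To make the optimizer-as-nested-learner framing concrete, here is a minimal NumPy sketch (an illustration under my own assumptions, not code from the paper). Classical EMA momentum, m_t = beta*m_{t-1} + (1-beta)*g_t, is algebraically identical to one gradient-descent step on the inner objective 0.5*||m - g_t||^2 with step size (1-beta), so the optimizer state can be read as a second, nested learner that compresses the stream of gradients, which is the two-level structure NL generalizes.

import numpy as np

def inner_step(m, g, beta=0.9):
    # Inner level: one gradient step on 0.5*||m - g||^2 with step size (1 - beta).
    # This equals the EMA momentum update beta*m + (1 - beta)*g.
    return m - (1.0 - beta) * (m - g)

def outer_step(w, m, lr=0.05):
    # Outer level: move the weights along the compressed gradient "memory".
    return w - lr * m

rng = np.random.default_rng(0)
w = rng.normal(size=3)          # outer-level parameters
m = np.zeros_like(w)            # inner-level state: the momentum memory
for _ in range(200):
    g = 2.0 * w + 0.1 * rng.normal(size=3)   # noisy gradient of ||w||^2
    m = inner_step(m, g)        # inner problem: compress the gradient stream
    w = outer_step(w, m)        # outer problem: update the weights
print(w)                        # approaches the optimum at 0

Each level has its own update rule and its own context flow (the momentum sees raw gradients, the weights see compressed ones), which is the sense in which NL treats an optimizer as a nested optimization problem in its own right.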
paper – https://abehrouz.github.io/files/NL.pdf
subscribe – https://t.me/arxivpaper
donations:
USDT: 0xAA7B976c6A9A7ccC97A3B55B7fb353b6Cc8D1ef7
BTC: bc1q8972egrt38f5ye5klv3yye0996k2jjsz2zthpr
ETH: 0xAA7B976c6A9A7ccC97A3B55B7fb353b6Cc8D1ef7
SOL: DXnz1nd6oVm7evDJk25Z2wFSstEH8mcA1dzWDCVjUj9e
created with NotebookLM
