Try out Warp 2.0 now, the current rank #1 AI on Terminal Bench, outperforming Claude Code: https://go.warp.dev/bycloud
You can also use code “BYCLOUD” to get Warp Pro free for 1 month. (limited to 1,000 redemptions)
My Newsletter
https://mail.bycloud.ai/
My project: find, discover & explain AI research semantically
https://findmypapers.ai/
My Patreon (get bundle access for my newsletter & findmypapers)
https://www.patreon.com/c/bycloud
Training language models to follow instructions with human feedback
[Paper] https://arxiv.org/abs/2203.02155
DeepSeek-R1 (Aha Moment)
[Paper] https://arxiv.org/abs/2501.12948
Understanding R1-Zero-Like Training: A Critical Perspective
[Paper] https://arxiv.org/pdf/2503.20783
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
[Paper] https://arxiv.org/abs/2504.13837
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
[Paper] https://arxiv.org/abs/2505.11711
Spurious Rewards: Rethinking Training Signals in RLVR
[Paper] https://arxiv.org/abs/2506.10947
Try out my new fav place to learn how to code https://scrimba.com/?via=bycloudAI
This video is supported by the kind Patrons & YouTube Members:
🙏Nous Research, Chris LeDoux, Ben Shaener, DX Research Group, Poof N’ Inu, Andrew Lescelius, Deagan, Robert Zawiasa, Ryszard Warzocha, Tobe2d, Louis Muk, Akkusativ, Kevin Tai, Mark Buckler, NO U, Tony Jimenez, Ângelo Fonseca, jiye, Anushka, Asad Dhamani, Binnie Yiu, Calvin Yan, Clayton Ford, Diego Silva, Etrotta, Gonzalo Fidalgo, Handenon, Hector, Jake Disco very, Michael Brenner, Nilly K, OlegWock, Daddy Wen, Shuhong Chen, Sid_Cipher, Stefan Lorenz, Sup, tantan assawade, Thipok Tham, Thomas Di Martino, Thomas Lin, Richárd Nagyfi, Paperboy, mika, Leo, Berhane-Meskel, Kadhai Pesalam, mayssam, Bill Mangrum, nyaa, Toru Mon
[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Business Inquiries] [email protected]
[Profile & Banner Art] https://twitter.com/pygm7
[Video Editor] @Booga04
[Ko-fi] https://ko-fi.com/bycloudai
