Alexandros
@alexk_z
Followers 2K · Following 26K · Media 95 · Statuses 3K
ML AI RL & Snowboarding
CH
Joined September 2012
Ahahahaha, the James Webb Space Telescope continues to deliver massive L’s for astrophysics. A new paper shows that the “Cosmic Microwave Background Radiation” can be explained entirely by the energy of recently discovered Early Mature Galaxies — massive galaxies that the JWST
987 replies · 2K reposts · 19K likes
Introducing Reinforcement-Learned Teachers (RLTs): Transforming how we teach LLMs to reason with reinforcement learning (RL). Blog: https://t.co/mbfAzlvGY8 Paper: https://t.co/UN4p5dUWlU Traditional RL focuses on “learning to solve” challenging problems with expensive LLMs and
25 replies · 246 reposts · 1K likes
Could a major opportunity to improve representation in deep learning be hiding in plain sight? Check out our new position paper: Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis. The idea stems from a little-known
49 replies · 160 reposts · 998 likes
Tim Rocktäschel’s keynote talk at #ICLR2025 about Open-Endedness and AI. “Almost no prerequisite to any major invention was invented with that invention in mind.” “Basically almost everybody in my lab at UCL and at DeepMind have read this book: Why Greatness Cannot Be Planned.”
11 replies · 83 reposts · 530 likes
This paper is pretty cool: The Belief State Transformer Very simple technique and fast to train, makes transformers (or other seq models) better at modelling state and can additionally condition on the end! I wonder what this is like for RL, we might condition on high end reward!
16 replies · 96 reposts · 637 likes
I am pleased to share the full set of video lectures, slides, textbook, and other supporting material of the 7th offering of my Reinforcement Learning class at ASU, which was completed two days ago; check
16 replies · 235 reposts · 1K likes
📢LLM and RL folks! 📢 No good RL algorithm for credit assignment for multi-turn LLM agents on reasoning-heavy tasks? Do not even have a good benchmark for studying it? In SWEET-RL, we give you both (a vibe coding benchmark and SWEET algorithm). A thread 🧵(1/n)
3 replies · 81 reposts · 376 likes
"Let us leave the children of Mohammed to finish off the children of Robespierre" (Germanos of Old Patras, 1820): How were many heroes of 1821 treated by the local bourgeois, kodjabashis, high-ranking clergy, and others who ultimately took the Ottomans' place in exploiting the people? The
11 replies · 221 reposts · 613 likes
☄️ GRPO now scales to 70B+ models with multi-node training and super-fast performance. Install the latest v0.16 of TRL: pip install trl With all the freshest features and optimizations we've added, you can train up to 60 times faster! More details in the
16 replies · 94 reposts · 741 likes
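The GRPO workflow in TRL is driven by user-supplied reward functions. A minimal sketch of one, assuming TRL's convention that a reward function receives the batch of prompts and completions (plus extra columns as keyword arguments) and returns one float per completion; the function name and scoring heuristic here are hypothetical:

```python
# Hypothetical reward function in the shape TRL's GRPOTrainer accepts:
# it receives the batch and returns one score per completion.
def boxed_answer_reward(prompts, completions, **kwargs):
    # Toy heuristic: reward completions that state a final boxed answer.
    return [1.0 if "\\boxed{" in c else 0.0 for c in completions]

scores = boxed_answer_reward(
    prompts=["What is 2+2?"] * 2,
    completions=["The answer is \\boxed{4}.", "Hmm, not sure."],
)
# scores → [1.0, 0.0]
```

In GRPO the absolute scale of these scores matters less than their ranking, since advantages are normalized within each group of sampled completions.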
Introducing Mistral Small 3.1. Multimodal, Apache 2.0, outperforms Gemma 3 and GPT 4o-mini. https://t.co/BHLAAaKZ9w
269 replies · 1K reposts · 8K likes
The applause of the New Democracy MPs, comfortably settled in their seats, at the rejection of the no-confidence motion sealed their complicity in all the scandalous acts that the Prime Minister and their Ministers have been organizing, orchestrating, and committing from the very first moment. Not one of
588 replies · 2K reposts · 8K likes
While our rendezvous in the streets on #28_Φλεβαρη for #Τεμπη_συγκαλυψη is just a few hours away, and the #ΕΟΔΑΣΑΑΜ report on #τεμπη_εγκλημα spoke of 2.5 tonnes of a flammable "unknown" substance and of inconceivable, criminal omissions (#Justice_for_Tempi), with the… 1/36
3 replies · 22 reposts · 80 likes
I think the community is excited about DeepSeek v3 not because it's yet another powerful model but because it's a story of human ingenuity in the face of constraints. Despite all the restrictions due to export control and limited budget, the humans of DeepSeek have created a
23 replies · 184 reposts · 2K likes
Training LLMs to Reason in a Continuous Latent Space Meta presents Coconut (Chain of Continuous Thought), a novel paradigm that enables LLMs to reason in continuous latent space rather than natural language. Coconut takes the last hidden state of the LLM as the reasoning state
17 replies · 91 reposts · 425 likes
Introducing An Evolved Universal Transformer Memory https://t.co/IKt4l4YPvA Neural Attention Memory Models (NAMMs) are a new kind of neural memory system for Transformers that not only boost their performance and efficiency but are also transferable to other foundation models,
8 replies · 118 reposts · 472 likes
Spent the weekend hacking together Exa embeddings over 4500 NeurIPS 2024 papers - https://t.co/gazgno2hfk
Lets you:
- do otherwise impossible searches ("transformer architectures inspired by neuroscience")
- explore a 2D t-SNE plot
- chat with Claude about multiple papers
28 replies · 79 reposts · 670 likes
📢 Attending #NeurIPS2024 ? Come by our workshop on open-world agents! everything 👉 https://t.co/oTQloQSMkx Put your questions for the panel here: https://t.co/purWnsCntW Our speakers & panelists lining up: @mengjiao_yang @taoyds @xiao_ted @natashajaques @jiajunwu_cs
2 replies · 12 reposts · 56 likes
@NeurIPSConf 2024 reimagined with AI !! - summaries for instant insights 🧠 - easy-to-understand audio podcasts 🎙️ - quick links to NeurIPS Proc., @huggingface & more 🌐 - Full papers, topic & affiliation filters 📂 All your research needs, in one hub. Dive in now! 👇
6 replies · 56 reposts · 162 likes