Luke Zettlemoyer
@LukeZettlemoyer
Followers 10K · Following 6K · Media 1 · Statuses 2K
Joined September 2015
Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.
27 replies · 153 reposts · 788 likes
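The "hybrid SSM Mixture of Experts" design above gets its inference-efficiency claim from sparse expert routing: only a few experts run per token. A minimal, illustrative top-k routing sketch in NumPy — the shapes, the top-2 choice, and the softmax-over-selected-experts gating are my assumptions for illustration, not Nemotron's actual design:

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Minimal top-k Mixture-of-Experts routing (illustrative only).

    x: (d,) token hidden state
    gate_w: (d, n_experts) router weights
    experts: list of callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w                    # router score per expert
    topk = np.argsort(logits)[-k:]         # keep the k highest-scoring experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only k of the n experts execute for this token -- that sparsity is
    # where the inference-efficiency win of MoE comes from.
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n = 8, 4
# Toy "experts": random linear maps, frozen at creation time via default args.
experts = [lambda x, W=rng.normal(size=(d, d)): x @ W for _ in range(n)]
y = moe_layer(rng.normal(size=d), rng.normal(size=(d, n)), experts, k=2)
print(y.shape)  # (8,)
```

Per token, only 2 of the 4 toy experts run; a real model adds load balancing and batched routing on top of this core idea.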
We are releasing Bolmo today! Bolmo is the best byte-level model so far. It comes close to and sometimes surpasses Olmo 3. Bolmo also performs competitively in terms of speed & is fully open. I was skeptical of byte-level models for a long time but I finally switched camps🧵
Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵
6 replies · 15 reposts · 68 likes
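The "byteifying" idea above can be illustrated without any model at all: a byte-level LM reads raw UTF-8 bytes, so its vocabulary is fixed at 256 symbols with no learned tokenizer and no out-of-vocabulary tokens. A minimal sketch (the example string and the trade-off note are mine, not from the Bolmo release):

```python
# Byte-level "tokenization": every string maps to a sequence over a fixed
# 256-symbol vocabulary -- no learned tokenizer, no out-of-vocabulary tokens.
text = "Olmo → Bolmo"
byte_ids = list(text.encode("utf-8"))
print(byte_ids)

# Round-trip is exact, including non-ASCII characters:
assert bytes(byte_ids).decode("utf-8") == text

# The trade-off: sequences get longer than with subwords (here 14 bytes
# for 12 characters, because "→" takes 3 bytes in UTF-8), which is why
# inference speed is the usual objection to byte-level models.
print(len(text), len(byte_ids))
```

Matching subword models on accuracy while staying competitive on speed, as the tweet claims, means absorbing that length overhead in the architecture.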
🚀 Olmo 3.1 is here — earlier than expected! 32B Think: 3 extra weeks of RL training = steady gains and significant improvements. 32B Instruct: Our 7B recipe scaled to 32B, tuned for short chat + function calling. Olmo 3 keeps leveling up! Details in the latest version of the …
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
13 replies · 43 reposts · 426 likes
It's finally here: the public (and most complete) version of my talk covering every stage of building Olmo 3 Think. It covers changes and new considerations across every layer of the stack, from pretraining and evaluation to, of course, post-training.
12 replies · 81 reposts · 715 likes
My new blog post discusses the physical reality of computation and why this means we will not see AGI or any meaningful superintelligence:
timdettmers.com
If you are reading this, you probably have strong opinions about AGI, superintelligence, and the future of AI. Maybe you believe we are on the cusp of a transformative breakthrough. Maybe you are...
164 replies · 172 reposts · 1K likes
📢 New Paper 📢 Self-Improving VLM Judges Without Human Annotations
Reward models & judges are critical for evaluating output quality and alignment with human preferences for VLM training. Current training approaches typically rely on:
💸 Costly human preference annotations
🔒 …
2 replies · 20 reposts · 72 likes
Today we’re open-sourcing a preview of our two new models in the Isaac family: hybrid-reasoning 2B and 1B-parameter best-in-class vision-language models. Weights → https://t.co/1WgHMDfCST Blog → https://t.co/8MOLPKpUhO Demo → https://t.co/sAKt5dnZ6U
3 replies · 14 reposts · 53 likes
Can we simplify video generation by decomposing it into interleaved text-video co-generation? Would explicit, repeated thinking in language improve generation in pixels? We introduce TV2TV: a unified model that jointly learns
- language modeling (next-token prediction)
- video …
4 replies · 37 reposts · 86 likes
😂 Research is often very serious… but it doesn’t always have to be! If you are interested in computational humor, or just need a fun break from your current work, join us for MWAHAHA ⬇️ Tasks include both text and multimodal humor generation, in multiple languages
pln-fing-udelar.github.io
We want you to compete in making the funniest computer program.
1 reply · 8 reposts · 13 likes
I will be at #NeurIPS2025 12.3–12.7. Looking forward to meeting old and new friends! ☕️🌮 Recently I've been working on hallucination (Binary RAR) and verbatim memorization (ParaPO), issues that scaling up pretraining cannot simply fix. Also interested in making models learn more like …
1 reply · 5 reposts · 36 likes
Thrilled to share that @annadgoldie and I are launching @RicursiveAI, a frontier lab enabling recursive self-improvement through AIs that design their own chips. Our vision for transforming chip design began with AlphaChip, an AI for layout optimization used to design four …
wsj.com
Founded by ex-Google researchers, the company raised $35 million with backing from Sequoia to automate chip design.
Introducing Ricursive Intelligence, a frontier AI lab enabling a recursive self-improvement loop between AI and the chips that fuel it. Learn more at https://t.co/cSpbrQwwEn
123 replies · 136 reposts · 1K likes
I’m thrilled to announce that I’m launching a new startup dedicated to patient-centric AI for drug discovery, and we’re hiring Founding AI Engineers who are passionate about advancing healthcare through cutting-edge AI. Apply here by Jan 10:
2 replies · 34 reposts · 359 likes
🚀 Excited to share ToolOrchestra, an end-to-end RL training framework for orchestrating tools and agentic workflows. Everyone’s building agent workflows these days — connecting tools, APIs, and LLMs like LEGO. 🧩 But here are our findings: 👉 Just prompting the agent workflow …
24 replies · 68 reposts · 314 likes
Attending @NeurIPSConf and interested in distributed, modular, and/or open AI? I hadn't seen anyone put together a list of poster presentations in this area, so I took it upon myself to thread out who I'm excited to talk to next week🧵
5 replies · 5 reposts · 48 likes
Life update: I moved to Silicon Valley to tackle agents' biggest challenges: plasticity and reliability. Today's agents are smart but brittle. They lack plasticity (continual learning and adaptation) and reliability (stable, predictable behavior with bounded failures). These two …
40 replies · 43 reposts · 421 likes
First time I'm this happy seeing negative results in a paper😆 moo moo rawrrrrrr!!!! The negative signal from training on spurious rewards shows successful decontamination of our training data. Things we can do only thanks to fully open data + model + training + eval✨
Because Olmo 3 is fully open, we decontaminate our evals from our pretraining and midtraining data. @StellaLisy proves this with spurious rewards: RL trained on a random reward signal can't improve on the evals, unlike some previous setups
2 replies · 10 reposts · 149 likes
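The decontamination check described in the quoted tweet can be sketched as follows. This is my illustrative reconstruction of the "spurious rewards" logic, not the paper's actual code, and the benchmark scores are placeholder numbers:

```python
import random

random.seed(0)

def spurious_reward(prompt: str, response: str) -> float:
    # Coin-flip reward, deliberately independent of the response's
    # correctness -- the "spurious rewards" setup.
    return float(random.random() < 0.5)

# The check, in outline:
# 1. RL-train the model against spurious_reward.
# 2. Re-run the benchmark.
# Any score gain cannot have come from the reward signal, so a gain
# suggests the benchmark answers leaked into the training data.
before = 0.42   # benchmark score pre-RL (placeholder)
after = 0.42    # score after RL on random rewards (placeholder)
contamination_suspected = after > before + 0.01   # tolerance for run noise
print(contamination_suspected)  # False -> consistent with clean data
```

On a contaminated setup, RL against this useless signal can still surface memorized answers and lift the score; flat scores are the expected outcome when the evals were properly scrubbed from pretraining and midtraining data, as the tweet reports for Olmo 3.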
One key to making a 🔥 LM: ☢️🧼remove benchmark contamination 📊🤔then make the right development decisions by not overestimating performance!
Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵
1 reply · 4 reposts · 33 likes
Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵
55 replies · 332 reposts · 2K likes
🔥Thrilled to introduce DR Tulu-8B, an open long-form Deep Research model that matches OpenAI DR 💪Yes, just 8B! 🚀 The secret? We present Reinforcement Learning with Evolving Rubrics (RLER) for long-form non-verifiable DR tasks! Our rubrics:
- co-evolve with the policy model
- …
8 replies · 115 reposts · 542 likes
OpenAI's blog ( https://t.co/Mu05PFfPXg) points out that today’s language models hallucinate because training and evaluation reward guessing instead of admitting uncertainty. This raises a natural question: can we reduce hallucination without hurting utility?🤔 On-policy RL with …
26 replies · 123 reposts · 674 likes
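The incentive problem in that tweet reduces to simple expected-value arithmetic: under accuracy-only grading, any guess with a nonzero chance of being right beats abstaining, which scores 0. A small sketch (the penalty parameter is my illustrative addition, not OpenAI's scoring rule):

```python
def expected_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected benchmark score for answering, vs. 0 for saying 'I don't know'."""
    return p_correct * 1.0 + (1 - p_correct) * (-wrong_penalty)

p = 0.2  # model is only 20% sure of its guess

# Accuracy-only grading: guessing yields 0.2 in expectation, abstaining 0.0,
# so the graded-exam incentive is always to guess -- i.e., to hallucinate.
print(expected_score(p))                      # 0.2

# Penalizing confident errors flips the incentive: with a penalty of 1,
# guessing at 20% confidence now has negative expected score, so
# admitting uncertainty becomes the rational policy.
print(expected_score(p, wrong_penalty=1.0))   # -0.6
```

Training or evaluation schemes that reward calibrated abstention change this arithmetic, which is the lever the tweet's question about reducing hallucination without hurting utility points at.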