Philippe Laban Profile
Philippe Laban

@PhilippeLaban

Followers: 1K · Following: 3K · Media: 44 · Statuses: 330

Research Scientist @MSFTResearch. NLP/HCI Research.

New York City
Joined April 2022
@rkdsaakyan
Arkadiy Saakyan
4 days
N-gram novelty is widely used as a measure of creativity and generalization. But if LLMs produce highly n-gram novel expressions that don’t make sense or sound awkward, should they still be called creative? In a new paper, we investigate how n-gram novelty relates to creativity.
1
13
42
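For readers unfamiliar with the metric: n-gram novelty is typically the fraction of a text's n-grams that do not appear in a reference corpus. A minimal sketch of that computation (my illustration; the paper's exact definition may differ):

```python
# Minimal sketch of n-gram novelty: the fraction of a text's n-grams
# absent from a reference corpus. Illustrative only; the paper's exact
# definition (tokenization, n, corpus) may differ.

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_novelty(text, corpus_texts, n=4):
    corpus_ngrams = set()
    for doc in corpus_texts:
        corpus_ngrams.update(ngrams(doc.split(), n))
    candidate = ngrams(text.split(), n)
    if not candidate:
        return 0.0
    novel = [g for g in candidate if g not in corpus_ngrams]
    return len(novel) / len(candidate)

corpus = ["the cat sat on the mat", "a dog ran in the park"]
print(ngram_novelty("the cat ran in the fog", corpus, n=3))  # 0.75
```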
@tiancheng_hu
Tiancheng Hu
11 days
Can AI simulate human behavior? 🧠 The promise is revolutionary for science & policy. But there’s a huge "IF": Do these simulations actually reflect reality? To find out, we introduce SimBench: The first large-scale benchmark for group-level social simulation. (1/9)
3
21
52
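Group-level evaluation of this kind usually reduces to comparing the distribution of simulated answers against the distribution of real human answers. A hedged sketch using one common distance (total variation); SimBench's actual metrics may differ:

```python
# Sketch: compare a simulated group's answer distribution to a human
# survey distribution via total variation distance. Illustrative only;
# SimBench's actual evaluation protocol may use different metrics.
from collections import Counter

def answer_distribution(answers, options):
    counts = Counter(answers)
    total = len(answers)
    return {opt: counts[opt] / total for opt in options}

def total_variation(p, q):
    # TV distance: half the L1 distance between two distributions.
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)

options = ["agree", "neutral", "disagree"]
human = answer_distribution(["agree"] * 60 + ["neutral"] * 25 + ["disagree"] * 15, options)
simulated = answer_distribution(["agree"] * 70 + ["neutral"] * 20 + ["disagree"] * 10, options)
print(total_variation(human, simulated))  # 0.0 = perfect match, 1.0 = disjoint
```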
@jennajrussell
Jenna Russell
17 days
AI is already at work in American newsrooms. We examine 186k articles published this summer and find that ~9% are either fully or partially AI-generated, usually without readers having any idea. Here's what we learned about how AI is influencing local and national journalism:
4
52
143
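The headline number is a simple aggregate: classify each article, then report the flagged share. A sketch with a hypothetical `detect` function standing in for whatever detector the study actually used:

```python
# Sketch of the aggregate measurement: run a detector over articles and
# report the share flagged as fully or partially AI-generated.
# `detect` is a hypothetical stand-in for the study's actual classifier.

def detect(article_text):
    # Placeholder: a real detector returns a label such as
    # "human", "partial_ai", or "full_ai".
    return "human"

def ai_share(articles):
    flagged = sum(1 for a in articles if detect(a) in {"partial_ai", "full_ai"})
    return flagged / len(articles)

articles = ["some article text", "another article"]
print(f"{ai_share(articles):.1%} flagged")  # the study reports ~9% of 186k
```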
@TuhinChakr
Tuhin Chakrabarty
18 days
🚨New paper on AI and copyright. Several authors have sued LLM companies for allegedly using their books without permission for model training. 👩‍⚖️Courts, however, require empirical evidence of harm (e.g., market dilution). Our new pre-registered study addresses exactly this.
9
171
524
@alexisjross
Alexis Ross
25 days
Can LLMs reason like a student? 👩🏻‍🎓📚✏️ For educational tools like AI tutors, modeling how students make mistakes is crucial. But current LLMs are much worse at simulating student errors ❌ than performing correct ✅ reasoning. We try to fix that with our method MISTAKE 🤭👇
11
56
336
@alexisjross
Alexis Ross
1 month
New preprint on AI/CS education‼️ We ask what we can learn about both code & coders (students learning to code) by training on their full coding traces. Hint: we get richer models of *diverse student behavior* that are also more *generalizable & controllable*! Thread below ⬇️
@megha_byte
Megha Srivastava
1 month
New preprint on AI + Education! 🍎 “Modeling Student Learning with 3.8M Program Traces” 💻 When students code, their edits tell a story about their reasoning process: exploring, debugging, and tinkering 🧠 What can LMs learn from training on student edit sequences? 📚
1
14
83
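The core idea, as I read it: instead of training only on final programs, train on the sequence of intermediate program states, so the model learns the editing process itself. A hedged sketch of one way such traces could be serialized for training; the preprint's actual format is likely different:

```python
# Sketch: serialize a student's coding trace (successive program
# snapshots) into a single training sequence, so a language model sees
# the editing process rather than only the final program. The actual
# serialization format in the preprint is likely different.

def serialize_trace(snapshots):
    parts = []
    for step, code in enumerate(snapshots):
        parts.append(f"<step {step}>\n{code}")
    return "\n".join(parts)

trace = [
    "def avg(xs): return sum(xs)",            # initial buggy attempt
    "def avg(xs): return sum(xs) / len(xs)",  # student fixes the bug
]
print(serialize_trace(trace))
```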
@alexisjross
Alexis Ross
28 days
One of my takeaways from #COLM2025 was that people are thinking a lot about user simulation (I've been thinking about this myself in the context of tutoring!). Really exciting to see this work on the topic 🤩
@tareknaous
Tarek Naous
29 days
Simulating user–AI conversations helps us understand how LMs work in multi-turn settings. Prompting LMs like GPT-4o to simulate users is common, but their assistant nature makes it hard to replicate user behavior. We introduce User LMs - trained to be users, not assistants.
7
13
110
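The key design choice: train on the same conversation data but make the user turns the prediction targets, instead of the assistant turns. A minimal sketch of that role flip, assuming a standard role-tagged transcript format (not the paper's actual pipeline):

```python
# Sketch of the core idea behind user LMs: take ordinary user/assistant
# transcripts and make the *user* turns the prediction targets, instead
# of the assistant turns. Assumes a standard role-tagged format.

def to_user_lm_example(transcript):
    """Keep full dialogue as context; user turns become targets."""
    context, targets = [], []
    for turn in transcript:
        if turn["role"] == "user":
            targets.append(turn["content"])
        context.append(f'{turn["role"]}: {turn["content"]}')
    return {"context": "\n".join(context), "user_turns": targets}

transcript = [
    {"role": "user", "content": "Help me plan a trip to Montreal."},
    {"role": "assistant", "content": "Sure! When are you traveling?"},
    {"role": "user", "content": "Sometime in October, on a budget."},
]
print(to_user_lm_example(transcript))
```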
@victormustar
Victor M
30 days
Microsoft did something interesting here 👀 “Unlike typical LLMs that are trained to play the role of the "assistant" in conversation, we trained UserLM-8b to simulate the “user” role in conversation” https://t.co/mGgWZBvu7o
huggingface.co
49
180
2K
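For the curious, a hedged sketch of sampling a simulated user turn from the released checkpoint with Hugging Face transformers. I'm assuming the model id is "microsoft/UserLM-8b" and a standard chat template; check the model card on huggingface.co for the exact usage:

```python
# Sketch: sampling a simulated user turn with Hugging Face transformers.
# Assumes the model id "microsoft/UserLM-8b" and a standard chat
# template; consult the model card for the documented usage.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/UserLM-8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Note the flipped roles: the *assistant's* message is the context, and
# the model generates the next *user* message.
messages = [{"role": "assistant", "content": "How can I help you today?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```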
@ssuri
Siddharth (Sid) Suri
1 month
The AI, Interaction, and Learning team at Microsoft Research is looking for interns! If you're working on your PhD in computer science, statistics, economics, computational social science, or related fields, apply: Research Intern MSR AI Interaction and Learning | Microsoft Careers
1
6
49
@kthai1618
Katherine Thai
1 month
As a case study, we built a dataset by applying 9 different Grammarly edits to the same text. According to EditLens, “Fix any mistakes” is the most mild change, while “Make it more detailed” and “Summarize it” are the most invasive. 7/
1
1
9
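Quantifying how "invasive" an edit is can be approximated with a normalized edit distance between the original and edited text. EditLens itself is a learned model, so treat this as a rough baseline for intuition, not their method:

```python
# Rough baseline for "edit magnitude": normalized word-level edit
# similarity between original and edited text. EditLens is a learned
# model, so this is only a crude stand-in for intuition.
import difflib

def edit_magnitude(original, edited):
    a, b = original.split(), edited.split()
    similarity = difflib.SequenceMatcher(None, a, b).ratio()
    return 1.0 - similarity  # 0 = unchanged, 1 = completely rewritten

src = "The cat sat on the mat."
print(edit_magnitude(src, "The cat sat on the mat"))    # mild fix
print(edit_magnitude(src, "A feline rested on a rug"))  # invasive rewrite
```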
@max_spero_
Max Spero
1 month
We can now quantify the magnitude of AI edits in a text! Coming soon to Pangram.
5
6
42
@PhilippeLaban
Philippe Laban
1 month
Come see us at COLM!! More importantly, if you're thinking of doing a PhD, go work with the wonderful Tuhin!
@TuhinChakr
Tuhin Chakrabarty
1 month
I am at @COLM_conf in Montreal. @PhilippeLaban and I will present work on #AISlop and Calibrated Reward Models for Writing. I will also be admitting 1 PhD student next fall at @sbucompsc to work on Human-Centered AI / AI detection / Copyright and Creative Labor. Reach out!!
1
5
34
@TuhinChakr
Tuhin Chakrabarty
7 months
Unlike math/code, writing lacks verifiable rewards, so all we get is slop. To solve this, we train reward models on expert edits that largely beat SOTA #LLMs on a new Writing Quality benchmark. We also reduce #AI slop by using our RMs at test time, boosting alignment with experts.
1
10
39
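Using a reward model "at test time" typically means best-of-n reranking: sample several drafts, score each with the RM, and keep the highest-scoring one. A hedged sketch with hypothetical `generate_drafts` and `reward_score` functions standing in for the actual sampler and trained RM:

```python
# Sketch of test-time use of a reward model: best-of-n reranking.
# `generate_drafts` and `reward_score` are hypothetical stand-ins for
# the real LLM sampler and the trained writing-quality reward model.

def generate_drafts(prompt, n=8):
    return [f"draft {i} for: {prompt}" for i in range(n)]  # placeholder

def reward_score(draft):
    return len(set(draft.split()))  # placeholder scoring function

def best_of_n(prompt, n=8):
    drafts = generate_drafts(prompt, n)
    return max(drafts, key=reward_score)  # keep the highest-reward draft

print(best_of_n("Write an essay on AI and craft."))
```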
@tanyaagoyal
Tanya Goyal
1 month
🚨Modeling Abstention via Selective Help-seeking. LLMs learn to use search tools to answer questions they would otherwise hallucinate on. But can this also teach them what they know vs. what they don't? @momergul_ introduces MASH, which trains LLMs for search and gets abstentions for free!
1
21
36
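The "abstentions for free" idea, as I understand it: a model trained to call a search tool only when it lacks the answer can be read as abstaining whenever it reaches for the tool. A hedged sketch of that inference-time reading, with a hypothetical `model_answer` interface (not MASH's actual training recipe):

```python
# Sketch of abstention-via-help-seeking: if a model trained to search
# only when uncertain emits a search call, interpret that as an
# abstention when no tool is available. `model_answer` is hypothetical.

def model_answer(question):
    # Placeholder: a real model returns either a direct answer or a
    # structured search request like {"action": "search", "query": ...}.
    return {"action": "search", "query": question}

def answer_or_abstain(question, search_available=False):
    out = model_answer(question)
    if out.get("action") == "search" and not search_available:
        return "ABSTAIN"  # the search call doubles as an "I don't know"
    return out

print(answer_or_abstain("Who won the 1997 Bandy World Championship?"))
```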
@max_spero_
Max Spero
2 months
Good news, @emollick! We finally got an independent study of FPR. @alexolegimas and @brian_jabarian studied Pangram alongside other AI detectors and found that Pangram had zero false positives (at a threshold of 0.5) among their dataset of 7,968 human writing samples.
@emollick
Ethan Mollick
6 months
Getting lots of replies pushing Pangram Labs here. They claim very low false positive rates on their website. I remain doubtful without independent assessment of false positives (this study was not meant to do that), & concerned that these detectors are used adversarially.
2
5
41
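For concreteness, the false positive rate here is just the share of human-written samples a detector scores above the decision threshold. A sketch of that arithmetic, using the 0.5 threshold from the tweet (the scores below are invented for illustration; the study used 7,968 human samples):

```python
# Sketch of the FPR computation behind the claim: among human-written
# samples, count those a detector scores above the decision threshold.
# Scores below are illustrative, not from the study.

def false_positive_rate(scores, threshold=0.5):
    flagged = sum(1 for s in scores if s >= threshold)
    return flagged / len(scores)

human_scores = [0.02, 0.10, 0.47, 0.31, 0.05]  # detector scores on human text
print(false_positive_rate(human_scores, threshold=0.5))  # 0.0 here
```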
@brian_jabarian
Brian Jabarian
2 months
AI-generated text is everywhere, making it hard for orgs to assess human performance. Can we detect it while minimizing false accusations? Yes! With @alexolegimas, we audit detectors and show incredible accuracy: ~0 (!!) false positives & negatives; and we offer a policy framework for evaluating trade-offs. 🧵
20
78
330
@yoonjoo_le2
Yoonjoo Lee @ COLM 2025
2 months
🎓I officially defended my PhD! Huge thanks to my amazing advisor @juhokim and committee @eytanadar @aliceoh @tongshuangwu @seo_minjoon. This fall, I'm excited to join @UMichCSE as a postdoc with @QVeraLiao to continue my research in human-centered AI and cognitive alignment!💙
48
11
325
@tae_skim
Tae Soo Kim
3 months
You ask ChatGPT to write a message for: 💼 Credit-stealing colleague → “Building on *my* idea…” 🏠 Messy roommate → “babe wake up new mold just dropped” Same you. Same task. Different context. Can LLMs learn this? 🤔 We built CUPID 🏹 to find out. 🔗 https://t.co/JrAdy2SbSV
4
23
68
@allen_ai
Ai2
3 months
With fresh support of $75M from @NSF and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡
36
79
751