byron wallace
@byron_c_wallace
Followers: 1K · Following: 2K · Media: 20 · Statuses: 507
Assoc. Prof @Northeastern in Computer Science. NLP / ML + Health, etc. he/him.
Joined July 2014
3/ 🏥 A separate team at Northeastern identified where certain signals live inside Olmo and made targeted edits that reduced biased clinical predictions. This kind of audit is only possible because Olmo exposes all its components. → https://t.co/YF0EeomXJD
Olmo isn’t just open weights—it’s an open research stack. Try it in the Ai2 Playground: https://t.co/qGd4UW8ALv AMA on Discord: Tues, Oct 28 @ 8:00 AM PT with some of the researchers behind these studies + an Ai2 Olmo teammate. Join: https://t.co/GnxLPhM3MW
Syntax that spuriously correlates with safe domains can jailbreak LLMs -- e.g., below with GPT-4o mini. Our paper (co w/ @VMSuriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight! + @MarzyehGhassemi, @leventsagun, @byron_c_wallace
Just arrived in Montreal for #COLM2025 after a beautiful bus ride through Vermont. Excited to meet folks and talk about interpretability and language processing! 🍁 (you can also come find our paper at the 11am poster session tomorrow!)
Who is going to be at #COLM2025? I want to draw your attention to a COLM paper by my student @sheridan_feucht that has totally changed the way I think and teach about LLM representations. The work is worth knowing. And you can meet Sheridan at COLM, Oct 7!
🔊 New work w/ Silvio Amir & @byron_c_wallace! We show you can distill a model’s mechanism, not just its answers -- teaching a small LM to run the same circuit as a larger teacher model. We call it Circuit Distillation. (1/4)
[📄] Are LLMs mindless token-shifters, or do they build meaningful representations of language? We study how LLMs copy text in-context, and physically separate out two types of induction heads: token heads, which copy literal tokens, and concept heads, which copy word meanings.
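The token-head vs. concept-head distinction can be illustrated with a toy sketch (this is not the paper's code): a pure "token head" does literal induction — find the previous occurrence of the current token and copy whatever followed it — while a "concept head" matches on word meaning instead of surface form. Below, a minimal Python illustration in which a hypothetical synonym table stands in for "meaning":

```python
def token_induction_next(tokens):
    """Literal induction: if the last token appeared earlier,
    predict the token that followed it last time."""
    last = tokens[-1]
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == last:
            return tokens[i + 1]
    return None

# Hypothetical lookup table standing in for "word meaning".
CONCEPT = {"car": "vehicle", "auto": "vehicle", "dog": "animal"}

def concept_induction_next(tokens):
    """Concept-level induction: match on meaning rather than surface form."""
    last = CONCEPT.get(tokens[-1], tokens[-1])
    for i in range(len(tokens) - 2, -1, -1):
        if CONCEPT.get(tokens[i], tokens[i]) == last:
            return tokens[i + 1]
    return None

seq = ["the", "car", "is", "fast", ",", "the", "auto"]
print(token_induction_next(seq))    # no earlier "auto" -> None
print(concept_induction_next(seq))  # "car" matches "auto" by meaning -> "is"
```

The literal copier fails on the paraphrased repeat, while the concept matcher recovers the continuation — the behavioral gap the two head types are separated along.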
"AI slop" seems to be everywhere, but what exactly makes text feel like slop? In our new work (w/ @TuhinChakr, @dgolano, @byron_c_wallace) we provide a systematic attempt at measuring AI slop in text! https://t.co/9bKQceSjkn 🧵 (1/7)
help me fix gpt-4o slop reply with examples of slop behavior just a single sentence nothing crazy what annoys you what makes you wanna frisbee your laptop into a river i'll respond to every comment rt so we can maximize slop feedback help me de-sloptimize our models go
Our new paper asks: what is the goal of “natural language verbalization” interpretability approaches? If a verbalizer is supposed to tell us something about what’s in the target LM and NOT just what’s in the verbalizer LM, how do we actually evaluate that?
Wouldn’t it be great to have questions about LM internals answered in plain English? That’s the promise of verbalization interpretability. Unfortunately, our new paper shows that evaluating these methods is nuanced—and verbalizers might not tell us what we hope they do. 🧵👇1/9
Out of a pressure to publish positive results, medical researchers sometimes spin their findings in published abstracts. But what happens when AI models are trained on that data? To read more on medical bias in today's AI models: https://t.co/BDNPKrRT7y
headed to miami/#EMNLP2024 next week with this work 🌴🌴 come check out the poster (12/11) and let’s talk about templates in text!
There's a general feeling that AI-written text is repetitive. But this repetition goes beyond phrases like "delve into"! We can actually characterize repetition at a structural level using syntactic templates... 🧵 https://t.co/TLQBsSqkId
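As a rough sketch of the idea (not the paper's method), structural repetition can be surfaced by counting repeated part-of-speech n-grams ("templates") across sentences; the POS tags below are supplied by hand so the example stays self-contained:

```python
from collections import Counter

def pos_templates(tagged_sents, n=4):
    """Count POS-tag n-grams ("templates") across sentences.
    Templates recurring across many sentences signal structural,
    not just lexical, repetition."""
    counts = Counter()
    for tags in tagged_sents:
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    return counts

# Hand-tagged toy corpus (DT=determiner, NN=noun, VBZ=verb, JJ=adjective).
corpus = [
    ["DT", "NN", "VBZ", "JJ"],        # e.g. "the paper is great"
    ["DT", "NN", "VBZ", "JJ", "CC"],  # e.g. "the model is fast and ..."
    ["PRP", "VBZ", "DT", "NN"],       # e.g. "it delves the topic"
]
print(pos_templates(corpus).most_common(1)[0])
# (('DT', 'NN', 'VBZ', 'JJ'), 2) -- the same frame reused across sentences
```

In practice one would tag with a real POS tagger; the hypothetical hand tags here just keep the sketch runnable without model downloads.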
How can you tell if text is #AI generated? Researchers at @Northeastern have figured out a new method analyzing sentence structure. Read more:
news.northeastern.edu
Researchers found that AI models tend to structure sentences in very specific, repetitious ways, more often than humans.
Come chat about @somin's intriguing CoT distillation results — put reasoning after labels, permute or keep only a few key CoT tokens — in Miami!
Planning to attend @emnlpmeeting? Come check out our work/poster in Miami on 11/12 🏝️
🚀 New NNsight features launching today! If you’re conducting research on LLM internals, NNsight 0.3 is now available. This update introduces advanced features, offering deeper insights for complex investigations into model behavior. 👇 Here’s what’s new: https://t.co/rVmIZ81upL
I’m very excited to join @Northeastern @KhouryCollege as an assistant professor starting Fall '25!! Looking forward to working with the amazing people there! Until then I'll be a postdoc at @ViennaNLP with Ben Roth, so reach out if you want to meet up while I'm over in Europe ✨
Time to study #llama3 405b, but gosh it's big! Please retweet: if you have a great experiment but not enough GPU, here is an opportunity to apply for shared #NDIF research resources. Deadline July 30: https://t.co/uHN3BxaR6c You'll help @ndif_team test, we'll help you run 405b
ndif.us
NDIF is a research computing project that enables researchers and students to crack open the mysteries inside large-scale AI systems.
Frontier LLMs have capabilities that smaller AIs don't, but up to now there's been no way to crack them open. Now that #Llama3 405b is here, what's the most interesting experiment YOU want to do? 🚀 Apply at https://t.co/Zo3ALEt8td to make it happen and read for details 🧵⬇️
NNsight and NDIF: Democratizing Access to Foundation Model Internals. The enormous scale of state-of-the-art foundation models has limited their accessibility to scientists, because customized experiments at large model sizes require costly hardware and complex engineering.
1/10 Excited to share our #BioNLP at #ACL2024NLP paper "Open (Clinical) LLMs are Sensitive to Instruction Phrasings", paper 📄:
arxiv.org
Instruction-tuned Large Language Models (LLMs) can perform a wide range of tasks given natural language instructions to do so, but they are sensitive to how such instructions are phrased. This...