byron wallace
@byron_c_wallace
Followers: 1K · Following: 2K · Media: 20 · Statuses: 507
Assoc. Prof @Northeastern in Computer Science. NLP / ML + Health, etc. he/him.
Joined July 2014
3/ 🏥 A separate team at Northeastern identified where certain signals live inside Olmo and made targeted edits that reduced biased clinical predictions. This kind of audit is only possible because Olmo exposes all its components. → https://t.co/YF0EeomXJD
Olmo isn’t just open weights—it’s an open research stack. Try it in the Ai2 Playground: https://t.co/qGd4UW8ALv AMA on Discord: Tues, Oct 28 @ 8:00 AM PT with some of the researchers behind these studies + an Ai2 Olmo teammate. Join: https://t.co/GnxLPhM3MW
Syntax that spuriously correlates with safe domains can jailbreak LLMs -- e.g., below with GPT-4o mini. Our paper (co w/ @VMSuriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight! + @MarzyehGhassemi, @leventsagun, @byron_c_wallace
Just arrived in Montreal for #COLM2025 after a beautiful bus ride through Vermont. Excited to meet folks and talk about interpretability and language processing! 🍁 (you can also come find our paper at the 11am poster session tomorrow!)
Who is going to be at #COLM2025? I want to draw your attention to a COLM paper by my student @sheridan_feucht that has totally changed the way I think and teach about LLM representations. The work is worth knowing. And you can meet Sheridan at COLM, Oct 7!
🔊 New work w/ Silvio Amir & @byron_c_wallace! We show you can distill a model’s mechanism, not just its answers -- teaching a small LM to run the same circuit as a larger teacher model. We call it Circuit Distillation. (1/4)
[📄] Are LLMs mindless token-shifters, or do they build meaningful representations of language? We study how LLMs copy text in-context, and physically separate out two types of induction heads: token heads, which copy literal tokens, and concept heads, which copy word meanings.
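The token-head vs. concept-head distinction can be illustrated with a toy sketch (this is not the paper's code): a pure "token head" does literal induction — find the previous occurrence of the current token and copy whatever followed it — while a "concept head" matches on word meaning instead of surface form. Below, a minimal Python illustration in which a hypothetical synonym table stands in for "meaning":

```python
def token_induction_next(tokens):
    """Literal induction: if the last token appeared earlier,
    predict the token that followed it last time."""
    last = tokens[-1]
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == last:
            return tokens[i + 1]
    return None

# Hypothetical lookup table standing in for "word meaning".
CONCEPT = {"car": "vehicle", "auto": "vehicle", "dog": "animal"}

def concept_induction_next(tokens):
    """Concept-level induction: match on meaning rather than surface form."""
    last = CONCEPT.get(tokens[-1], tokens[-1])
    for i in range(len(tokens) - 2, -1, -1):
        if CONCEPT.get(tokens[i], tokens[i]) == last:
            return tokens[i + 1]
    return None

seq = ["the", "car", "is", "fast", ",", "the", "auto"]
print(token_induction_next(seq))    # no earlier "auto" -> None
print(concept_induction_next(seq))  # "car" matches "auto" by meaning -> "is"
```

The literal copier fails on the paraphrased repeat, while the concept matcher recovers the continuation — the behavioral gap the two head types are separated along.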
"AI slop" seems to be everywhere, but what exactly makes text feel like slop? In our new work (w/ @TuhinChakr, @dgolano, @byron_c_wallace) we provide a systematic attempt at measuring AI slop in text! https://t.co/9bKQceSjkn 🧵 (1/7)
help me fix gpt-4o slop reply with examples of slop behavior just a single sentence nothing crazy what annoys you what makes you wanna frisbee your laptop into a river i'll respond to every comment rt so we can maximize slop feedback help me de-sloptimize our models go
Our new paper asks: what is the goal of “natural language verbalization” interpretability approaches? If a verbalizer is supposed to tell us something about what’s in the target LM and NOT just what’s in the verbalizer LM, how do we actually evaluate that?
Wouldn’t it be great to have questions about LM internals answered in plain English? That’s the promise of verbalization interpretability. Unfortunately, our new paper shows that evaluating these methods is nuanced—and verbalizers might not tell us what we hope they do. 🧵👇1/9
Out of a pressure to publish positive results, medical researchers sometimes spin their findings in published abstracts. But what happens when AI models are trained on that data? To read more on medical bias in today's AI models: https://t.co/BDNPKrRT7y
headed to miami/#EMNLP2024 next week with this work 🌴🌴 come check out the poster (12/11) and let’s talk about templates in text!
There's a general feeling that AI-written text is repetitive. But this repetition goes beyond phrases like "delve into"! We can actually characterize repetition at a structural level using syntactic templates... 🧵 https://t.co/TLQBsSqkId
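As a rough sketch of the idea (not the paper's method), structural repetition can be surfaced by counting repeated part-of-speech n-grams ("templates") across sentences; the POS tags below are supplied by hand so the example stays self-contained:

```python
from collections import Counter

def pos_templates(tagged_sents, n=4):
    """Count POS-tag n-grams ("templates") across sentences.
    Templates recurring across many sentences signal structural,
    not just lexical, repetition."""
    counts = Counter()
    for tags in tagged_sents:
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    return counts

# Hand-tagged toy corpus (DT=determiner, NN=noun, VBZ=verb, JJ=adjective).
corpus = [
    ["DT", "NN", "VBZ", "JJ"],        # e.g. "the paper is great"
    ["DT", "NN", "VBZ", "JJ", "CC"],  # e.g. "the model is fast and ..."
    ["PRP", "VBZ", "DT", "NN"],       # e.g. "it delves the topic"
]
print(pos_templates(corpus).most_common(1)[0])
# (('DT', 'NN', 'VBZ', 'JJ'), 2) -- the same frame reused across sentences
```

In practice one would tag with a real POS tagger; the hypothetical hand tags here just keep the sketch runnable without model downloads.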
How can you tell if text is #AI generated? Researchers at @Northeastern have figured out a new method analyzing sentence structure. Read more:
news.northeastern.edu
Researchers found that AI models tend to structure sentences in very specific, repetitious ways, more often than humans.
Come chat about @somin's intriguing CoT distillation results — put reasoning after labels, permute or keep only a few key CoT tokens — in Miami!
Planning to attend @emnlpmeeting? Come check out our work/poster in Miami on 11/12 🏝️
🚀 New NNsight features launching today! If you’re conducting research on LLM internals, NNsight 0.3 is now available. This update introduces advanced features, offering deeper insights for complex investigations into model behavior. 👇 Here’s what’s new: https://t.co/rVmIZ81upL
I’m very excited to join @Northeastern @KhouryCollege as an assistant professor starting Fall '25!! Looking forward to working with the amazing people there! Until then I'll be a postdoc at @ViennaNLP with Ben Roth, so reach out if you want to meet up while I'm over in Europe ✨
Time to study #llama3 405b, but gosh it's big! Please retweet: if you have a great experiment but not enough GPU, here is an opportunity to apply for shared #NDIF research resources. Deadline July 30: https://t.co/uHN3BxaR6c You'll help @ndif_team test, we'll help you run 405b
ndif.us
NDIF is a research computing project that enables researchers and students to crack open the mysteries inside large-scale AI systems.
Frontier LLMs have capabilities that smaller AIs don't, but up to now there's been no way to crack them open. Now that #Llama3 405b is here, what's the most interesting experiment YOU want to do? 🚀 Apply at https://t.co/Zo3ALEt8td to make it happen and read for details 🧵⬇️
NNsight and NDIF: Democratizing Access to Foundation Model Internals. The enormous scale of state-of-the-art foundation models has limited their accessibility to scientists, because customized experiments at large model sizes require costly hardware and complex engineering.
1/10 Excited to share our #BioNLP at #ACL2024NLP paper "Open (Clinical) LLMs are Sensitive to Instruction Phrasings", paper 📄:
arxiv.org
Instruction-tuned Large Language Models (LLMs) can perform a wide range of tasks given natural language instructions to do so, but they are sensitive to how such instructions are phrased. This...