Stella Biderman Profile
Stella Biderman

@BlancheMinerva

Followers: 17K · Following: 11K · Media: 624 · Statuses: 13K

Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/her

Joined May 2019
@BlancheMinerva
Stella Biderman
1 month
Two years in the making, we finally have 8 TB of openly licensed data with document-level metadata for authorship attribution, licensing details, links to original copies, and more. Hugely proud of the entire team.
@AiEleuther
EleutherAI
1 month
Can you train a performant language model without using unlicensed text? We are thrilled to announce the Common Pile v0.1, an 8TB dataset of openly licensed and public domain text. We train 7B models for 1T and 2T tokens and match the performance of similar models like LLaMA 1 & 2
18 replies · 67 retweets · 554 likes
@BlancheMinerva
Stella Biderman
7 days
RT @EFF: Section 230 is good, actually. (That’s it. That’s the tweet.)
0 replies · 6 retweets · 0 likes
@BlancheMinerva
Stella Biderman
9 days
@AiEleuther But many orgs – @AiEleuther @llm360 DCLM @allen_ai @CMU_AI @StanfordHAI @mbzuai and more – kept proving that wrong, so they need to keep raising the capital expenditure required to count as "meaningful." And we'll keep meeting it.
0 replies · 0 retweets · 49 likes
@BlancheMinerva
Stella Biderman
9 days
The same thing happened before @AiEleuther started training models. People at many companies kept telling academics and non-profits "oh you'll never be able to train a model like GPT-3," "leave model training to companies, just study the behavior of the things we release."
1 reply · 1 retweet · 31 likes
@BlancheMinerva
Stella Biderman
9 days
Take the LLaMA 3 paper for another example. I know (from personal experience and talking to others) that many authors of this paper endorse the above view. And yet, not a single model in their scaling-law plots is that large! (7B / 1T = 4.2e22 FLOP)
2 replies · 1 retweet · 20 likes
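For readers checking the arithmetic: the 4.2e22 figure matches the standard C ≈ 6ND rule of thumb for dense-transformer training compute (N parameters, D training tokens). A minimal sketch, assuming that approximation is the one intended here:

    # Rule-of-thumb training compute for a dense transformer: C ≈ 6 * N * D.
    # Assumption: this is the approximation behind the 4.2e22 figure above.
    N = 7e9   # parameters (7B)
    D = 1e12  # training tokens (1T)
    C = 6 * N * D
    print(f"{C:.1e} FLOP")  # prints: 4.2e+22 FLOP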
@BlancheMinerva
Stella Biderman
9 days
The following two plots from the GPT-4 paper don't include any actual numbers, but everyone serious in this space knows the # params, # tokens, and architecture of the original GPT-4 because OpenAI leaks like a sieve. These plots don't support the claim.
1 reply · 0 retweets · 33 likes
@BlancheMinerva
Stella Biderman
9 days
It's pretty weird how researchers at hyperscalers keep saying that models smaller than 7B/1T are meaningless and that you need tons of compute to make reliable extrapolations, and then write papers claiming the exact opposite.
6 replies · 12 retweets · 311 likes
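For context on the extrapolations at issue: scaling-law papers fit a power law, typically L(C) = a * C**(-b), to the final losses of smaller training runs and extrapolate it to larger compute budgets. A minimal illustrative sketch of that procedure (all data points hypothetical, not taken from any paper mentioned here):

    # Fit L(C) = a * C**(-b) in log-log space to toy small-run losses,
    # then extrapolate to a larger budget. All numbers are hypothetical.
    import numpy as np

    C = np.array([1e18, 1e19, 1e20, 1e21])  # compute budgets (FLOP)
    L = np.array([3.2, 2.9, 2.65, 2.45])    # final training losses
    slope, intercept = np.polyfit(np.log(C), np.log(L), 1)
    a, b = np.exp(intercept), -slope
    print(f"fit: L(C) = {a:.2f} * C^(-{b:.3f})")
    print(f"extrapolated loss at 4.2e22 FLOP: {a * 4.2e22 ** -b:.2f}")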
@BlancheMinerva
Stella Biderman
15 days
We haven't done the best job promoting it, but the @AiEleuther YouTube channel is a goldmine of AI content.
@AiEleuther
EleutherAI
15 days
If you can't make it, no problem! All of our reading groups and speaker series are uploaded to our YouTube channel. We have over 100 hours of content on topics from ML Scalability and Performance to Functional Analysis, as well as podcasts and interviews featuring our team.
3 replies · 10 retweets · 133 likes
@BlancheMinerva
Stella Biderman
15 days
RT @AiEleuther: We are launching a new speaker series at EleutherAI, focused on promoting recent research by our team and community members…
0 replies · 21 retweets · 0 likes
@BlancheMinerva
Stella Biderman
24 days
LLMs are cool and all, but if you didn't bother to write something, why should I bother to read it? (Yes, this is a subtweet of whatever first popped into your head.)
21 replies · 4 retweets · 156 likes
@BlancheMinerva
Stella Biderman
24 days
Note: I deliberately don't identify the errors in the paper in the OP; I want people to practice critical reading. I may post some in a week or so.
2 replies · 0 retweets · 33 likes
@BlancheMinerva
Stella Biderman
24 days
Someone should probably critically analyze the abilities of models to do scientific work.
@gson_AI
arlo_son
2 months
#NLProc AI Co-Scientists 🤖 can generate ideas, but can they spot mistakes? (Not yet! 🚫) In my recent paper, we introduce SPOT, a dataset of STEM manuscripts (math, materials science, chemistry, physics, etc.), annotated with real errors. SOTA models like o3, gemini-2.5-pro
1 reply · 0 retweets · 41 likes
@BlancheMinerva
Stella Biderman
24 days
A good cautionary lesson on using AIs to write papers: this alleged response to the (dubious) "Illusion of Thinking" paper is full of mathematical errors.
10 replies · 47 retweets · 439 likes
@BlancheMinerva
Stella Biderman
24 days
A bunch of papers suggest that if X and Y are independent tasks, we might expect to see "emergent" behavior on "X and Y" or some task that requires first X and then Y. I'm really surprised I can't find any papers that dig into this; it's usually a side comment. Do you know any?
2 replies · 0 retweets · 13 likes
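One mechanism that would predict exactly that: if a model must independently succeed at X and then at Y, the success probabilities multiply, so smooth per-task improvement produces a much sharper rise on the composite task. A toy sketch of that intuition (numbers invented for illustration; one candidate explanation, not a claim about any specific paper):

    # Toy numbers: accuracies on tasks X and Y improve smoothly with scale,
    # but the composite "X then Y" score (their product, under independence)
    # stays near zero and then climbs steeply, i.e. looks "emergent".
    p_x = [0.1, 0.3, 0.5, 0.7, 0.9]  # accuracy on X across model scales
    p_y = [0.1, 0.3, 0.5, 0.7, 0.9]  # accuracy on Y across model scales
    p_xy = [round(x * y, 2) for x, y in zip(p_x, p_y)]
    print(p_xy)  # [0.01, 0.09, 0.25, 0.49, 0.81]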
@BlancheMinerva
Stella Biderman
25 days
Extremely exciting to see this finally come out. A game-changer for malware detection and analysis.
@EdwardRaffML
Edward Raff
25 days
Led by @rjjoyce8, #EMBER24 has arrived @kdd_news #KDD25, the best, most open, and versatile malware detection benchmark ever! w/ @rjzak @mrphilroth @drhyrum & others, let's try to barely summarize all the new things you can do now! @BoozAllen @CrowdStrike @Cisco 🧵👇
0 replies · 1 retweet · 10 likes
@BlancheMinerva
Stella Biderman
26 days
Come join us!
@LChoshen
Leshem (Legend) Choshen 🤖🤗 @ICML @ACL
26 days
🚀 Technical practitioners & grads — join to build an LLM evaluation hub!
Infra Goals:
🔧 Share evaluation outputs & params
📊 Query results across experiments
Perfect for 🧰 hands-on folks ready to build tools the whole community can use. Join the EvalEval Coalition here 👇
1 reply · 3 retweets · 15 likes
@BlancheMinerva
Stella Biderman
1 month
This is incredibly good science. Read the entire thread, I beg you.
@ryanmart3n
Ryan Marten
1 month
Announcing OpenThinker3-7B, the new SOTA open-data 7B reasoning model: improving over DeepSeek-R1-Distill-Qwen-7B by 33% on average over code, science, and math evals. We also release our dataset, OpenThoughts3-1.2M, which is the best open reasoning dataset across all data
5 replies · 16 retweets · 157 likes
@BlancheMinerva
Stella Biderman
1 month
RT @lschmidt3: Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset m…
0 replies · 213 retweets · 0 likes
@BlancheMinerva
Stella Biderman
1 month
RT @storytracer: Common Pile v0.1 is only the beginning. At @AiEleuther we will publish open datasets on a regular basis from now on, using…
0 replies · 15 retweets · 0 likes
@BlancheMinerva
Stella Biderman
1 month
RT @AiEleuther: What do we mean by "openly licensed" data? Following the lead of orgs like @publicknowledge @Wikimedia @creativecommons we…
0 replies · 2 retweets · 0 likes