Sebastian Gehrmann

@sebgehr

Followers
6K
Following
5K
Media
119
Statuses
2K

Head of Responsible AI, CTO office, @Bloomberg. (he/him) Formerly LLMs @ Google Brain / PhD @ Harvard. views my own

New York City
Joined November 2013
@sebgehr
Sebastian Gehrmann
5 years
Introducing 💎GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. We are organizing shared tasks for our ACL 2021 workshop - Please consider participating! Website: https://t.co/TAs4F40mga Paper: https://t.co/VWdcdNv6iu #NLProc 🧵1/X
3
120
328
@sebgehr
Sebastian Gehrmann
5 days
We are looking for Ph.D. fellows! Bloomberg will fund research and mentor outstanding graduate students across a wide range of Computer Science and AI topics. If you want to be considered, please apply by December 14. Link with information and application details below 👇
6
94
379
@sebgehr
Sebastian Gehrmann
7 days
We are looking for excellent PhD students across many topics for our Bloomberg CTO AI Research internship next summer. Link to apply below.
6
27
202
@adinamwilliams
Adina Williams
7 days
FAIR is hiring interns for 2026! If you're interested in a stint doing fundamental AI research with us @AIatMeta, interested students enrolled in a PhD program can apply below👇: https://t.co/PrG9L625bY
metacareers.com
Meta's mission is to build the future of human connection and the technology that makes it possible.
16
46
434
@sebgehr
Sebastian Gehrmann
19 days
To my friends at Meta impacted by the layoffs, we are hiring in London, NYC, and Toronto. Link with jobs and application info below 👇🏼
7
13
190
@sebgehr
Sebastian Gehrmann
27 days
Hey @iclr_conf what is happening with review assignments? I got 5 instead of 3 assigned and most of them were papers I specifically excluded during bidding. I have no business reviewing bio papers which is all I got!
0
0
12
@sebgehr
Sebastian Gehrmann
1 month
That's why my opinion paper provides an extensive survey of areas in which these two research areas intersect and should learn from one another. Let evals help build more reliable and trustworthy AI.
0
0
1
@sebgehr
Sebastian Gehrmann
1 month
Moreover, this divide is particularly prominent in academic research while the similarities are well-understood within teams developing LLMs. Bridging this division will be crucial to advance models in the open.
1
0
2
@sebgehr
Sebastian Gehrmann
1 month
This is where the "trench coat" comes in. I argue that reward models are just a type of learned metric. This means that AI alignment may overlook decades of lessons from the world of evaluation metrics.
1
0
2
@sebgehr
Sebastian Gehrmann
1 month
So why do these fields operate in parallel worlds? A citation analysis finds a clear disciplinary divide. The two communities build on different lines of work, publish in different venues, and rarely engage with one another.
1
0
1
@sebgehr
Sebastian Gehrmann
1 month
For decades, fields like machine translation have developed automatic metrics to evaluate AI-generated text. More recently, reward models trained on human preferences have become the standard for aligning large language models. Link to paper:
arxiv.org
The emergence of reinforcement learning in post-training of large language models has sparked significant interest in reward models. Reward models assess the quality of sampled model outputs to...
1
0
4
@sebgehr
Sebastian Gehrmann
1 month
Why does research on evaluation metrics and reward models rarely inform each other? In my new paper, "Reward Models are Metrics in a Trench Coat," I discuss how we are missing a big opportunity by keeping them separate.
7
9
85
@sebgehr
Sebastian Gehrmann
1 month
I guess I was a bit early. Let's try this one again :D
@sebgehr
Sebastian Gehrmann
2 months
If Adam's going to drop out he's way too late. Transformer models used Adam with dropout 8 years ago.
0
0
2
@sebgehr
Sebastian Gehrmann
2 months
This was a fun talk. As always, the conclusion is - Evaluate your AI systems in the context they are deployed in
@TechAtBloomberg
Tech At Bloomberg
2 months
To open today's 11th Annual Bloomberg-Columbia #MachineLearning in Finance Conference, @Bloomberg's Head of #ResponsibleAI, @sebgehr, is exploring what it means for an #AI system to be safe in the context of financial services https://t.co/HmMeijZgYd #AI #ML #GenAI #MLinFinance
1
0
8
@sebgehr
Sebastian Gehrmann
2 months
Why is my X full of people saying evals are dead? Do they just not know the latency, the NPS, or any feedback about their product? How do they make changes? Serious question, kinda confused
0
0
3
@sebgehr
Sebastian Gehrmann
3 months
Don't you love hallucinations in product announcements... Based on passengers, CLT, MCO, MIA, and PHX are all busier than SEA + SFO.
@satyanadella
Satya Nadella
3 months
I just love this. The new =COPILOT() function in Excel lets you analyze, generate content, and brainstorm directly in the grid.
1
2
8
@sebgehr
Sebastian Gehrmann
3 months
AOL-Time-Warner-Pepsico-Viacom-Halliburton-Skynet-Toyota-Trader-Joe's
@aidangomez
Aidan Gomez
3 months
Cohere intends to acquire Perplexity immediately after their acquisitions of TikTok and Google Chrome. We will continue to monitor the progress of those deals closely so we can submit our term sheet upon completion.
1
0
3