Seth Karten

@sethkarten

Followers
1K
Following
7K
Media
94
Statuses
3K

Autonomous Agents | CS PhD @Princeton | Simulation @Waymo | Former @SCSatCMU @Amazon | @NSF GRFP Fellow

Princeton, NJ
Joined October 2012
@sethkarten
Seth Karten
1 month
🚀 New preprint! 🤔 Can one agent “nudge” a synthetic civilization of Census‑grounded agents toward higher social welfare, all by optimizing utilities in‑context? Meet the LLM Economist ↓
2
9
23
@sethkarten
Seth Karten
1 day
RT @chijinML: Our department (ECE) at Princeton is hiring in AI this year!📢 Please consider applying and joining us:
0
16
0
@sethkarten
Seth Karten
4 days
Join our Discord for more info:
discord.com
https://pokeagent.github.io/ | 362 members
0
0
0
@sethkarten
Seth Karten
4 days
🚨 Hackathon Weekend! 🚨 Jumpstart your PokéAgent Challenge submission ahead of NeurIPS!
📅 Sept 13–14
✅ Leaderboards reset Sat 10AM EDT
🎙️ Lightning talks in LLMs, RL, and Pokémon
💬 Live office hours
🏆 $2k in prizes
1
0
8
@sethkarten
Seth Karten
6 days
I don’t know who needs to hear this but qwen is really bad at Pokémon.
4
0
18
@sethkarten
Seth Karten
13 days
RT @emollick: It seems like there is not enough of a policy response to the fact that, with 57M miles of data, Waymo’s autonomous vehicles…
0
691
0
@sethkarten
Seth Karten
14 days
This is amazing. Great for local inference and light training. I’m guessing $35k?
@Mascobot
Marco Mascorro
14 days
🚨 New: We built @a16z's personal GPU AI Workstation Founders Edition.
- 4x NVIDIA RTX 6000 PRO Blackwell Max-Q (384GB total VRAM)
- 8TB of NVMe PCIe 5.0 storage
- AMD Threadripper PRO 7975WX (32 cores, 64 threads)
- 256GB ECC DDR5 RAM
- 1650 watts at peak (runs on a standard
1
0
6
@sethkarten
Seth Karten
22 days
🎓 University students & AI researchers, push your Pokémon AI agents further! The NeurIPS 2025 PokéAgent Challenge is offering compute credits, courtesy of our sponsor Google DeepMind, to help you train bigger models & run more experiments. 📌 To apply: 1️⃣ Make a submission to…
0
5
39
@sethkarten
Seth Karten
23 days
I am very excited to see this deep dive into Gemini Plays Pokémon! This is a great effort and shows the sheer complexity of deploying LLM agents and scaffolding at scale and over long contexts.
@TheCodeOfJoel
Joel Z
23 days
I wrote up the making-of for Gemini Plays Pokémon: how I designed the scaffold so Gemini 2.5 Pro could handle a long-horizon game, what failed, and the lessons that made it work. Full post:
0
1
7
@sethkarten
Seth Karten
1 month
This is exactly the perspective we exploit in the LLM Economist. By grounding personas in intrinsic utility functions, we bound the degree to which personas can go haywire.
@AnthropicAI
Anthropic
1 month
New Anthropic research: Persona vectors. Language models sometimes go haywire and slip into weird and unsettling personas. Why? In a new paper, we find “persona vectors”: neural activity patterns controlling traits like evil, sycophancy, or hallucination.
1
1
4
@sethkarten
Seth Karten
1 month
I vibe-coded a hidden 90s-style easter egg into my website. I used Hugging Face’s AnyCoder to prototype the retro design direction (think neon text and CRT glow). If you can guess the secret code, you can view the final form on my website. Sneak peek below.
4
1
16
@sethkarten
Seth Karten
1 month
Honored by @rohanpaul_ai’s summary of the LLM Economist!
@rohanpaul_ai
Rohan Paul
1 month
The paper builds a small simulated economy with 100 language‑model “workers” and one language‑model “planner”, then lets that planner tweak 7 income‑tax brackets every 128 steps until the society’s average happiness ends up about 90% higher than under the current US code. In
1
0
6
@sethkarten
Seth Karten
1 month
LLM Economist creates optimal tax policy <-> TaxCalcBench does your taxes. AI Tax Civilization: Who is building this?
arxiv.org
We present the LLM Economist, a novel framework that uses agent-based modeling to design and assess economic policies in strategic environments with hierarchical decision-making. At the lower...
@michaelrbock
Michael R. Bock
1 month
1/ Can AI file your taxes? Not yet. We tested the latest frontier models and the results were full of catastrophic errors. Letting AI do your taxes would mean IRS rejections, audits, and penalties:
1
2
6
@sethkarten
Seth Karten
1 month
Pushing the frontier is good, but I like speedrunning. How far could your agent get in 6 hours?
@a1zhang
Alex Zhang
1 month
LM reasoning benchmark idea: have it beat a Hardcore Nuzlocke run of Pokémon Run & Bun or a Kaizo ROM hack! Give it access to search online, use damage calculators, etc. People spend literally hundreds of hours meticulously planning battles, managing their available mons, etc.
0
1
11
@sethkarten
Seth Karten
1 month
Thanks @_akhaliq for the interest in the LLM Economist! For details on synthetic nudging + democratic alignment, check the full thread ↘️
@_akhaliq
AK
1 month
LLM Economist. Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra
0
0
2
@sethkarten
Seth Karten
1 month
Special thanks to my collaborators @WenzheLiTHU @Hanry65960814 Samuel Kleiner @yubai01 @chijinML and to @coop_ai for great feedback at the 2024 summer school.
0
0
2
@sethkarten
Seth Karten
1 month
A sandbox for mechanism design: iterate incentive schemes inside large‑scale simulacra before touching the real world. Thoughts? RT if you think generative agents can design policy. 🔄❤️
1
0
1
@sethkarten
Seth Karten
1 month
Democratic alignment: in a special case, periodic citizen voting can fire the planner. Leader turnover keeps welfare high and prevents policy drift—central nudging plus decentralized oversight in one sandbox.
1
0
1