Seth Karten @sethkarten X Profile

Seth Karten

@sethkarten

Followers

1K

Following

7K

Media

94

Statuses

3K

Autonomous Agents | CS PhD @Princeton | Simulation @Waymo | Former @SCSatCMU @Amazon | @NSF GRFP Fellow

Princeton, NJ

Joined October 2012

Seth Karten

@sethkarten

1 month

🚀 New preprint! 🤔 Can one agent “nudge” a synthetic civilization of Census‑grounded agents toward higher social welfare—all by optimizing utilities in‑context? Meet the LLM Economist ↓

2

9

23

Seth Karten

@sethkarten

1 day

RT @chijinML: Our department (ECE) at Princeton is hiring in AI this year!📢 Please consider applying and joining us:

0

16

0

Seth Karten

@sethkarten

4 days

Join our Discord for more info:

discord.com

https://pokeagent.github.io/ | 362 members

0

Seth Karten

@sethkarten

4 days

🚨 Hackathon Weekend! 🚨 Jumpstart your PokéAgent Challenge submission ahead of NeurIPS!
📅 Sept 13–14
✅ Leaderboards reset Sat 10AM EDT
🎙️ Lightning talks in LLMs, RL, and Pokemon
💬 Live Office hours
🏆 $2k in prizes

1

0

8

Seth Karten

@sethkarten

5 days

RT @sethkarten: 🎓 University students & AI researchers — push your Pokémon AI agents further! The NeurIPS 2025 PokéAgent Challenge is offe…

0

5

0

Seth Karten

@sethkarten

6 days

I don’t know who needs to hear this, but Qwen is really bad at Pokémon.

4

0

18

Seth Karten

@sethkarten

13 days

RT @emollick: It seems like there is not enough of a policy response to the fact that, with 57M miles of data, Waymo’s autonomous vehicles…

0

691

0

Seth Karten

@sethkarten

14 days

This is amazing. Great for local inference and light training. I’m guessing $35k?

Marco Mascorro

@Mascobot

14 days

🚨 New: We built @a16z's personal GPU AI Workstation Founders Edition.
- 4x NVIDIA RTX 6000 PRO Blackwell Max-Q (384GB total VRAM)
- 8TB of NVMe PCIe 5.0 storage
- AMD Threadripper PRO 7975WX (32 cores, 64 threads)
- 256GB ECC DDR5 RAM
- 1650Watts at peak (runs on a standard

1

0

6

Seth Karten

@sethkarten

22 days

🎓 University students & AI researchers — push your Pokémon AI agents further! The NeurIPS 2025 PokéAgent Challenge is offering compute credits, courtesy of our sponsor Google DeepMind, to help you train bigger models & run more experiments.
📌 To apply:
1️⃣ Make a submission to

0

5

39

Seth Karten

@sethkarten

23 days

I am very excited to see this deep dive into Gemini Plays Pokémon! This is a great effort and shows the sheer complexity of deploying LLM agents and scaffolding at scale and over long contexts.

Joel Z

@TheCodeOfJoel

23 days

I wrote up the making-of for Gemini Plays Pokémon: how I designed the scaffold so Gemini 2.5 Pro could handle a long-horizon game, what failed, and the lessons that made it work. Full post:

0

1

7

Seth Karten

@sethkarten

1 month

arxiv.org

We present the LLM Economist, a novel framework that uses agent-based modeling to design and assess economic policies in strategic environments with hierarchical decision-making. At the lower...

0

1

Seth Karten

@sethkarten

1 month

This is exactly a perspective that we exploit in the LLM Economist. By grounding personas in intrinsic utility functions, we bound the degree to which personas can go haywire.

Anthropic

@AnthropicAI

1 month

New Anthropic research: Persona vectors. Language models sometimes go haywire and slip into weird and unsettling personas. Why? In a new paper, we find “persona vectors”—neural activity patterns controlling traits like evil, sycophancy, or hallucination.

1

4

Seth Karten

@sethkarten

1 month

I vibe-coded a hidden 90s-style easter egg into my website. I used Hugging Face’s Anycoder to prototype the retro design direction: think neon text and CRT glow. If you can guess the secret code, you can view the final form on my website. Sneak peek below.

4

1

16

Seth Karten

@sethkarten

1 month

Honored by @rohanpaul_ai’s summary of the LLM Economist!

Rohan Paul

@rohanpaul_ai

1 month

The paper builds a small simulated economy with 100 language‑model “workers” and one language‑model “planner”, then lets that planner tweak 7 income‑tax brackets every 128 steps until the society’s average happiness ends up about 90% higher than under the current US code. In

1

0

6

Seth Karten

@sethkarten

1 month

LLM Economist creates optimal tax policy <-> TaxCalcBench does your taxes.
AI Tax Civilization: Who is building this?

arxiv.org

We present the LLM Economist, a novel framework that uses agent-based modeling to design and assess economic policies in strategic environments with hierarchical decision-making. At the lower...

Michael R. Bock

@michaelrbock

1 month

1/ Can AI file your taxes? Not yet. We tested the latest frontier models and the results were full of catastrophic errors. Letting AI do your taxes would mean IRS rejections, audits, and penalties:

1

2

6

Seth Karten

@sethkarten

1 month

Pushing the frontier is good, but I like speedrunning. How far could your agent get in 6 hours?

Alex Zhang

@a1zhang

1 month

LM reasoning benchmark idea: have it beat a Hardcore Nuzlocke run of Pokémon Run & Bun or a Kaizo ROM hack! Give it access to search online, use damage calculators, etc. People spend literally hundreds of hours meticulously planning battles, managing their available mons, etc.

0

1

11

Seth Karten

@sethkarten

1 month

RT @sethkarten: 🚀 New preprint! 🤔 Can one agent “nudge” a synthetic civilization of Census‑grounded agents toward higher social welfare—al…

0

9

0

Seth Karten

@sethkarten

1 month

Thanks @_akhaliq for the interest in the LLM Economist! For details on synthetic nudging + democratic alignment, check the full thread ↘️

AK

@_akhaliq

1 month

LLM Economist. Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra

0

2

Seth Karten

@sethkarten

1 month

Special thanks to my collaborators @WenzheLiTHU @Hanry65960814 Samuel Kleiner @yubai01 @chijinML and to @coop_ai for great feedback at the 2024 summer school.

0

2

Seth Karten

@sethkarten

1 month

A sandbox for mechanism design: iterate incentive schemes inside large‑scale simulacra before touching the real world. Thoughts? RT if you think generative agents can design policy. 🔄❤️

1

0

1

Seth Karten

@sethkarten

1 month

Democratic alignment: in a special case, periodic citizen voting can fire the planner. Leader turnover keeps welfare high and prevents policy drift—central nudging plus decentralized oversight in one sandbox.

1

0

1