Daphne Cornelisse
@daphne_cor
Ph.D. student @nyuniversity • Building human-like agents 🦋 https://t.co/BhKiCuu03w
NYC
Joined September 2017
Rapid RL experimentation is great. But how do you catch silent errors before they slip by? 🐛🪲🐞 In this post, I share tools and habits that have helped me move quickly from idea to result without sacrificing reliability. Link below
The difference with this new type of software (AI / RL systems) is that we specify the objective rather than the procedure, creating a much larger solution space that can include undesirable behaviors (3/3)
While there’s some nuance here, at the end of the day, it is similar to accidentally writing an infinite loop: we don’t say the compiler is “sabotaging” us, it runs exactly what we wrote (2/3)
Describing a model as “evil” anthropomorphizes an RL agent in an unhelpful way. The agent is just optimizing the objective it’s given. If its resulting behavior is undesirable, isn’t that ultimately a design oversight by the human? (1/3)
New Anthropic research: Natural emergent misalignment from reward hacking in production RL. “Reward hacking” is where models learn to cheat on tasks they’re given during training. Our new study finds that the consequences of reward hacking, if unmitigated, can be very serious.
2025 is the year of open-endedness.
Excited to announce our MIT Press book “Neuroevolution: Harnessing Creativity in AI Agent Design” by Sebastian Risi (@risi1979), Yujin Tang (@yujin_tang), Risto Miikkulainen, and myself. We explore decades of work on evolving intelligent agents and show how neuroevolution can
We should have a certification like "organic" or "biodynamic" for writing: writing that was completely untouched by an LLM. Please I only want to read that writing.
🚀 Excited to share a new preprint, accepted as a spotlight at #NeurIPS2025! Humans are imperfect decision-makers, and autonomous systems should understand how we deviate from idealized rationality. Our paper aims to address this! 👀🧠✨ https://t.co/jOLXBdELTt a 🧵⤵️
Anyone else find it enjoyable to view diffs on GitHub? There's something aesthetic about seeing old and new code laid out side by side like that. I used to assume everyone felt this way, but judging by the reaction I got last time, not everyone experiences this.. oh well
We’re hiring! @sucholutsky and I are seeking a postdoc and RA for a project on trust in AI systems with folks at NYU, Princeton, BU, and Cornell. Positions open until filled. Apply soon! Please share 🔁 postdoc: https://t.co/iARVtYrMLN RA:
Wrote a short post on the state of human behavior modeling in driving sims after our workshop last week. Link below 👇
Ran 10k this morning seeing nothing much besides water, grasslands, sheep and the occasional cyclist. It’s nice to be home for a bit
‘Human Data is the Cherry of Human-AI Interaction, Not the Cake’ - with @EugeneVinitsky (@nyutandon)
When building RL envs, it’s very easy to get caught up building the coolest, most feature-intensive version and then start training. This is often a mistake. Build the simplest version of your env where you can establish a baseline. Get a nice training curve, then iterate.
What are examples of good k-shot adaptation benchmarks? I’m looking for environments where agents must adapt to new dynamics and where the metrics are interpretable
As an early-to-mid PhD student, I really enjoyed reading this. It is full of useful advice, but I especially appreciated the reframing idea: "for your next project, ask yourself, ‘What new question will this enable someone to ask?’"
So You Want to Be an Academic? A couple of years into your PhD, but wondering: "Am I doing this right?" Most of the advice is aimed at graduating students. But there's far less for junior folks who are still finding their academic path. My candid takes:
I'm shocked at how poorly this is advertised, so here's a PSA: NSF has a GRFP-like program specifically for computing disciplines called CISE. The program provides the same 3 years of PhD funding PLUS a year-long mentorship program for the application cycle.