Kevin Ellis Profile
Kevin Ellis

@ellisk_kellis

Followers
1K
Following
90
Media
17
Statuses
38

Cornell Computer Science, Assistant Professor. Program synthesis, AI

Ithaca, New York
Joined September 2021
@ellisk_kellis
Kevin Ellis
26 days
New paper: World models + Program synthesis, by @topwasu.
1. World modeling on-the-fly by synthesizing programs with 4000+ lines of code.
2. Learns new environments from minutes of experience.
3. Positive score on Montezuma's Revenge.
4. Compositional generalization to new environments.
16
104
563
@ellisk_kellis
Kevin Ellis
14 days
RT @justintchiu: Are code agents good at software design, i.e. building general and reusable code? We present Librarian, a new refactoring me…
0
22
0
@ellisk_kellis
Kevin Ellis
26 days
Thank you OCAtari team, whose work was super important for us! Quentin Delfosse, @BluemlJ, Bjarne Gregori, Sebastian Sztwiertnia, @kerstingAIML. [9/n]
3
1
22
@ellisk_kellis
Kevin Ellis
26 days
w/ collaborators @topwasu, @yichao_liang, @tanghao95, Marta Kryven, and @adrian_weller. Project page: Arxiv: [8/n]
1
2
31
@ellisk_kellis
Kevin Ellis
26 days
Limitations:
1. Object-centric state (bounding boxes).
2. Uses a demonstration trajectory.
3. Planning is still hard!

Not limitations:
1. Partial observability / hidden state.
2. Stochasticity.
[7/n]
2
0
15
@ellisk_kellis
Kevin Ellis
26 days
Compositional generalization: we test on alternate versions (Alt) of Atari's Pong and Montezuma's Revenge (MR), which recombine and rearrange the objects in the demonstration. Pong-Alt is Pong with three balls and three enemies. Montezuma's Revenge Alt has a map layout similar to
1
0
15
@ellisk_kellis
Kevin Ellis
26 days
After using short demonstrations (<1 min) to learn world models for Pong and Montezuma's Revenge (MR), we embed the learned world model in a model-based planning agent, PoE-World + Planner. The agent is an order of magnitude more sample-efficient than model-free RL and can handle
1
0
18
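The planning step described above can be sketched as search over imagined transitions: the agent queries its learned world model instead of the real environment, which is where the sample efficiency comes from. A minimal sketch, with illustrative names and toy dynamics (not the paper's code):

```python
from collections import deque

def plan(world_model, start, is_goal, actions, max_depth=10):
    """Breadth-first search through states the world model predicts.
    world_model(state, action) -> next state; no real-environment steps."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path  # shortest imagined action sequence to the goal
        if len(path) < max_depth:
            for a in actions:
                nxt = world_model(state, a)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [a]))
    return None  # no plan within max_depth

# Toy usage: a 1-D walk where actions add +1 or -1 to the state.
print(plan(lambda s, a: s + a, 0, lambda s: s == 3, [1, -1]))  # [1, 1, 1]
```

A real planner over Atari-scale models would need heuristics rather than exhaustive search, but the structure is the same: all rollouts happen inside the model.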
@ellisk_kellis
Kevin Ellis
26 days
The key technical idea is to decompose the problem of learning a world program into learning hundreds of small programs. Each of these learned programs encodes a different causal law, which we probabilistically aggregate to predict future observations. This makes world knowledge
2
0
22
@ellisk_kellis
Kevin Ellis
26 days
We break the grid-world barrier with PoE-World, a program synthesis world modeling method which represents a world model as an exponentially-weighted product of programmatic experts synthesized by LLMs. [3/n]
2
1
24
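The "exponentially-weighted product of programmatic experts" can be sketched in a few lines. Everything here is illustrative (the expert functions, the observation format); the real experts are LLM-synthesized programs, each encoding one causal law:

```python
import math

# Two hypothetical experts, each a small program scoring how consistent a
# candidate next observation is with one causal law (values in (0, 1]).
def expert_gravity(obs, nxt):
    # "Unsupported objects move down (y increases)."
    return 0.9 if nxt["y"] >= obs["y"] else 0.1

def expert_inertia(obs, nxt):
    # "Horizontal velocity is preserved."
    return 0.8 if nxt["vx"] == obs["vx"] else 0.2

def poe_score(experts, weights, obs, nxt):
    """Exponentially-weighted product of experts:
    p(nxt | obs) ∝ prod_k expert_k(obs, nxt) ** w_k."""
    return math.exp(sum(w * math.log(e(obs, nxt))
                        for e, w in zip(experts, weights)))

def predict(experts, weights, obs, candidates):
    """Predict the candidate next observation with the highest score."""
    return max(candidates, key=lambda c: poe_score(experts, weights, obs, c))
```

Because the product factors over experts, each small program can be learned, reweighted, or replaced independently, which is what makes the aggregated world knowledge modular.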
@ellisk_kellis
Kevin Ellis
26 days
Learning how the world works is central to building agents that quickly adapt to new environments. Neural network world models are highly flexible but need large amounts of training data, and don't quickly update their knowledge from sparse observations. Program world models can generalize.
1
1
18
@ellisk_kellis
Kevin Ellis
7 months
Last, Hao Tang's other paper: REx! This paper happened because we needed to scale WorldCoder to generate 250+ line programs. It uses Thompson Sampling to explore a tree of potential programs that are iteratively improved by LLMs. REx is a simple algorithm (~10 LoC) that
0
2
11
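The Thompson-sampling idea in REx can be sketched compactly: keep a Beta posterior per candidate program over whether refining it further will pay off, sample from every posterior, and refine the winner. A minimal sketch with illustrative names; in the real system refine() calls an LLM, and details differ from the paper's code:

```python
import random

def rex(initial_program, refine, score, iters=20, seed=0):
    """Thompson sampling over a growing set of candidate programs.
    Each node keeps a Beta(successes + 1, failures + 1) posterior on
    whether further refinement of that program will improve it."""
    rng = random.Random(seed)
    nodes = [[initial_program, score(initial_program), 0, 0]]  # [prog, score, succ, fail]
    best = nodes[0][:2]
    for _ in range(iters):
        # Draw one sample from each node's posterior; refine the argmax.
        node = max(nodes, key=lambda n: rng.betavariate(n[2] + 1, n[3] + 1))
        child = refine(node[0])
        child_score = score(child)
        if child_score > node[1]:
            node[2] += 1  # refinement improved the program: a success
        else:
            node[3] += 1  # refinement didn't help: a failure
        nodes.append([child, child_score, 0, 0])
        if child_score > best[1]:
            best = [child, child_score]
    return best[0]
```

Nodes whose refinements keep failing get sampled less, so the search automatically balances exploring fresh programs against exploiting promising ones.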
@ellisk_kellis
Kevin Ellis
7 months
Wen-Ding Li trains LLMs to synthesize code from *only* input–output examples: no natural language. This works for graphics code, functional programs, and FlashFill, providing the foundation for our recent ARC paper. It's also our first step toward bringing
1
1
13
@ellisk_kellis
Kevin Ellis
7 months
Doing Experiments and Revising Rules (Top Piriyakulkij + Cassidy Langenfeld): Monte Carlo methods for tracking natural-language belief states, and Bayes-optimal experiments.
Tested on:
1. Blickets, i.e. from Alison Gopnik's talk.
2. Zendo, i.e. blickets for grownups.
Findings:
1.
1
0
4
@ellisk_kellis
Kevin Ellis
7 months
WorldCoder (Hao Tang+Darren Key) interacts with an environment, and writes Python code to model its transition function. It explores the environment by creating reward functions its world model thinks are feasible, then planning to achieve them. Learning
1
3
8
@ellisk_kellis
Kevin Ellis
7 months
My students' papers at NeurIPS →
1. World models & program synthesis @tanghao95
2. Having LLMs experiment in human-like ways @topwasu
3. The "pilot study" for our recent ARC paper @xu3kev
4. LLM tree search & Thompson sampling @tanghao95
🧵
1
1
22
@ellisk_kellis
Kevin Ellis
7 months
WorldCoder learns how an environment works by interacting with it, and programming a world model in Python. It explores the environment by optimistically inventing reward functions its world model thinks are feasible. Program learning can be very sample-efficient (orders of
0
0
4
@ellisk_kellis
Kevin Ellis
7 months
RT @alexanderklew: If you're interested in a PhD at the intersection of machine learning and programming languages, consider Yale CS! . We….
0
44
0
@ellisk_kellis
Kevin Ellis
7 months
Thank you, François, Mike, & team, for the ARC challenge. It has been a durable source of inspiration, and brings fresh ideas to AI. The paper award first authors are Keya Hu (applying to PhDs @HuLillian39250) and Wen-Ding Li (at NeurIPS hunting for industry gigs @xu3kev).
@fchollet
François Chollet
7 months
Today we're announcing the winners of ARC Prize 2024. We're also publishing an extensive technical report on what we learned from the competition (link in the next tweet). The state-of-the-art went from 33% to 55.5%, the largest single-year increase we've seen since 2020.
2
14
58