Instead of finding the perfect prompt for an LLM (e.g., "let's think step by step"), you can ask LLMs to critique their outputs and immediately fix their own mistakes. Here's a fun example:
h/t @avisingh599 for pointing me to the Reflexion paper, which gets at this idea. It's absolutely wild to me that LLMs are general enough that they can critique their own outputs in a sensible way
Interestingly enough, GPT-3.5 is incapable of such self-critique, at least for this assignment. It seems to be an emergent capability only present in GPT-4
The implications are substantial. Instead of clever "prefix prompt engineering", we can now consider "postfix prompt engineering", which encourages LLMs to find corrections and inconsistencies within their previously generated solutions.
After any generated output, just append "did the generated output do what the user asked?" and the LLM becomes a "minimal policy improvement operator" for itself
Maybe it is possible to apply a critique to the critique recursively, i.e. append "is the critique logically consistent with the original request?" GPT-3.5 seems to be rather biased towards self-congratulatory optimism
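To make the "postfix" idea concrete, here is a minimal sketch of the generate → critique → critique-the-critique → revise loop, assuming the OpenAI Python chat client. The model name, critique questions, and example task are illustrative stand-ins, not the exact setup used in the thread.

```python
# Minimal sketch of "postfix prompt engineering": generate, self-critique, revise.
# Assumes the OpenAI Python client; model name, prompts, and the fixed critique
# questions below are illustrative placeholders.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"  # the thread suggests self-critique only shows up reliably at GPT-4 level


def chat(messages):
    """One chat completion call, returning the assistant's text."""
    response = client.chat.completions.create(model=MODEL, messages=messages)
    return response.choices[0].message.content


def generate_with_self_critique(task: str) -> str:
    # 1. Initial attempt at the task.
    messages = [{"role": "user", "content": task}]
    draft = chat(messages)
    messages.append({"role": "assistant", "content": draft})

    # 2. Postfix critique: did the output actually do what the user asked?
    messages.append({"role": "user",
                     "content": "Did the generated output do what the user asked? "
                                "List any mistakes or inconsistencies."})
    critique = chat(messages)
    messages.append({"role": "assistant", "content": critique})

    # 3. Optional second-order critique of the critique itself.
    messages.append({"role": "user",
                     "content": "Is the critique logically consistent with the original request?"})
    meta_critique = chat(messages)
    messages.append({"role": "assistant", "content": meta_critique})

    # 4. Revise the original answer in light of both critiques.
    messages.append({"role": "user",
                     "content": "Rewrite your original answer, fixing every issue raised above."})
    return chat(messages)


if __name__ == "__main__":
    print(generate_with_self_critique("Sort these words alphabetically: pear, apple, banana"))
```

The critique steps cost extra model calls, but the loop needs no extra training or tooling: the "policy improvement" comes entirely from appending questions after the generation.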
Wrote a quick blog post about it here.
Would be curious to see which examples it can verify & correct, which it can only verify, and which it fails to verify at all
A reply to @ericjang11 suggested a complementary approach: ask the model what prompt it would create to get itself to do what you want (the replier found GPT-3 believes that only GPT-2 exists, so they use GPT-2 when doing this). Then show it the results, share your critique, and ask it how to change the prompt to improve the output.
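Below is a rough sketch of that prompt-refinement loop, under the same assumptions as above (OpenAI Python client, illustrative model name and wording). The `critique_fn` hook is hypothetical: it stands in for wherever the human feedback on each output comes from.

```python
# Sketch of the reply's suggestion: have the model propose a prompt, run it,
# then feed back the output plus a critique and ask for a better prompt.
# Assumes the OpenAI Python client; wording and model name are placeholders.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"


def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}])
    return response.choices[0].message.content


def refine_prompt(goal: str, critique_fn, rounds: int = 3) -> str:
    """critique_fn(output) returns feedback on an output, e.g. from a human reviewer."""
    # Ask the model to write a prompt that would make a model achieve the goal.
    prompt = ask("Write a prompt that would get a language model to do the following:\n"
                 f"{goal}\nReturn only the prompt.")
    for _ in range(rounds):
        output = ask(prompt)            # run the candidate prompt
        feedback = critique_fn(output)  # share your critique of the result
        # Show the model its prompt, the output, and the critique; ask for a revision.
        prompt = ask("Here is a prompt, the output it produced, and a critique of that output.\n\n"
                     f"PROMPT:\n{prompt}\n\nOUTPUT:\n{output}\n\nCRITIQUE:\n{feedback}\n\n"
                     "Rewrite the prompt so the output better addresses the critique. "
                     "Return only the revised prompt.")
    return prompt
```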