@KentGruber
Kent Gruber
3 years
Spent some time learning about @OpenAI evals tonight, and submitted a Vigenère decryption evaluation. https://t.co/Xh0CkFzhfe
1
0
0

Replies

@KentGruber
Kent Gruber
3 years
Currently, it has an accuracy of ~0.09 with GPT-3.5-Turbo. It can answer the example from the Wikipedia page, but struggles with the few others I generated. Super curious if GPT-4 will handle this better. https://t.co/r1mpxQd0lc
1
0
0
@KentGruber
Kent Gruber
3 years
I found the JSONL format for the eval data to be a little cumbersome. Like, no VSCode syntax highlighting by default, for example. Each line can be very long, so they can be annoying to read or edit. So for fun, I decided to experiment with using HCL: https://t.co/oLaBXlKhkN
0
0
0