
16x Eval
@16xEval
Followers: 118
Following: 18
Media: 1
Statuses: 20
Effortlessly evaluate prompts and models. 16x Eval is your personal workspace for prompt engineering.
Joined April 2025
RT @paradite_: I just realized that with @16xEval, I'm essentially building the Postman / Insomnia for LLMs. Should I change my tagline? ht…
RT @paradite_: DeepSeek R1 0528 Qwen3 8B (deepseek/deepseek-r1-0528-qwen3-8b) via @OpenRouterAI evaluation results: Coding: - Similar perf…
RT @paradite_: You need to install dependencies and write code just to create and run evaluations? With @16xEval you just click buttons and…
RT @paradite_: Finished testing Claude Opus 4 and Claude Sonnet 4 on my personal eval set. I am VERY impressed. Claude Opus 4 absolutely d…
RT @paradite_: Chill Saturday afternoon running some evals. Finally finished swapping ai-sdk to my own send-prompt in @16xEval, so that I c…
RT @paradite_: It takes skill and patience to generate a decent blog post draft using AI. @16xEval is not perfect, but with the right promp…
RT @paradite_: 16x Eval Update - 0.0.37. - Added support for system prompts. - Added advanced settings for more configuration options. - Adde…
RT @paradite_: Mistral Medium 3 eval result via @OpenRouterAI: Overall the model is solid across coding and writing, placing it among top…
RT @paradite_: New Gemini 2.5 Pro Preview (gemini-2.5-pro-preview-05-06) eval result: For simple Next.js TODO coding task: - The new mod…
RT @paradite_: Made my first sale of @16xEval outside the existing customers of @16xPrompt. So much work to do, but I'm very excited and o…
RT @paradite_: Qwen3 235B A22B (thinking, via @OpenRouterAI) evaluation results on my personal eval set: Next.js TODO add new feature - 8.…
RT @paradite_: I just had an insane realization that the Gemini 2.5 Pro's 1M context window is actually a "scam". It is actually only 540k…
RT @paradite_: I am super excited to announce my new product: 16x Eval: Effortlessly evaluate prompts and models. 👉