Explore tweets tagged as #LLMfailures
Today’s LLMs can ace MMLU, ARC, and GSM8K... and still hallucinate, fumble reasoning, or break in production. The problem? We’ve built a system that rewards benchmarks, not reliability. Accuracy isn’t enough. We need nuance. #AIbenchmarking #LLMfailures
Prompt fail states to watch for:
→ Agent never stops = no exit condition
→ Hallucinates data = no eval logic
→ Breaks JSON = no formatting constraint
→ Misses goal = prompt doesn’t define “done”
Most bugs = prompt design bugs. #LLMFailures #PromptDebugging
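Three of the fail states in the tweet above (no exit condition, broken JSON, no definition of "done") can be guarded against in code as well as in the prompt. The following is a minimal sketch, not anyone's actual implementation: `call_model` is a hypothetical stand-in for whatever LLM client is in use, and `MAX_STEPS`, the `done`/`answer` reply schema, and the retry wording are all assumptions made for illustration. It does not address hallucinated data, which needs separate evaluation logic.

```python
import json
from typing import Callable, Optional

MAX_STEPS = 5                       # assumed step budget: the explicit exit condition
REQUIRED_KEYS = {"done", "answer"}  # assumed reply schema: the formatting constraint


def parse_reply(raw: str) -> Optional[dict]:
    """Return the reply as a dict if it is a JSON object with the required keys, else None."""
    try:
        reply = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(reply, dict) or not REQUIRED_KEYS.issubset(reply):
        return None
    return reply


def run_agent(call_model: Callable[[str], str], task: str) -> dict:
    """Call the model until it reports done, or give up after MAX_STEPS calls.

    `call_model` is a hypothetical function: prompt string in, raw reply string out.
    """
    prompt = task
    for step in range(1, MAX_STEPS + 1):
        reply = parse_reply(call_model(prompt))
        if reply is None:
            # Malformed output: restate the format and retry (this still uses up a step).
            prompt = task + "\nReply only with a JSON object with keys: done, answer."
            continue
        if reply["done"] is True:
            # "Done" is an explicit field in the reply, not something inferred from prose.
            return {"status": "done", "steps": step, "answer": reply["answer"]}
        prompt = task + "\nPrevious partial answer: " + str(reply["answer"])
    # Step budget exhausted: stop instead of looping forever.
    return {"status": "gave_up", "steps": MAX_STEPS, "answer": None}
```

Passing the model call in as a plain function keeps the loop testable with a stub, so the stopping and retry behavior can be checked without a live model.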
A repo we're helping compile, work w/Ernest Davis @GaryMarcus @pascalefung @jahendler @witbrock @EvelinaLeivada @VeredShwartz @nasrinmmm
https://t.co/A13L2skfoj TY @mmikemma for this form to submit #LLMerrors #LLMfailures
https://t.co/8CxFJiAag7
#LLM #GPT #ChatGPT #NLP #ML #AI
Anyone who has used #ChatGPT and seen errors: please help the researchers maintaining this repository and the research community by reporting them using the form. Thanks! #chatbots #GPT #LLM #LLMs #LM #LLMerrors #LLMfailures #NLP #ML #MachineLearning #AI #ArtificialIntelligence