Explore tweets tagged as #LLMfailures
@layerlens_ai
LayerLens
2 months
Today’s LLMs can ace MMLU, ARC, and GSM8K... and still hallucinate, fumble reasoning, or break in production. The problem? We’ve built a system that rewards benchmarks, not reliability. Accuracy isn't enough. We need nuance. #AIbenchmarking #LLMfailures
1
0
0
@zeroxaitales
Solysian ZeroX AI MediaTales
3 months
Prompt fail states to watch for:
→ Agent never stops = no exit condition
→ Hallucinates data = no eval logic
→ Breaks JSON = no formatting constraint
→ Misses goal = prompt doesn’t define “done”
Most bugs = prompt design bugs. #LLMFailures #PromptDebugging
1
0
0
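Three of the fail states in the tweet above (no exit condition, broken JSON, no definition of "done") can be guarded against in the code that wraps the model call. The following is a minimal sketch under stated assumptions, not anyone's actual implementation: `run_agent`, `step`, `is_done`, `max_steps`, and `fake_step` are hypothetical names, and `fake_step` stands in for a real model call. It does not address the "hallucinates data" case, which would need separate evaluation logic.

```python
import json
from typing import Callable, Optional


def run_agent(step: Callable[[int], str],
              is_done: Callable[[dict], bool],
              max_steps: int = 5) -> Optional[dict]:
    """Call `step` until `is_done` accepts a parsed reply or the budget runs out."""
    for i in range(max_steps):           # exit condition: hard step budget
        raw = step(i)
        try:
            reply = json.loads(raw)      # formatting constraint: must parse as JSON
        except json.JSONDecodeError:
            continue                     # malformed output: retry within the budget
        if not isinstance(reply, dict):
            continue                     # valid JSON but not an object: retry
        if is_done(reply):               # explicit, checkable definition of "done"
            return reply
    return None                          # budget exhausted without reaching the goal


def fake_step(i: int) -> str:
    """Hypothetical stand-in for a model call; returns canned outputs."""
    outputs = ["not json",
               '{"status": "working"}',
               '{"status": "done", "answer": 42}']
    return outputs[min(i, len(outputs) - 1)]


def is_finished(reply: dict) -> bool:
    return reply.get("status") == "done"


result = run_agent(fake_step, is_finished)
```

Here the loop skips the malformed first output, ignores the unfinished second one, and returns the third. With `max_steps=2` the same call returns `None` instead of looping forever, which makes the missing-exit-condition case visible to the caller.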
@banazir
William H. Hsu @ IJCAI 2025
3 years
0
3
6
@DiverseInAI
Diversity in AI @ IJCAI 2025
3 years
Anyone who has used #ChatGPT and seen errors: please help the researchers maintaining this repository and the research community by reporting them using the form. Thanks! #chatbots #GPT #LLM #LLMs #LM #LLMerrors #LLMfailures #NLP #ML #MachineLearning #AI #ArtificialIntelligence
0
1
2