The ReaLMistake benchmark assesses LLMs' ability to detect various types of errors in NLP tasks. #ErrorDetection #Benchmarking #LargeLanguageModels