Explore tweets tagged as #CodeMMLU
@AINativeF
AI Native Foundation
11 months
15. CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs. 🔑 Keywords: Code Understanding, Code Generation, Code Analysis, LLMs, Software Development. 💡 Category: AI Systems and Tools. 🌟 Research Objective: To evaluate and enhance code
@ceobillionaire
AGI.Eth
11 months
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs. Manh et al.: #Artificialintelligence #DeepLearning #MachineLearning
@Quebec_AI
Québec.AI
11 months
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs. Manh et al.: #Artificialintelligence #DeepLearning #MachineLearning
@rohanpaul_ai
Rohan Paul
11 months
CodeMMLU, a comprehensive multiple-choice question-answering benchmark for evaluating code understanding in LLMs. Covers 10,000+ questions across diverse domains, tasks, and programming languages. **Original Problem** 🔍: Existing code benchmarks focus on open-ended generation
@arXivGPT
arXivGPT
11 months
🏷️:CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs. 🔗:
@_nmd2k
Dung Nguyen Manh
11 months
Check out the CodeMMLU leaderboard at:
@vlruso
Vlad Ruso PhD
11 months
CodeMMLU: A Comprehensive Multi-Choice Benchmark for Assessing Code Understanding in Large Language Models. #CodeUnderstanding #AI #CodeLLMs #CodeMMLU #TechInnovation #ai #news #llm #ml #research #ainews #innovation #artificialintelligence #machinelearn
@PratyushLohumi
Pratyush Lohumi
11 months
[4/7] Evaluations show state-of-the-art models struggle with CodeMMLU, indicating gaps in understanding and emphasizing the link between understanding and generation. #AIChallenges #LLMs
@ComputerPapers
Software Engineering
9 months
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs.
@PratyushLohumi
Pratyush Lohumi
11 months
[1/7] Introducing CodeMMLU: a benchmark designed to evaluate CodeLLMs' code understanding skills. This moves beyond code generation, highlighting the importance of comprehension. #AI #MachineLearning
@PratyushLohumi
Pratyush Lohumi
11 months
[2/7] CodeMMLU features 10,000+ multiple-choice questions from diverse domains, testing code analysis, defect detection, and software engineering principles across languages. #CodeMMLU #SoftwareEngineering
@ComputerPapers
Software Engineering
5 months
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding & Reasoning Capabilities of CodeLLMs.
@_nmd2k
Dung Nguyen Manh
11 months
To address shortcomings of recent code-related benchmarks, we introduce CodeMMLU, a novel benchmark designed to evaluate CodeLLMs' ability to understand and comprehend code through multi-choice question answering (MCQA).
@rohanpaul_ai
Rohan Paul
11 months
CodeMMLU, a comprehensive multiple-choice question-answering benchmark for evaluating code understanding in LLMs. Reveals limitations in SOTA models' code comprehension. Generated this podcast with Google's Illuminate.
@QuocNghi91
Nghi Bui
7 months
Happy to share that our work CodeMMLU has been accepted to ICLR 2025! @iclr_conf
@PratyushLohumi
Pratyush Lohumi
11 months
[7/7] Discover more about CodeMMLU in their paper: #AcademicResearch #CodeLLM
@bonybean
Bony Bean
11 months
CodeMMLU: A Comprehensive Multi-Choice Benchmark for Assessing Code Understanding in Large Language Models:
@PratyushLohumi
Pratyush Lohumi
11 months
[5/7] CodeMMLU aims to be a resource for advancing AI in software development, pushing for more reliable coding assistants. #AIforDev #FutureOfCoding
@VivMeditator
TechVivian
11 months
@rohanpaul_ai Interesting findings on CodeMMLU! I think LLMs still have a way to go in truly understanding code. Great to see benchmarking efforts like this pushing the field forward!
@genainewstop
GenAINews.co
11 months
Check out the groundbreaking research on CodeMMLU, a new benchmark designed to assess code understanding in Large Language Models. This tool aims to improve AI-assisted software development. #CodeUnderstanding #AIAssistedDevelopment
@mazrnow
Mazhar Choudhry
11 months
@PratyushLohumi CodeMMLU sounds promising. How do you think reliable coding assistants will reshape startup success?