#CodeMMLU X Hashtag | Muskviewer

Explore tweets tagged as #CodeMMLU

AI Native Foundation

@AINativeF

11 months

15. CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs. 🔑 Keywords: Code Understanding, Code Generation, Code Analysis, LLMs, Software Development. 💡 Category: AI Systems and Tools. 🌟 Research Objective: To evaluate and enhance code

AGI.Eth

@ceobillionaire

11 months

CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs. Manh et al.: #Artificialintelligence #DeepLearning #MachineLearning

Québec.AI

@Quebec_AI

11 months

CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs. Manh et al.: #Artificialintelligence #DeepLearning #MachineLearning

Rohan Paul

@rohanpaul_ai

11 months

CodeMMLU, a comprehensive multiple-choice question-answering benchmark for evaluating code understanding in LLMs. Covers 10,000+ questions across diverse domains, tasks, and programming languages. **Original Problem** 🔍: Existing code benchmarks focus on open-ended generation

arXivGPT

@arXivGPT

11 months

🏷️:CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs. 🔗:

Dung Nguyen Manh

@_nmd2k

11 months

Check out the CodeMMLU leaderboard at:

Vlad Ruso PhD

@vlruso

11 months

CodeMMLU: A Comprehensive Multi-Choice Benchmark for Assessing Code Understanding in Large Language Models. #CodeUnderstanding #AI #CodeLLMs #CodeMMLU #TechInnovation #ai #news #llm #ml #research #ainews #innovation #artificialintelligence #machinelearn…

Pratyush Lohumi

@PratyushLohumi

11 months

[4/7] Evaluations show state-of-the-art models struggle with CodeMMLU, indicating gaps in understanding and emphasizing the link between understanding and generation. #AIChallenges #LLMs.

Software Engineering

@ComputerPapers

9 months

CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs.

Pratyush Lohumi

@PratyushLohumi

11 months

[1/7] Introducing CodeMMLU: a benchmark designed to evaluate CodeLLMs' code understanding skills. This moves beyond code generation, highlighting the importance of comprehension. #AI #MachineLearning.

Pratyush Lohumi

@PratyushLohumi

11 months

[2/7] CodeMMLU features 10,000+ multiple-choice questions from diverse domains, testing code analysis, defect detection, and software engineering principles across languages. #CodeMMLU #SoftwareEngineering.

Software Engineering

@ComputerPapers

5 months

CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding & Reasoning Capabilities of CodeLLMs.

Dung Nguyen Manh

@_nmd2k

11 months

To address shortcomings of recent code-related benchmarks, we introduce CodeMMLU, a novel benchmark designed to evaluate CodeLLMs' ability to understand code through multiple-choice question answering (MCQA).

Rohan Paul

@rohanpaul_ai

11 months

CodeMMLU, a comprehensive multiple-choice question-answering benchmark for evaluating code understanding in LLMs. Reveals limitations in SOTA models' code comprehension. Generated this podcast with Google's Illuminate.

Nghi Bui

@QuocNghi91

7 months

Happy to share that our work CodeMMLU has been accepted to ICLR 2025! @iclr_conf

Pratyush Lohumi

@PratyushLohumi

11 months

[7/7] Discover more about CodeMMLU in their paper: #AcademicResearch #CodeLLM.

Bony Bean

@bonybean

11 months

CodeMMLU: A Comprehensive Multi-Choice Benchmark for Assessing Code Understanding in Large Language Models:

Pratyush Lohumi

@PratyushLohumi

11 months

[5/7] CodeMMLU aims to be a resource for advancing AI in software development, pushing for more reliable coding assistants. #AIforDev #FutureOfCoding.

TechVivian

@VivMeditator

11 months

@rohanpaul_ai Interesting findings on CodeMMLU! I think LLMs still have a way to go in truly understanding code. Great to see benchmarking efforts like this pushing the field forward!

GenAINews.co

@genainewstop

11 months

Check out the groundbreaking research on CodeMMLU, a new benchmark designed to assess code understanding in Large Language Models. This tool aims to improve AI-assisted software development. #CodeUnderstanding #AIAssistedDevelopment.

Mazhar Choudhry

@mazrnow

11 months

@PratyushLohumi CodeMMLU sounds promising. How do you think reliable coding assistants will reshape startup success?
