Software Engineering
@ComputerPapers
New Software Engineering submissions to https://t.co/vxDuY4St9Z (not affiliated with https://t.co/vxDuY4St9Z)
Worldwide
Joined April 2010
Does SWE-Bench-Verified Test Agent Ability or Model Memory?
arxiv.org
SWE-Bench-Verified, a dataset comprising 500 issues, serves as a de facto benchmark for evaluating various large language models (LLMs) on their ability to resolve GitHub issues. But this...
ATLAS: Automated Toolkit for Large-Scale Verified Code Synthesis.
arxiv.org
Large language models have shown potential for program verification, but progress is hindered by the scarcity of verified code for training. We present ATLAS, an automated pipeline that...
Search-based Software Testing Driven by Domain Knowledge: Reflections and New Perspectives.
arxiv.org
Search-based Software Testing (SBST) can automatically generate test cases to search for requirements violations. Unlike manual test case development, it can generate a substantial number of test...
CREME: Robustness Enhancement of Code LLMs via Layer-Aware Model Editing.
arxiv.org
Large language models (LLMs) have demonstrated impressive capabilities in code generation, where the natural language prompt plays a crucial role in conveying user intent to the model. However,...
Efficient Black-Box Fault Localization for System-Level Test Code Using Large Language Models.
arxiv.org
Fault localization (FL) is a critical step in debugging which typically relies on repeated executions to pinpoint faulty code regions. However, repeated executions can be impractical in the...
Directional Diffusion-Style Code Editing Pre-training.
arxiv.org
Code pre-trained models have shown promising effectiveness in various software engineering tasks. Among these tasks, many tasks are related to software evolution and/or code editing. However,...
Condor: A Code Discriminator Integrating General Semantics with Code Details.
arxiv.org
LLMs demonstrate significant potential across various software engineering tasks. However, they still face challenges in generating correct code on the first attempt when addressing complex...
AI-powered Code Review with LLMs: Early Results.
arxiv.org
In this paper, we present a novel approach to improving software quality and efficiency through a Large Language Model (LLM)-based model designed to review code and identify potential issues. Our...
CupCleaner: A Hybrid Data Cleaning Approach for Comment Updating.
arxiv.org
Comment updating is an emerging task in software evolution that aims to automatically revise source code comments in accordance with code changes. This task plays a vital role in maintaining...
Quantifying Uncertainty in Machine Learning-Based Pervasive Systems: Application to Human Activity Recognition.
arxiv.org
The recent convergence of pervasive computing and machine learning has given rise to numerous services, impacting almost all areas of economic and social activity. However, the use of AI...
Understanding Chain-of-Thought Effectiveness in Code Generation: An Empirical and Information-Theoretic Analysis.
arxiv.org
Large language models (LLMs) achieve strong performance on code generation, but the mechanisms by which Chain-of-Thought (CoT) prompting helps remain unclear. We present a systematic empirical and...
LogICL: Distilling LLM Reasoning to Bridge the Semantic Gap in Cross-Domain Log Anomaly Detection.
arxiv.org
Effective log anomaly detection is critical to sustaining reliability in large-scale IT infrastructures. Transformer-based models require substantial resources and labeled data, exacerbating the...
Model management to support systems engineering workflows using ontology-based knowledge graphs.
arxiv.org
System engineering has been shifting from document-centric to model-based approaches, where assets are becoming more and more digital. Although digitisation conveys several benefits, it also...
SWEnergy: An Empirical Study on Energy Efficiency in Agentic Issue Resolution Frameworks with SLMs.
arxiv.org
Context. LLM-based autonomous agents in software engineering rely on large, proprietary models, limiting local deployment. This has spurred interest in Small Language Models (SLMs), but their...
Bug Priority Change Prediction: An Exploratory Study on Apache Software.
arxiv.org
Bug fixing is a critical activity in the software development process. In issue tracking systems such as JIRA, each bug report is assigned a priority level to indicate the urgency and importance...
TritonForge: Profiling-Guided Framework for Automated Triton Kernel Optimization.
arxiv.org
High-performance GPU kernel optimization remains a critical yet labor-intensive task in modern machine learning workloads. Although Triton, a domain-specific language for GPU programming, enables...
Evolving Excellence: Automated Optimization of LLM-based Agents.
arxiv.org
Agentic AI systems built on large language models (LLMs) offer significant potential for automating complex workflows, from software development to customer support. However, LLM agents often...
Llama-based source code vulnerability detection: Prompt engineering vs Fine tuning.
arxiv.org
The significant increase in software production, driven by the acceleration of development cycles over the past two decades, has led to a steady rise in software vulnerabilities, as shown by...
RESTifAI: LLM-Based Workflow for Reusable REST API Testing.
arxiv.org
With this paper, we introduce RESTifAI, an LLM-driven approach for generating reusable, CI/CD ready REST API tests, following the happy-path approach. Unlike existing tools that often focus...
Reusability in MLOps: Leveraging Ports and Adapters to Build a Microservices Architecture for the Maritime Domain.
arxiv.org
ML-Enabled Systems (MLES) are inherently complex since they require multiple components to achieve their business goal. This experience report showcases the software architecture reusability...