Number of papers: 14
- Authors: Sch"{a}fer, Max and Nadi, Sarah and Eghbali, Aryaz and Tip, Frank
- Abstract: Unit tests play a key role in ensuring the correctness of software. However, manually creating unit tests is a laborious task, motivating the need for automation. Large Language Models (LLMs) have recently been applied to various aspects of software development, including the automated generation of unit tests, although existing approaches require additional training or few-shot learning on examples of existing tests. This paper presents a large-scale empirical evaluation on the effectiveness...
- Link: Read Paper
- Labels: program testing, unit testing, empirical study
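  A minimal sketch of the zero-shot, prompt-based test generation the entry above evaluates, assuming a generic chat-completion backend. `query_llm`, the prompt template, and the parse check are illustrative placeholders, not the authors' pipeline:

  ```python
  import ast

  def query_llm(prompt: str) -> str:
      """Placeholder: wire this to any chat-completion API."""
      raise NotImplementedError

  def generate_unit_test(function_source: str) -> str:
      # Zero-shot prompt: no fine-tuning, no few-shot examples of existing tests.
      prompt = (
          "Write a unit test for the following function. "
          "Return only runnable test code.\n\n" + function_source
      )
      return query_llm(prompt)

  def looks_like_code(candidate: str) -> bool:
      # Cheap validation before executing the generated test: does it parse?
      try:
          ast.parse(candidate)
          return True
      except SyntaxError:
          return False
  ```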
- Authors: Zhang, Yuxia and Qiu, Zhiqing and Stol, Klaas-Jan and Zhu, Wenhui and Zhu, Jiaxin and Tian, Yingchen and Liu, Hui
- Abstract: Commit messages are critical for code comprehension and software maintenance. Writing a high-quality message requires skill and effort. To support developers and reduce their effort on this task, several approaches have been proposed to automatically generate commit messages. Despite the promising performance reported, we have identified three significant and prevalent threats in these automated approaches: 1) the datasets used to train and evaluate these approaches contain a considerable amount...
- Link: Read Paper
- Labels: software maintenance and deployment, commit message generation, empirical study
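  A hedged sketch of the diff-to-message pipeline such approaches share: collect the staged diff, then prompt a model for a one-line summary. The prompt wording and the injected `query_llm` callable are assumptions for illustration:

  ```python
  import subprocess

  def staged_diff() -> str:
      # The staged changes are what the commit message should describe.
      return subprocess.run(
          ["git", "diff", "--cached"], capture_output=True, text=True
      ).stdout

  def draft_commit_message(diff: str, query_llm) -> str:
      prompt = (
          "Summarize the following change as a one-line commit message "
          "in the imperative mood:\n\n" + diff
      )
      return query_llm(prompt).strip()
  ```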
- Authors: Jiang, Shuai and Fu, Cai and He, Shuai and Lv, Jianqiang and Han, Lansheng and Hu, Hong
- Abstract: Binary Code Similarity Detection (BCSD) is a fundamental binary analysis technique in the area of software security. Recently, advanced deep learning algorithms are integrated into BCSD platforms to achieve superior performance on well-known benchmarks. However, real-world large programs embed more complex diversities due to different compilers, various optimization levels, multiple architectures and even obfuscations. Existing BCSD solutions suffer from low accuracy issues in such complicated r...
- Link: Read Paper
- Labels: static analysis, code similarity analysis, code model, code model training, binary code model
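  The comparison step at the heart of embedding-based BCSD fits in a few lines: two binary functions count as similar if their learned vectors are close. The `embed` callable and the threshold are stand-ins; the paper's model and metric may differ:

  ```python
  import math

  def cosine_similarity(a: list[float], b: list[float]) -> float:
      dot = sum(x * y for x, y in zip(a, b))
      norm_a = math.sqrt(sum(x * x for x in a))
      norm_b = math.sqrt(sum(y * y for y in b))
      return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

  def is_similar(func_a: bytes, func_b: bytes, embed, threshold: float = 0.9) -> bool:
      # A robust model should map the same function compiled with different
      # compilers, optimization levels, or architectures to nearby vectors.
      return cosine_similarity(embed(func_a), embed(func_b)) >= threshold
  ```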
- Authors: Yang, Guang and Zhou, Yu and Chen, Xiang and Zhang, Xiangyu and Zhuo, Terry Yue and Chen, Taolue
- Abstract: Large Language Models (LLMs) have demonstrated remarkable potential in code generation. The integration of Chain of Thought (CoT) reasoning can further boost their performance. However, current CoT methods often require manual writing or LLMs with over 100 billion parameters to generate, impeding their applicability in resource-constrained scenarios. In this study, we investigate lightweight Language Models (ℓLMs)...
- Link: Read Paper
- Labels: code generation, program synthesis, empirical study
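  A minimal two-stage sketch of CoT-style code generation: elicit a step-by-step plan first, then condition the code on that plan. The prompts are assumptions, not the paper's templates:

  ```python
  def generate_with_cot(task: str, query_llm) -> str:
      # Stage 1: elicit intermediate reasoning, without code.
      plan = query_llm(
          "Think step by step about how to implement the following, "
          "but do not write code yet:\n" + task
      )
      # Stage 2: generate code conditioned on the plan.
      return query_llm(
          "Using the plan below, write the implementation.\n\n"
          f"Task: {task}\n\nPlan:\n{plan}"
      )
  ```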
- Authors: Tang, Yutian and Liu, Zhijie and Zhou, Zhichao and Luo, Xiapu
- Abstract: Recent advancements in large language models (LLMs) have demonstrated exceptional success in a wide range of general domain tasks, such as question answering and following instructions. Moreover, LLMs have shown potential in various software engineering applications. In this study, we present a systematic comparison of test suites generated by the ChatGPT LLM and the state-of-the-art SBST tool EvoSuite. Our comparison is based on several critical factors, including correctness, readability, code...
- Link: Read Paper
- Labels: program testing, unit testing, empirical study
- Authors: Tufano, Rosalia and Dabić, Ozren and Mastropaolo, Antonio and Ciniselli, Matteo and Bavota, Gabriele
- Abstract: The automation of code review has been tackled by several researchers with the goal of reducing its cost. The adoption of deep learning in software engineering pushed the automation to new boundaries, with techniques *imitating* developers in generative tasks, such as commenting on a code change as a reviewer would do or addressing a reviewer's comment by modifying code. The performance of these techniques is usually assessed through quantitative metrics...
- Link: Read Paper
- Labels: code review, empirical study
- Authors: Kang, Sungmin and Yoon, Juyeon and Askarbekkyzy, Nargiz and Yoo, Shin
- Abstract: Bug reproduction is a critical developer activity that is also challenging to automate, as bug reports are often in natural language and thus can be difficult to transform to test cases consistently. As a result, existing techniques mostly focused on crash bugs, which are easier to automatically detect and verify. In this work, we overcome this limitation by using large language models (LLMs), which have been demonstrated to be adept at natural language processing and code generation. By prompti...
- Link: Read Paper
- Labels: program testing, bug reproduction, empirical study
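  The reproduce-and-verify loop described above can be sketched briefly: prompt with the report, write the candidate test to disk, and treat a failing run on the buggy version as a successful reproduction. The file handling and pytest invocation are illustrative, and pytest is assumed to be installed:

  ```python
  import pathlib
  import subprocess
  import tempfile

  def reproduces_bug(bug_report: str, query_llm) -> bool:
      test_code = query_llm(
          "Write a pytest test that reproduces this bug report:\n" + bug_report
      )
      test_file = pathlib.Path(tempfile.mkdtemp()) / "test_repro.py"
      test_file.write_text(test_code)
      result = subprocess.run(["pytest", str(test_file)], capture_output=True)
      # On the buggy version, a failing test means the bug was reproduced.
      return result.returncode != 0
  ```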
- Authors: Paltenghi, Matteo and Pandita, Rahul and Henley, Austin Z. and Ziegler, Albert
- Abstract: Recent neural models of code, such as OpenAI Codex and AlphaCode, have demonstrated remarkable proficiency at code generation due to the underlying attention mechanism. However, it often remains unclear how the models actually process code, and to what extent their reasoning and the way their attention mechanism scans the code matches the patterns of developers. A poor understanding of the model reasoning process limits the way in which current neural models are leveraged today, so far mostly fo...
- Link: Read Paper
- Labels: code generation, code completion, code model, code model training, source code model, benchmark
- Authors: Tu, Haoxin and Zhou, Zhide and Jiang, He and Yusuf, Imam Nur Bani and Li, Yuxian and Jiang, Lingxiao
- Abstract: Compiler bugs pose a significant threat to safety-critical applications, and promptly as well as effectively isolating these bugs is crucial for assuring the quality of compilers. However, the limited availability of debugging information on reported bugs complicates the compiler bug isolation task. Existing compiler bug isolation approaches convert the problem into a test program mutation problem, but they are still limited by ineffective mutation strategies or high human effort requirements. D...
- Link: Read Paper
- Labels: program testing, debugging, code model, code model training, source code model
- Authors: Fakhoury, Sarah and Naik, Aaditya and Sakkas, Georgios and Chakraborty, Saikat and Lahiri, Shuvendu K.
- Abstract: Large language models (LLMs) have shown great potential in automating significant aspects of coding by producing natural code from informal natural language (NL) intent. However, given NL is informal, it does not lend easily to checking that the generated code correctly satisfies the user intent. In this paper, we propose a novel interactive workflow TiCoder for guided intent clarification (i.e., partial formalization) through tests to support the generation of more accurate...
- Link: Read Paper
- Labels: code generation, program synthesis, empirical study
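  The core of a TiCoder-style intent-clarification loop, sketched under simplifying assumptions: keep a pool of candidate implementations, show the user a test, and prune candidates that disagree with the user's verdict. Candidate and test generation are elided here; this is a simplified reading, not the TiCoder implementation:

  ```python
  def behavior(candidate, test_input):
      # Run one candidate on one input, treating crashes as a distinct outcome.
      try:
          return candidate(test_input)
      except Exception:
          return None

  def prune(candidates, test_input, approved_output):
      # Keep only candidates consistent with the user-approved test outcome.
      return [c for c in candidates if behavior(c, test_input) == approved_output]

  # Two plausible readings of "sort the list"; one user-approved test
  # disambiguates the intent.
  candidates = [sorted, lambda xs: sorted(xs, reverse=True)]
  candidates = prune(candidates, [3, 1, 2], [1, 2, 3])
  assert candidates == [sorted]
  ```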
- Authors: Zhou, Ziyi and Li, Mingchen and Yu, Huiqun and Fan, Guisheng and Yang, Penghui and Huang, Zijie
- Abstract: Code summarization aims to automatically generate natural language descriptions for code, and has become a rapidly expanding research area in the past decades. Unfortunately, existing approaches mainly focus on the “one-to-one” mapping from methods to short descriptions, which hinders them from becoming practical tools: 1) The program context is ignored, so they have difficulty in predicting labels outside the target method; 2) They are typically trained to generate brief function descriptions w...
- Link: Read Paper
- Labels: static analysis, code summarization, benchmark
- Authors: Liu, Zhijie and Tang, Yutian and Luo, Xiapu and Zhou, Yuming and Zhang, Liang Feng
- Abstract: Large language models (LLMs) have demonstrated impressive capabilities across various natural language processing (NLP) tasks, such as machine translation, question answering, summarization, and so on. Additionally, LLMs are also highly valuable in supporting software engineering tasks, particularly in the field of code generation. Automatic code generation is a process of automatically generating source code or executable code based on given specifications or requirements, improving developer p...
- Link: Read Paper
- Labels: code generation, program synthesis, empirical study
- Authors: Wang, Junjie and Huang, Yuchao and Chen, Chunyang and Liu, Zhe and Wang, Song and Wang, Qing
- Abstract: Pre-trained large language models (LLMs) have recently emerged as a breakthrough technology in natural language processing and artificial intelligence, with the ability to handle large-scale datasets and exhibit remarkable performance across a wide range of tasks. Meanwhile, software testing is a crucial undertaking that serves as a cornerstone for ensuring the quality and reliability of software products. As the scope and complexity of software systems continue to grow, the need for more effect...
- Link: Read Paper
- Labels: program testing, survey
Towards Efficient Fine-Tuning of Language Models With Organizational Data for Automated Software Review
- Authors: Nashaat, Mona and Miller, James
- Abstract: Large language models like BERT and GPT possess significant capabilities and potential impacts across various applications. Software engineers often use these models for code-related tasks, including generating, debugging, and summarizing code. Nevertheless, large language models still have several flaws, including model hallucination (e.g., generating erroneous code and producing outdated and inaccurate programs) and the substantial computational resources and energy required for training and ...
- Link: Read Paper
- Labels: code generation, program repair, code model, code model training, source code model
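  A hedged sketch of the fine-tuning step, assuming a HuggingFace-style encoder-decoder code model (e.g., a CodeT5 variant) and (code, reviewed code) pairs mined from organizational history. This is a minimal loop for illustration, not the paper's training recipe:

  ```python
  import torch

  def fine_tune(model, tokenizer, review_pairs, epochs=1, lr=5e-5):
      # review_pairs: iterable of (code_before, code_after_review) strings.
      optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
      model.train()
      for _ in range(epochs):
          for before, after in review_pairs:
              batch = tokenizer(
                  before, text_target=after, return_tensors="pt", truncation=True
              )
              loss = model(**batch).loss  # seq2seq loss against the labels
              loss.backward()
              optimizer.step()
              optimizer.zero_grad()
      return model
  ```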