Number of papers: 13
- Authors: Ding, Hantian and Kumar, Varun and Tian, Yuchen and Wang, Zijian and Kwiatkowski, Rob and Li, Xiaopeng and Ramanathan, Murali Krishna and Ray, Baishakhi and Bhatia, Parminder and Sengupta, Sudipta
- Abstract: Large language models trained on code have shown great potential to increase productivity of software developers. Several execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems. Nevertheless, it is expensive to perform the same evaluation on complex real-world projects considering the execution cost. On the other hand, static analysis tools such as linters, which can detect errors without running the program, haven’...
- Link: Read Paper
- Labels: code generation, code completion, empirical study
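- Sketch: A minimal illustration of the static-checking idea this abstract contrasts with execution-based evaluation: inspect generated code without running it. A real linter such as pyflakes catches more (undefined names, unused imports); the stdlib-only check below is my simplification, not the paper's pipeline.

```python
# Statically check model-generated code without executing it.
# This only validates syntax; a full linter would also flag undefined names.
import ast

def static_check(snippet: str) -> list[str]:
    """Return static errors found in `snippet`; an empty list means clean."""
    try:
        ast.parse(snippet)
    except SyntaxError as err:
        return [f"line {err.lineno}: {err.msg}"]
    return []

generated = "def add(a, b)\n    return a + b\n"  # ':' deliberately missing
print(static_check(generated))  # e.g. ["line 1: expected ':'"]
```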
- Authors: Wang, Xingyao and Li, Sha and Ji, Heng
- Abstract: Large Language Model (LLM) trained on a mixture of text and code has demonstrated impressive capability in translating natural language (NL) into structured code. We observe that semantic structures can be conveniently translated into code and propose Code4Struct to leverage such text-to-structure translation capability to tackle structured prediction tasks. As a case study, we formulate Event Argument Extraction (EAE) as converting text into event-argument structures that can be represented as ...
- Link: Read Paper
- Labels: code generation, program synthesis, empirical study
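- Sketch: A hedged picture of the text-to-structure translation described above: render an event type as a class and let the model complete a constructor call whose arguments are the extracted event arguments. The class and field names are illustrative, not the paper's ontology.

```python
from dataclasses import dataclass

@dataclass
class TransportEvent:           # illustrative event type, not the paper's schema
    agent: str                  # who moved the artifact
    artifact: str               # what was moved
    destination: str            # where it ended up

# The model sees the class definition plus the sentence and completes the
# constructor call; its arguments are the predicted event arguments.
prompt = 'Sentence: "Kelly flew the cargo to Berlin."\nevent = TransportEvent('
completion = TransportEvent(agent="Kelly", artifact="the cargo",
                            destination="Berlin")
print(completion)
```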
- Authors: Li, Peng and Sun, Tianxiang and Tang, Qiong and Yan, Hang and Wu, Yuanbin and Huang, Xuanjing and Qiu, Xipeng
- Abstract: Large language models (LLMs) pre-trained on massive corpora have demonstrated impressive few-shot learning ability on many NLP tasks. A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it. However, it is nontrivial to perform information extraction (IE) tasks with NL-LLMs since the output of the IE task is usually structured and therefore is hard to be converted into plain text. In this pa...
- Link: Read Paper
- Labels: code generation, program synthesis
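- Sketch: One way to picture the code-style recasting the abstract builds toward: instead of forcing structured IE output into plain text, the prompt is a function signature and the model completes the body as literal code. The task format below is my illustration.

```python
# NER recast as code completion: the LLM is given this signature plus few-shot
# examples and fills in the return value as a Python literal.
def named_entity_recognition(sentence: str) -> list[tuple[str, str]]:
    """Return (span, entity_type) pairs found in `sentence`."""
    return [("Steve Jobs", "PERSON"), ("Apple", "ORG")]  # model-completed body

print(named_entity_recognition("Steve Jobs founded Apple."))
```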
- Authors: Ye, Xi and Iyer, Srinivasan and Celikyilmaz, Asli and Stoyanov, Ves and Durrett, Greg and Pasunuru, Ramakanth
- Abstract: Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts, but there has been limited understanding of exactly how these explanations function or why they are effective. This work aims to better understand the mechanisms by which explanations are used for in-context learning. We first study the impact of two different factors on the performance of prompts with explanations: the computation trace (the way the solution is decomposed) and the natur...
- Link: Read Paper
- Labels: prompt strategy, reason with code, empirical study
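- Sketch: The two factors named in the abstract can be made concrete with a toy prompt triple (contents are mine, not the paper's data): the same example with and without an explanation, plus a variant whose computation trace is deliberately inconsistent, the kind of perturbation that isolates what the model actually relies on.

```python
# Prompt variants for probing how explanations work in in-context learning.
answer_only = "Q: 3 + 4 * 2 = ?\nA: The answer is 11."

with_explanation = (
    "Q: 3 + 4 * 2 = ?\n"
    "A: Multiplication binds tighter, so 4 * 2 = 8; then 3 + 8 = 11. "
    "The answer is 11."
)

# Deliberately broken trace, same final answer: if accuracy is unchanged,
# the model is not following the computation steps themselves.
perturbed_trace = (
    "Q: 3 + 4 * 2 = ?\n"
    "A: Multiplication binds tighter, so 4 * 2 = 9; then 3 + 9 = 11. "
    "The answer is 11."
)
```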
- Authors: Kulkarni, Mandar and Garera, Nikesh and Trivedi, Anusua
- Abstract: Due to the democratization of e-commerce, many product companies are listing their goods for online shopping. For periodic buying within a domain such as Grocery, consumers are generally inclined to buy certain brands of products. Due to a large non-English speaking population in India, we observe a significant percentage of code-mix Hinglish search queries e.g., sasta atta. An intuitive approach to dealing with code-mix queries is to train an encoder-decoder model to translate the query to Engl...
- Link: Read Paper
- Labels: code generation, program transformation
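- Sketch: The "intuitive approach" the abstract mentions is a plain seq2seq call, sketched below. "t5-small" is only a stand-in checkpoint, not the paper's model; a deployed system would be fine-tuned on (code-mix, English) query pairs.

```python
# Translate a code-mix Hinglish query to English before retrieval.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

query = "sasta atta"  # code-mix for "cheap wheat flour"
inputs = tokenizer("translate Hinglish to English: " + query,
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```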
- Authors: Pan, Liangming and Wu, Xiaobao and Lu, Xinyuan and Luu, Anh Tuan and Wang, William Yang and Kan, Min-Yen and Nakov, Preslav
- Abstract: Fact-checking real-world claims often requires collecting multiple pieces of evidence and applying complex multi-step reasoning. In this paper, we present Program-Guided Fact-Checking (ProgramFC), a novel fact-checking model that decomposes complex claims into simpler sub-tasks that can be solved using a shared library of specialized functions. We first leverage the in-context learning ability of large language models to generate reasoning programs to guide the verification process. Afterward, w...
- Link: Read Paper
- Labels: prompt strategy, reason with code
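- Sketch: A hedged example of what a generated reasoning program might look like: the claim is decomposed into calls against a small library of functions. Function names and the stubbed evidence are illustrative; in the paper each call is backed by a specialized sub-module.

```python
# Reasoning program for the claim:
#   "Inception was directed by a British filmmaker."
# Question/Verify stand in for the shared function library; both are stubbed.
def Question(q: str) -> str:
    """Answer a sub-question over retrieved evidence (stubbed)."""
    return {"Who directed Inception?": "Christopher Nolan"}.get(q, "unknown")

def Verify(sub_claim: str) -> bool:
    """Check a simple sub-claim against evidence (stubbed)."""
    return sub_claim == "Christopher Nolan is British."

director = Question("Who directed Inception?")
verdict = Verify(f"{director} is British.")
print(verdict)  # True under the stubbed evidence above
```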
- Authors: Zan, Daoguang and Chen, Bei and Zhang, Fengji and Lu, Dianjie and Wu, Bingchao and Guan, Bei and Wang, Yongji and Lou, Jian-Guang
- Abstract: The task of generating code from a natural language description, or NL2Code, is considered a pressing and significant challenge in code intelligence. Thanks to the rapid development of pre-training techniques, surging large language models are being proposed for code, sparking the advances in NL2Code. To facilitate further research and applications in this field, in this paper, we present a comprehensive survey of 27 existing large language models for NL2Code, and also review benchmarks and metr...
- Link: Read Paper
- Labels: survey, code generation, program synthesis
- Authors: Li, Yifei and Lin, Zeqi and Zhang, Shizhuo and Fu, Qiang and Chen, Bei and Lou, Jian-Guang and Chen, Weizhu
- Abstract: Few-shot learning is a challenging task that requires language models to generalize from limited examples. Large language models like GPT-3 and PaLM have made impressive progress in this area, but they still face difficulties in reasoning tasks such as GSM8K, a benchmark for arithmetic problems. To improve their reasoning skills, previous work has proposed to guide the language model with prompts that elicit a series of reasoning steps before giving the final answer, achieving a significant impr...
- Link: Read Paper
- Labels: prompt strategy, sampling and ranking
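- Sketch: The "sampling and ranking" label points at the standard recipe of drawing several reasoning paths and aggregating their final answers. The sketch below uses simple majority voting as a stand-in for whatever ranking step the paper trains; the sampled values are illustrative.

```python
from collections import Counter

# Final answers parsed from several sampled reasoning chains.
sampled_answers = ["18", "18", "24", "18", "17"]

def rank_by_vote(answers: list[str]) -> str:
    """Return the most frequent final answer across sampled paths."""
    return Counter(answers).most_common(1)[0][0]

print(rank_by_vote(sampled_answers))  # -> "18"
```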
- Authors: Giaquinto, Robert and Zhang, Dejiao and Kleiner, Benjamin and Li, Yang and Tan, Ming and Bhatia, Parminder and Nallapati, Ramesh and Ma, Xiaofei
- Abstract: Many machine learning-based low-code or no-code applications involve generating code that interacts with structured knowledge. For example, one of the most studied tasks in this area is generating SQL code from a natural language statement. Prior work shows that incorporating context information from the database schema, such as table and column names, is beneficial to model performance on this task. In this work we present a large pretraining dataset and strategy for learning representations of...
- Link: Read Paper
- Labels: code generation, program synthesis, benchmark
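- Sketch: The abstract's point that schema context helps text-to-SQL is easy to picture: table and column names are serialized into the prompt next to the question. The serialization format below is a common convention, not necessarily the paper's.

```python
# Serialize a database schema into a text-to-SQL prompt.
schema = {
    "orders": ["order_id", "customer_id", "total"],
    "customers": ["customer_id", "name", "country"],
}

def serialize_schema(tables: dict[str, list[str]]) -> str:
    return " | ".join(f"{t}({', '.join(cols)})" for t, cols in tables.items())

question = "What is the total spent by customers from France?"
prompt = f"Schema: {serialize_schema(schema)}\nQuestion: {question}\nSQL:"
print(prompt)
```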
- Authors: Li, Haau-Sing (Xiaocheng) and Mesgar, Mohsen and Martins, André and Gurevych, Iryna
- Abstract: Code generation from text requires understanding the user’s intent from a natural language description and generating an executable code snippet that satisfies this intent. While recent pretrained language models demonstrate remarkable performance for this task, these models fail when the given natural language description is under-specified. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that the under-specification of a natural language description can...
- Link: Read Paper
- Labels: code generation, program synthesis
- Authors: Wang, Shiqi and Li, Zheng and Qian, Haifeng and Yang, Chenghao and Wang, Zijian and Shang, Mingyue and Kumar, Varun and Tan, Samson and Ray, Baishakhi and Bhatia, Parminder and Nallapati, Ramesh and Ramanathan, Murali Krishna and Roth, Dan and Xiang, Bing
- Abstract: Code generation models have achieved impressive performance. However, they tend to be brittle as slight edits to a prompt could lead to very different generations; these robustness properties, critical for user experience when deployed in real-life applications, are not well understood. Most existing works on robustness in text or code tasks have focused on classification, while robustness in generation tasks is an uncharted area and to date there is no comprehensive benchmark for robustness in ...
- Link: Read Paper
- Labels: code model, code model training, source code model, code model robustness
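- Sketch: One concrete instance of the slight prompt edits the abstract refers to is a semantics-preserving rename (my example; a robustness benchmark would cover many transform families). A robust model should complete both variants equivalently.

```python
import re

def rename_variable(prompt: str, old: str, new: str) -> str:
    """Rename identifier `old` to `new`, matching whole words only."""
    return re.sub(rf"\b{re.escape(old)}\b", new, prompt)

original = 'def mean(values):\n    """Return the average of values."""\n'
perturbed = rename_variable(original, "values", "xs")
print(perturbed)
# Comparing pass rates on original vs. perturbed prompts over many such
# edits gives a quantitative robustness measure.
```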
- Authors: Zhang, Kechi and Li, Zhuo and Li, Jia and Li, Ge and Jin, Zhi
- Abstract: Large language models (LLMs) have demonstrated an impressive ability to generate codes on competitive programming tasks. However, with limited sample numbers, LLMs still suffer from poor accuracy. Inspired by the process of human programming, we propose a generate-and-edit approach named Self-Edit that utilizes execution results of the generated code from LLMs to improve the code quality on the competitive programming task. We execute the generated code on the example test case provided in the q...
- Link: Read Paper
- Labels: code generation, program repair, benchmark
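- Sketch: A minimal version of the generate-and-edit loop: run the generated program on the example test, and if it fails, hand the error output back to an editor model. `llm_edit` is a hypothetical stand-in for that model, stubbed here with a hard-coded repair.

```python
import subprocess
import sys
import tempfile

def run_on_test(code: str, test: str) -> str | None:
    """Execute code plus test in a subprocess; return stderr on failure."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + test + "\n")
        path = f.name
    proc = subprocess.run([sys.executable, path],
                          capture_output=True, text=True, timeout=10)
    return proc.stderr if proc.returncode != 0 else None

def llm_edit(code: str, error: str) -> str:
    """Hypothetical fault-aware editor model, stubbed with a fixed repair."""
    return code.replace("a - b", "a + b")

code = "def add(a, b):\n    return a - b"
test = "assert add(2, 3) == 5"
error = run_on_test(code, test)
if error:
    code = llm_edit(code, error)
print(run_on_test(code, test))  # None once the edited code passes the test
```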
- Authors: Zhang, Yusen and Wang, Jun and Wang, Zhiguo and Zhang, Rui
- Abstract: Cross-Lingual Semantic Parsing (CLSP) aims to translate queries in multiple natural languages (NLs) into meaning representations (MRs) such as SQL, lambda calculus, and logic forms. However, existing CLSP models are separately proposed and evaluated on datasets of limited tasks and applications, impeding a comprehensive and unified evaluation of CLSP on a diverse range of NLs and MRs. To this end, we present XSemPLR, a unified benchmark for cross-lingual semantic parsing featured with 22 natural...
- Link: Read Paper
- Labels: code model, code model training, source code model
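- Sketch: What a unified benchmark like this buys is a single evaluation loop over all (natural language, meaning representation) pairs. A toy version with exact match as the metric; the data below is illustrative.

```python
# One metric computed uniformly across language/MR combinations.
predictions = {  # (language, meaning representation) -> (predicted, gold) pairs
    ("en", "sql"): [("SELECT name FROM city", "SELECT name FROM city")],
    ("zh", "sql"): [("SELECT name FROM city", "SELECT id FROM city")],
}

for (lang, mr), pairs in predictions.items():
    acc = sum(pred == gold for pred, gold in pairs) / len(pairs)
    print(f"{lang}/{mr}: exact match = {acc:.2f}")
```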