-
Which Syntactic Capabilities Are Statistically Learned by Masked Language Models for Code?, (ICSE2024)
- Abstract: This paper discusses the limitations of evaluating Masked Language Models (MLMs) in code completion tasks. We highlight that relying on accuracy-based measurements may lead to an overestimation of models' capabilities by neglecting the syntax rules of programming languages. To address these issues, we introduce a technique called SyntaxEval in which Syntactic Capabilities are used to enhance the evaluation of MLMs. SyntaxEval automates the process of masking elements in the model input based on ...
- Labels: static analysis, syntactic analysis, empirical study