EmbeddingBugs

Many longstanding and emergent problems in software engineering are, at their root, questions of measuring the correspondence between source code and natural language (e.g. query response and feature localization). Word embedding approaches have been proposed as a way to bridge this lexical gap, because they represent relative semantic relatedness in a language-agnostic space. This paper attempts to replicate a well-cited publication that addresses bug localization using word embeddings. We train our word embeddings on a novel dataset but test on a common, standardized dataset. We provide insights into the process of replicating an experiment and offer advice to those wishing to increase the replicability of their publications. We also demonstrate the influence of choices made in preprocessing steps, further highlighting the need for extensive experiment reporting as the field of software engineering continues to integrate machine learning tools.
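To make the idea concrete, here is a minimal sketch of embedding-based bug localization: a bug report and each source file are reduced to an averaged word vector, and files are ranked by cosine similarity to the report. This is not the code in `src` and not the replicated paper's method. The vectors in `TOY_EMBEDDINGS`, the helper names, and the file contents are all invented for illustration. The `tokenize` helper also shows one preprocessing choice (splitting camelCase identifiers) of the kind that can change results.

```python
import math
import re

# Toy 3-dimensional vectors, invented for illustration only.
# A real experiment would load embeddings trained on a large corpus.
TOY_EMBEDDINGS = {
    "crash": [0.9, 0.1, 0.0],
    "null": [0.8, 0.2, 0.1],
    "pointer": [0.7, 0.3, 0.0],
    "render": [0.1, 0.9, 0.2],
    "draw": [0.0, 0.8, 0.3],
    "parse": [0.2, 0.1, 0.9],
}


def tokenize(text):
    """Split camelCase boundaries, keep alphabetic runs, and lowercase."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [token.lower() for token in re.findall(r"[A-Za-z]+", spaced)]


def embed(text, embeddings=TOY_EMBEDDINGS):
    """Average the vectors of known tokens; return None if none are known."""
    vectors = [embeddings[t] for t in tokenize(text) if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_files(bug_report, files):
    """Return (file name, score) pairs sorted from most to least similar."""
    query = embed(bug_report)
    scored = []
    for name, source in files.items():
        vector = embed(source)
        if query is None or vector is None:
            score = 0.0  # nothing to compare against
        else:
            score = cosine(query, vector)
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical file contents, not taken from the repository.
    files = {
        "Renderer.java": "void drawFrame() { render(); }",
        "Parser.java": "Node parseInput()",
        "Memory.java": "checkNullPointer()",
    }
    for name, score in rank_files("Crash with null pointer", files):
        print(f"{name}: {score:.3f}")
```

Without the camelCase split, `checkNullPointer` would be a single out-of-vocabulary token and `Memory.java` would receive no score at all, which is a small instance of how preprocessing can decide the outcome.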

dataset
src
ProjectPaper.pdf
README.md


nirtiac/EmbeddingBugs
