instead of saving 1K issues per file, save issues with ID within a thousand per file #5

Open
wants to merge 1 commit into
base: master

VladyslavBondarenko

OpenSZZ stores issues in .csv files with at most 1000 issues in each. Thus, the file <project_key>_0.csv holds issues with IDs from 1 to 999, the file <project_key>_1.csv holds issues with IDs from 1000 to 1999, and so on.
To fetch a given portion of issues from the Jira API, OpenSZZ sends the following JQL query and paging parameters:
– project=<project_key> ORDER BY key ASC
– tempMax=1000
– pager/start=<page_number*1000>
With <project_key> = OOZIE and <page_number> = 2, the query is interpreted as follows: from all issues of the OOZIE project, sorted by issue key in ascending order, return 1000 issues starting from the 2000th result.
When linking commits to issues, OpenSZZ extracts issue IDs from commit messages. It then looks up each extracted ID not in all files with fetched issues, but only in the file that is supposed to contain it. For example, OpenSZZ searches for the issue OOZIE-2222 only in OOZIE_2.csv.
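
The lookup rule described above can be sketched as follows. This is only an illustration of the ID-to-file mapping, not OpenSZZ's actual code: the function names `parse_issue_key` and `expected_file` are hypothetical, and the page size of 1000 is taken from the description.

```python
PAGE_SIZE = 1000  # issues per file, as stated in the description


def parse_issue_key(issue_key: str) -> tuple[str, int]:
    """Split a key such as 'OOZIE-2222' into ('OOZIE', 2222)."""
    project_key, _, number = issue_key.rpartition("-")
    return project_key, int(number)


def expected_file(issue_key: str) -> str:
    """Name of the only file in which the issue is looked up (hypothetical helper)."""
    project_key, issue_id = parse_issue_key(issue_key)
    return f"{project_key}_{issue_id // PAGE_SIZE}.csv"


print(expected_file("OOZIE-2222"))  # OOZIE_2.csv
```

The mapping depends only on the issue ID, so it is correct only if the files were filled according to the same rule.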

The process works correctly as long as no issues are deleted from the Jira project. Once some issues are deleted, other issues may not be found by OpenSZZ, because they end up stored in a different file from the one where they are expected. For example, if any issue with an ID between 1000 and 1999 is deleted, the query that skips the first 1000 results and returns the next 1000 returns issues with IDs from 1000 to 2000. Thus, the issue with ID 2000 is stored in <project_key>_1.csv and is not found in <project_key>_2.csv. Hence, even if a commit that references the issue with ID 2000 is a bug-fixing commit, it is not considered bug-fixing, because the issue linked in its commit message is not found.
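
A small simulation may make the effect of a deleted issue easier to see. It is a sketch under assumptions: the toy data (IDs 1 to 3000 with ID 1500 deleted), the helper names, and the zero-based offsets of Python lists are all illustrative, so the exact boundary IDs can differ from the ones in the description. It compares filing issues by their position in the query result with filing them by ID range, which is the idea named in the title of this pull request.

```python
PAGE_SIZE = 1000  # issues per file, as stated in the description


def chunk_by_position(sorted_ids, page_size=PAGE_SIZE):
    """Offset paging: file n receives results n*page_size .. (n+1)*page_size - 1."""
    files = {}
    for position, issue_id in enumerate(sorted_ids):
        files.setdefault(position // page_size, []).append(issue_id)
    return files


def chunk_by_id(sorted_ids, page_size=PAGE_SIZE):
    """ID-range filing: file n receives IDs n*page_size .. (n+1)*page_size - 1."""
    files = {}
    for issue_id in sorted_ids:
        files.setdefault(issue_id // page_size, []).append(issue_id)
    return files


def file_index_of(files, issue_id):
    """Index of the file that actually contains issue_id, or None if absent."""
    for index, ids in files.items():
        if issue_id in ids:
            return index
    return None


# Toy project: issues 1..3000, with issue 1500 deleted (hypothetical data).
existing = [i for i in range(1, 3001) if i != 1500]

by_position = chunk_by_position(existing)
by_id = chunk_by_id(existing)

# The lookup step expects issue 2001 in file 2001 // 1000 == 2.
print(file_index_of(by_position, 2001))  # 1: shifted into the previous file
print(file_index_of(by_id, 2001))        # 2: where the lookup expects it
```

With ID-range filing, a deletion only leaves a gap in one file and cannot shift later issues into a different file.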
