Composite Workflows, New Metadata format, Refactoring #173

Merged
Marti2203 merged 552 commits into dev on Sep 21, 2024

Marti2203
Collaborator

@Marti2203 Marti2203 commented Mar 18, 2024

The following PR contains a large number of changes:

  • We introduce the concept of a composite workflow, which allows us to run multiple tools in succession
  • We modify the format of the metadata to incorporate better localization information and differentiation of test types
  • We add a large number of tools to the framework
  • Add an example benchmark, containing both C and Java subjects, that showcases the composite workflow
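
To make the idea of a composite workflow concrete, here is a minimal sketch of running tools in succession, where each step receives the metadata produced so far and contributes new fields. It is not the Cerberus implementation: `Step`, `run_composite`, the stub tools, and the metadata keys (`localization`, `patches`, `source_file`, `line_numbers`, `score`) are all hypothetical names chosen for illustration.

```python
import copy
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical metadata type: a plain dictionary describing one bug.
BugInfo = Dict[str, Any]


@dataclass
class Step:
    """One stage of the composite workflow (hypothetical structure)."""
    name: str
    run: Callable[[BugInfo], BugInfo]


def run_composite(steps: List[Step], bug_info: BugInfo) -> BugInfo:
    """Run the steps in order, merging each step's output into the metadata."""
    # Work on a copy so the caller's metadata is never mutated.
    current = copy.deepcopy(bug_info)
    for step in steps:
        # Each step also gets its own copy, so it cannot affect later steps
        # except through the dictionary it returns.
        result = step.run(copy.deepcopy(current))
        current = {**current, **result}
    return current


def localize(info: BugInfo) -> BugInfo:
    # Stub localizer: reports a single suspicious location with a score.
    return {
        "localization": [
            {"source_file": "src/main.c", "line_numbers": [42], "score": 0.9}
        ]
    }


def repair(info: BugInfo) -> BugInfo:
    # Stub repair tool: consumes the localization produced by the prior step.
    locations = info.get("localization", [])
    return {"patches": [f"patch for {loc['source_file']}" for loc in locations]}


original = {"bug_id": "example-1", "language": "c"}
final = run_composite(
    [Step("localize", localize), Step("repair", repair)], original
)
```

Copying the metadata before each step is a design choice in this sketch: it keeps one tool's in-place changes from leaking into the next, and it leaves the input dictionary untouched.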

@Marti2203 Marti2203 merged commit 8422a90 into dev Sep 21, 2024
Marti2203 added a commit that referenced this pull request Sep 21, 2024
* allow image hash to be updated using config file

* Update Infer driver

* add ZAP driver

* run config command if present

* add instrument script

* Format driver

* set prebuilt image for aprcomp benchmarks

* add slicing tool

* add slicing dir

* change path from command issued

* add crashrepair localize tool

* Add the slice type to the typing module

* move examples benchmark to multi

* update driver

* update benchmark example

* update localization stats

* update crashrepair localizer driver

* Update vulnloc reference

* Resolved hard-coding issue in vulnloc benchmark

* Update reference

* Update EffFix,SAVER,FootPatch driver for the benchmark change

* Fix dict access in abstract benchmark

* Fix typo in EffFix driver

* Update efffix benchmark

* Fix more dict access in abstract tool

* Fix several path issues in EffFix driver

* add aixcc prompt (#175)

* AIxCC framework using the fine-tuned DeepSeek model (#176)

* add aixcc prompt

* add sft deepseek model

* Remove requirements for having tests in benchmark

* Fix count_neg and count_pos

* Remove efffix bench temporarily

* Re-add efffix benchmark

* Composite Workflows, New Metadata format, Refactoring (#173)

* Dict is unhashable

* Another typing fix and path

* Fix some small things

* Remove return statement

* add instrumentation pass for AbstractLocalizeTool

* Update Gzoltar output

* Add identifier creation from GZoltar and fix a future bug

* make ArjaE use localization metadata

* fix ArjaE localization data; use shorter testing timeout

* fix ArjaE localization data; use shorter testing timeout

* add missing import in ArjaE driver

* make TBar accept localization data

* modify TBar driver

* Fix a format error

* Refactor Cerberus - remove duplicated code and add typings

* Arguments not passed correctly in execute_wrapped

* Fix some small bugs

* Fix the TaskStatus map thing

* Cleanup + timing fix

* More small fixes

* Small optimisation and a note

* Small optimisation

* Add debug flag to script

* Fix a bug with the list_dir logic

* Removing extra appended tag to reduce amount of images constructed

* Ensure proper printing of the traceback

* Remove mention of dev to make things more intuitive

* update example benchmark

* add test script to meta-data

* add clean script

* add api to clean subject in abstracttooldriver, use it in AFLdriver before instrumentation, rename example benchmark to crashing_tests

* Add initial functionality for container cleanup, move around custom setup folder

* update example

* update localization for new config changes

* Fix Mocktool

* Ensure all repair tools generate a metadata-json file when done

* update example

* add more starting points for basic workflow

* add workflow for examples benchmark

* update output analysis

* fix bug: update lines

* use toolname to separate patches

* Remove dot files

* Fix a merge_dict bug

* update output analysis for valkyrie

* update example test oracle and remove duplicate test indexes

* Fix time stats display bug

* Fix a todo in TBar

* update tool status check

* prepare localization in expected format for workflow

* fix bug in concatenation of multiple fix locs

* change user to root in order to add groups

* enable reference to stack trace key

* use instrumentation for crashing programs for coverage

* rename function name to division and update stack trace

* fix indentation error

* use restrict to lines in darjeeling for localization instead of files

* Fix table sorting

* update fuzzing stats to reflect generated passing/crashing tests

* Ensure tool instance is fresh per step

* Ensure against binding risks

* remove passing test case from repair tools for basic workflow

* set sudo information for Darjeeling

* remove explicit setting user to root

* Track timeouts as their own status

* add java example

* update java example

* update java example

* update dockerfile

* add javalang

* Fix a bug in the TBar driver

* Update some of the Java drivers

* Another small update for the java tools

* Add output of the command objects

* Defend against mistakenly double deleting and ensure that bug info is always a deepcopy

* Add tool tag to the bug info

* add method to load ast

* update c crash example to include more program paths

* add new workflow and rename identifiers

* update setup script

* place src dir in correct structure

* fix bug in file name

* fix bug in file name

* update output analysis

* add score for localization in java

* allow fix localization to have additional properties

* temporary fix: add location field in meta-data to support java repair tools

* fix typo

* add new java example

* add new examples and remove old examples; organize workflows

* add python subject to meta-data

* fix typo

* update output analysis for FauxPy

* add python workflow and update java workflow to use a different example

* separate meta-data generation

* separate meta-data generation

* darjeeling requires files in relative path

* allow EvoRepair to take fix locations

* update evorepair driver

* fix source location

* add drivers for llm based tools

* add invoke_advance to create meta-data by default for LLM repair tools

* add java llm workflow

* support java and c differently

* increase patch limit from 5 to 50

* Add score field to metadata

* Allow CLI to override config file

* Move the location hack to function

* update vul4j

* First iteration of tracking status codes and test count

* Fix some overrides and add extra

* Clarify where output comes from

* Fix comment error

* Add sanity check and test counting

* Simple aggregation of errors for BasicWorkflow

* Update submodules

* First iteration of composite sequence subsequence

* Revert the do_step generalization and fix code in the config

* Add better documentation

* fix bug; should take the max

* improve output analysis

* Collect summaries for subtools

* Don't crash if no analysis output

* Exclude trace construction on the composite tool

* Replace execute command with makedirs

* update localization driver to use dynamic instrumentation

* Fix jazzer output

* update driver with dynamic instrumentation

* Ignore first line of tests.csv in gzoltar

* Allow controlling the number of tests being passed around

* Remove input left by debugging

* Utilize the old bug info to allow for aggregation of analysis outputs, first step to log movement

* Update java repair tools to have better dependency handling

* Ensure that the ignore field must be a boolean

* update output analysis for joernsbfl

* update output analysis for joernsbfl

* Hide some warnings

* update driver for joernsbfl

* Track paths

* Fix a missing override

* change timing of emit messages

* Add jazzer harnesses for the java examples

* Add example workflow for java with fuzzing

* Update vul4j reference

* Support file level localization

* Add yaml support

* Update benchmarks

* Localization is a list

* Ensure that identifiers are unique

* Create tool directory

* Fix GZoltar path

* add workflow to switch the base image of the experiment image to start from subject image

* update path to binaries

* Add prompter driver

* Add bugsinpy

* Update java tool drivers

* First steps toward locally running tools

* Update some properties needed for local runs

* Bash does not like double semi-colons

* Locally running should append

* Add tee to the abstract benchmark run

* Update TBar (#189)

* read api keys from config files

* use one config file for all api keys

* rename

* keep types consistent

* Update for local execution

* Update some utilities

* read openai token from config

* add option to execute command as sudo

* mkdir command should be run with sudo if default user is not root

* create output in specific dir and copy later using save_artifacts

* add context window, default to 10

* Update drivers & vul4j (#190)

* Update leap year to work remotely

* use https for bugsinpy

Signed-off-by: Hzxin <[email protected]>

* Integrate use subject as base

* Fix dir_base_expr on the abstract tool

* Move around some of the command for the proper local run of the basic workflow

* Update text of clear script

* Override artifacts directory and add more logging

* Small update

* Add an extra field

* update prompter driver

* update prompter driver

* update prompter driver

* Composites are always local

* Check that the list is not empty

* Fix some spacing

* update prompter driver

* update placeholder for gemini

* default empty array

* remove double quote

* Fix naming error

* update prompter argument to pass language

* add reset command to valkyrie: default to git reset

* get finer details of the validation results

* emit more validation info

* emit more validation info

* copy output to expected dir

* skip copy test if dir not exist

* add new stats counting number of reports generated

* Add bugjs benchmark (#192)

* Add bugsphp benchmark (#194)

* Add pyter support

* update

* Update .gitsubmodules

* Update .gitsubmodules

* Add bugsphp

* Undo changes

* Undo change

* Add pyter driver & update bugsinpy driver (#191)

* Add pyter support

* Update benchmark & pyter tool

* time stats should be generic to any tool, removing repair specific attributes

* add tool config to individually configure rebuild

* keep benign inputs for repair tools

* increase max limit

* Update llmr driver

* update prompter driver

* change to instrument

* skip setting source_file

* fix bug: copying core dump with root privileges cause errors, ignore this file

* fix locs should be unique

* update analysis

* add san parser

* update e9sbfl driver

* update e9sbfl driver

* handle exception when copying fails

* update prompter driver to generate config

* remove filtering, give all locations

* update autobug driver

* remove irrelevant code

* create new auxiliary task type; iterative repair

* create meta-data to continue flow

* fix meta-data creation

* change localization path relative to source path

* append stack trace to fix locations

* limit to top-5 stack locations

* write meta-data

* add support for azure openai api

* use azure in prompter

* parse api keys

* randomize seed for fuzzing in crashrepair

* Transfer patches

* Update benchmarks

---------

Signed-off-by: Hzxin <[email protected]>
Signed-off-by: Martin Mirchev <[email protected]>
Co-authored-by: crhf <[email protected]>
Co-authored-by: Ridwan Shariffdeen <[email protected]>
Co-authored-by: Hzxin <[email protected]>
Co-authored-by: hzxin <[email protected]>

---------

Signed-off-by: Hzxin <[email protected]>
Signed-off-by: Martin Mirchev <[email protected]>
Co-authored-by: Ridwan Shariffdeen <[email protected]>
Co-authored-by: Martin Mirchev <[email protected]>
Co-authored-by: Jiawei Wang <[email protected]>
Co-authored-by: Zhang Yuntong <[email protected]>
Co-authored-by: Nan Jiang <[email protected]>
Co-authored-by: crhf <[email protected]>
Co-authored-by: Hzxin <[email protected]>
Co-authored-by: hzxin <[email protected]>
5 participants