Composite Workflows, New Metadata format, Refactoring #173

Marti2203 · 2024-03-18T05:42:56Z

This PR contains a large number of changes:

Introduce the concept of a composite workflow, which allows multiple tools to be run in succession.
Modify the metadata format to carry better localization information and to differentiate test types.
Add a large number of tools to the framework.
Add an example benchmark, containing both C and Java subjects, that showcases the composite workflow.
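
To illustrate the idea of running tools in succession over shared metadata, here is a minimal sketch. It is not the Cerberus implementation: `run_composite`, the `localize` and `repair` steps, and every metadata field name below are hypothetical and chosen only for illustration. The sketch assumes each step receives its own deep copy of the bug info, that localization is stored as a list of entries with a score, and that failing and passing tests are kept in separate fields.

```python
from copy import deepcopy
from typing import Any, Callable, Dict, List

# Hypothetical type aliases; the real framework may model these differently.
BugInfo = Dict[str, Any]
Step = Callable[[BugInfo], BugInfo]


def run_composite(bug_info: BugInfo, steps: List[Step]) -> BugInfo:
    """Run each step in order, handing it a private copy of the previous result."""
    current = deepcopy(bug_info)
    for step in steps:
        # A deep copy per step keeps one tool from mutating another tool's input.
        current = step(deepcopy(current))
    return current


def localize(info: BugInfo) -> BugInfo:
    # Hypothetical localization step: records suspicious locations with a score.
    info["localization"] = [
        {"source_file": "src/div.c", "line_numbers": [12], "score": 0.9},
    ]
    return info


def repair(info: BugInfo) -> BugInfo:
    # Hypothetical repair step: consumes the localization written by the previous step.
    targets = [loc["source_file"] for loc in info.get("localization", [])]
    info["patches"] = [f"patch for {target}" for target in targets]
    return info


# Hypothetical subject metadata with failing and passing tests kept apart.
subject: BugInfo = {
    "id": 1,
    "failing_test_identifiers": ["t1"],
    "passing_test_identifiers": ["t2", "t3"],
}

result = run_composite(subject, [localize, repair])
print(result["patches"])
```

Because every step works on a copy, the original `subject` dictionary is left untouched, and a later step only sees what earlier steps chose to write into the metadata.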

Signed-off-by: Martin Mirchev <[email protected]>

* allow image hash to be updated using config file * Update Infer driver * add ZAP driver * run config command if present * add instrument script * Format driver * set prebuilt image for aprcomp benchmarks * add slicing tool * add slicing dir * change path from command issued * add crashrepair localize tool * Add the slice type to the typing module * move examples benchmark to multi * update driver * update benchmark example * update localization stats * update crashrepair localizer driver * Update vulnloc reference * Resolved hard-coding issue in vulnloc benchmark * Update reference * Update EffFix,SAVER,FootPatch driver for the benchmark change * Fix dict access in abstract benchmark * Fix typo in EffFix driver * Update efffix benchmark * Fix more dict access in abstract tool * Fix several path issues in EffFix driver * add aixcc prompt (#175) * AIxCC framework using the fine-tuned DeepSeek model (#176) * add aixcc prompt * add sft deepseek model * Remove requirements for having tests in benchmark * Fix count_neg and count_pos * Remove efffix bench temporarily * Re-add efffix benchmark * Composite Workflows, New Metadata format, Refactoring (#173) * Dict is unhashable * Another typing fix and path * Fix some small things * Remove return statement * add instrumentation pass for AbstractLocalizeTool * Update Gzoltar output * Add identifier creation from GZoltar and fix a future bug * make ArjaE use localization metadata * fix ArjaE localizaition data; use shorter testing timeout * fix ArjaE localizaition data; use shorter testing timeout * add missing import in ArjaE driver * make TBar accept localization data * modify TBar driver * Fix a format error * Refactor Cerberus - remove duplicated code and add typings * Arguments not passed correctly in execute_wrapped * Fix some small bugs * Fix the TaskStatus map thing * Cleanup + timing fix * More small fixes * Small optimisation and a note * Small optimisation * Add debug flag to script * Fix a bug with the list_dir 
logic * Removing extra appended tag to reduce amount of images constructed * Ensure proper printing of the traceback * Remove mention of dev to make things more intuitive * update example benchmark * add test script to meta-data * add clean script * add api to clean subject in abstracttooldriver, use it in AFLdriver before instrumentation, rename example benchmark to crashing_tests * Add initial functionality for container cleanup, move around custom setup folder * update example * update localization for new config changes * Fix Mocktool * Ensure all repair tools generate a metadata-json file when done * update example * add more starting points for basic workflow * add workflow for examples benchmark * update output analysis * fix bug: update lines * use toolname to sepearate patches * Remove dot files * Fix a merge_dict bug * update output analysis for valkyrie * update example test oracle and remove duplicate test indexes * Fix time stats display bug * Fix a todo in TBar * update tool status check * prepare localization in expected format for workflow * fix bug in concatenation of multiple fix locs * change user to root in order to add groups * enable reference to stack trace key * use instrumentation for crashing programs for coverage * rename function name to division and update stack trace * fix indentation error * use restrict to lines in darjeeling for localization instead of files * Fix table sorting * update fuzzing stats to reflect generated passing/crashign tests * Ensure tool instance is fresh per step * Ensure against binding risks * remove passing test case from repair tools for basic workflow * set sudo information for Darjeeling * remove explicit setting user to root * Track timeouts as their own status * add java example * update java example * update java example * update dockerfile * add javalang * Fix a bug in the TBar driver * Update some of the Java drivers * Another small update for the java tools * Add output of the command objects * 
Defend against mistakenly double deleting and ensure that bug info is always a deepcopy * Add tool tag to the bug info * add method to load ast * update c crash example to include more program paths * add new workflow and rename identifiers * update setup script * place src dir in correct structure * fix bug in file name * fix bug in file name * update output analysis * add score for localization in java * allow fix localization to have additional properties * temporary fix: add location field in meta-data to support java repair tools * fix typo * add new java example * add new examples and remove old examples; organize workflows * add python subject to meta-data * fix typo * update output analysis for FauxPy * add python workflow and update java workflow to use a different example * separate meta-data generation * separate meta-data generation * darjeeling requires files in relative path * allow EvoRepair to take fix locations * update evorepair driver * fix source location * add drivers for llm based tools * add invoke_advance to create meta-data by default for LLM repair tools * add java llm workflow * support java and c differently * increase patch limit from 5 to 50 * Add score field to metadata * Allow CLI to override config file * Move the location hack to function * update vul4j * First iteration of tracking status codes and test count * Fix some overrides and add extra * Clarify where output comes from * Fix comment error * Add sanity check and test counting * Simple aggregation of errors for BasicWorkflow * Update submodules * First iteration of composite sequence subsequence * Revert the do_step generalization and fix code in the config * Add better documentation * fix bug; should take the max * improve output analysis * Collect summaries for subtools * Don't crash if no analysis output * Excluse trace construction on the composite tool * Replace execute command with makedirs * update localization driver to use dynamic instrumentation * Fix jazzer output 
* update driver with dynamic instrumentation * Ignore first line of tests.csv in gzoltar * Allow to control amount tests being passed around * Remove input left by debugging * Utilize the old bug info to allow for aggregation of analysis outputs, first step to log movement * Update java repair tools to have better dependency handling * Ensure that the ignore field must be a boolean * update output analysis for joernsbfl * update output analysis for joernsbfl * Hide some warnings * update driver for joernsbfl * Track paths * Fix a missing override * change timing of emit messages * Add jazzer harnesses for the java examples * Add example workflow for java with fuzzing * Update vul4j reference * Support file level localization * Add yaml support * Update benchmarks * Localization is a list * Ensure that identifiers are unique * Create tool directory * Fix GZoltar path * add workflow to switch the base image of the experiment image to start from subject iamge * update path to binaries * Add prompter driver * Add bugsinpy * Update java tool drivers * First steps toward locally running tools * Update some properties needed for local runs * Bash does not like double semi-colons * Locally running should append * Add tee to the abstract benchmark run * Update TBar (#189) * read api keys from config files * use one config file for all api keys * rename * keep types consistent * Update for local execution * Update some utilities * read openai token from config * add option to execute command as sudo * mkdir command should be run with sudo if default user is not root * create output in specific dir and copy later using save_artifacts * add context window, default to 10 * Update drivers & vul4j (#190) * Update leap year to work remotely * use https for bugsinpy Signed-off-by: Hzxin <[email protected]> * Integrate use subject as base * Fix dir_base_expr on the abstract tool * Move around some of the command for the proper local run of the basic workflow * Update text of clear 
script * Override artifacts directory and add more logging * Small update * Add an extra field * update prompter driver * update prompter driver * update prompter driver * Composites are always local * Check that the list is not empty * Fix some spacing * update prompter driver * update placeholder for gemini * default empty array * remove double quote * Fix naming error * update prompter argument to pass langauge * add reset command to valkyrie: default to git reset * get finer details of the validation results * emit more valdiation info * emit more valdiation info * copy output to expected dir * skip copy test if dir not exist * add new stats counting number of reports generated * Add bugjs benchmark (#192) * Add bugsphp benchmark (#194) * Add pyter support * update * Update .gitsubmodules * Update .gitsubmodules * Add bugsphp * Undo changes * Undo change * Add pyter driver & update bugsinpy driver (#191) * Add pyter support * Update benchmark & pyter tool * time stats should be generic to any tool, removing repair specific attributes * add tool config to individually configure rebuild * keep benign inputs for repair tools * increase max limit * Update llmr driver * update prompter driver * change to instrument * skip setting source_file * fix bug: copying core dump with root privileges cause errors, ignore this file * fix locs should be unique * update anlaysis * add san parser * update e9sbfl driver * update e9sbfl driver * handle exception when copying fails * update prompter driver to generate config * remove filtering, give all locations * update autobug driver * remove irrelevant code * create new auxiliary task type; iterative repair * create meta-data to continue flow * fix meta-data creation * change localization path relative to source path * append stack trace to fix locations * limit to top-5 stack locations * write meta-data * add support for azure openai api * use azure in prompter * parse api keys * randomize seed for fuzzing in crashrepair * 
Transfer patches * Update benchmarks --------- Signed-off-by: Hzxin <[email protected]> Signed-off-by: Martin Mirchev <[email protected]> Co-authored-by: crhf <[email protected]> Co-authored-by: Ridwan Shariffdeen <[email protected]> Co-authored-by: Hzxin <[email protected]> Co-authored-by: hzxin <[email protected]> --------- Signed-off-by: Hzxin <[email protected]> Signed-off-by: Martin Mirchev <[email protected]> Co-authored-by: Ridwan Shariffdeen <[email protected]> Co-authored-by: Martin Mirchev <[email protected]> Co-authored-by: Jiawei Wang <[email protected]> Co-authored-by: Zhang Yuntong <[email protected]> Co-authored-by: Nan Jiang <[email protected]> Co-authored-by: crhf <[email protected]> Co-authored-by: Hzxin <[email protected]> Co-authored-by: hzxin <[email protected]>

Marti2203 and others added 30 commits March 13, 2024 18:42

Dict is unhashable

4c29e65

Another typing fix and path

b2fa7f0

Fix some small things

a623637

Remove return statement

77df03d

add instrumentation pass for AbstractLocalizeTool

eb511e8

Update Gzoltar output

431b73d

Add identifier creation from GZoltar and fix a future bug

f2df211

make ArjaE use localization metadata

cd57ec1

fix ArjaE localizaition data; use shorter testing timeout

0f71f1f

fix ArjaE localizaition data; use shorter testing timeout

197d114

add missing import in ArjaE driver

c5e6133

make TBar accept localization data

09781db

modify TBar driver

3f68efb

Fix a format error

2bc7b2a

Merge branch 'composite-workflows' of github.com:nus-apr/cerberus into composite-workflows

4bbddce

Merge commit '6971ded21b79d45c77ef1403f4badd9c5ff0ae73' into composite-workflows

d05e625

Refactor Cerberus - remove duplicated code and add typings

061cdf2

Arguments not passed correctly in execute_wrapped

c594916

Fix some small bugs

4a76ed4

Fix the TaskStatus map thing

a9b6c3e

Cleanup + timing fix

14105d7

More small fixes

be36ad2

Small optimisation and a note

86f9e2a

Small optimisation

e51943a

Add debug flag to script

725d1a2

Fix a bug with the list_dir logic

eb21e8c

Removing extra appended tag to reduce amount of images constructed

9c466e2

Ensure proper printing of the traceback

da71fb2

Remove mention of dev to make things more intuitive

b0f5719

update example benchmark

268277a

rshariffdeen and others added 28 commits June 12, 2024 21:45

update prompter driver

a37c197

change to instrument

a60ebc5

skip setting source_file

e73cc52

fix bug: copying core dump with root privileges cause errors, ignore this file

7c88f0b

fix locs should be unique

c5f1734

update anlaysis

33fbe7a

add san parser

dd511c8

update e9sbfl driver

701053c

update e9sbfl driver

b29fdfb

handle exception when copying fails

8bab015

update prompter driver to generate config

ac18281

remove filtering, give all locations

50ba469

update autobug driver

cbc6b94

remove irrelevant code

f8044a2

create new auxiliary task type; iterative repair

c6fcf6d

create meta-data to continue flow

b98e8fb

fix meta-data creation

8007b42

change localization path relative to source path

c3e66b7

append stack trace to fix locations

297eea7

limit to top-5 stack locations

83426e8

write meta-data

b100735

add support for azure openai api

8b55740

use azure in prompter

74d949c

parse api keys

864a942

randomize seed for fuzzing in crashrepair

490c354

Transfer patches

aee8520

Update benchmarks

2a1c366

Merge branch 'dev' into composite-workflows

d1923ba

Signed-off-by: Martin Mirchev <[email protected]>

Marti2203 merged commit 8422a90 into dev Sep 21, 2024
