Fix existing xgboost examples #2830
add questions
6ba432f to 9f8101e
LGTM
LGTM
/build
* support server side custom scripts (#2695) * update notebooks due to the simulator changes (#2696) * update notebooks due to the simulator changes the output now is located at ```<workspace>/server/simulated_job``` instead ```<workspace>/simulated_job``` * update notebooks due to the simulator changes the output now is located at ```<workspace>/server/simulated_job``` instead ```<workspace>/simulated_job``` * Fix DAM Unit Test (#2692) * Updated FOBS readme to add DatumManager, added agrpcs as secure scheme * Fixed dam_test.c error when no xgboost is installed * Fixed a format issue --------- Co-authored-by: Chester Chen <[email protected]> * Update version number MONAI and the bundle version (#2702) Co-authored-by: Chester Chen <[email protected]> * Add Hierarchical Stats example (#2694) * Update Hello Client Controlled Workflow(CCWF) README.md (#2709) The folder path in the command is incorrect. * Update stats READMEs (#2711) This changes adds federated hierarchical stats example link in `examples/advanced/README.md` and changes images size in `hierarchical_stats/README.md` as the images were appearing smaller in the web browser. Co-authored-by: Chester Chen <[email protected]> * Fix torch ddp (#2706) Co-authored-by: Chester Chen <[email protected]> * Cherry pick RM fix from #2667 (#2700) * Update ClientAlgo (#2566) (#2705) Co-authored-by: Chester Chen <[email protected]> * Fix ClientAPILauncherExecutor import path to remove torch dependency. (#2713) * Fix ClientAPILauncherExecutor import path to remove torch dependency. * Update Hello Client Controlled Workflow(CCWF) README.md (#2709) The folder path in the command is incorrect. * Update stats READMEs (#2711) This changes adds federated hierarchical stats example link in `examples/advanced/README.md` and changes images size in `hierarchical_stats/README.md` as the images were appearing smaller in the web browser. 
Co-authored-by: Chester Chen <[email protected]> * Fix torch ddp (#2706) Co-authored-by: Chester Chen <[email protected]> --------- Co-authored-by: tonywjs <[email protected]> Co-authored-by: Arun Patole <[email protected]> Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * set executor task (#2627) * Use ReliableMessage from 2.4 (#2717) * Enhance CLI command config (#2716) * Add CrossSiteEval with ModelController (#2699) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Enhance job auth setup script (#2715) * Merging XGBoost changes from 2.4 (#2712) * Updated FOBS readme to add DatumManager, added agrpcs as secure scheme * Merged XGB changes made in 2.4 to main * Fixed a format error * Undid change to histogram_based/executor.py * Addressed comments in PR --------- Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * fix race condition handling (#2728) * Remove serialization of pfx (#2721) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * update readme link to website (#2734) * fix bcast manager min responses=0 (#2733) Co-authored-by: Chester Chen <[email protected]> * Fix cryptography encrypt error (#2732) * keep the local resources for simulator (#2730) * keep the local resources for simulator. * fixed the local folder deploy. 
--------- Co-authored-by: Chester Chen <[email protected]> * Support same app for all sites in Job API (#2714) * support same app to all * add to_server() and to_clients() routines * comment out export * improve input errors handling * check for missing server components * address comments --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Fix overseer test timing (#2743) * Add ModelController documentation (#2707) * add ModelController docs * address comments * address comments 2 * fix code block --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * [2.5] TIE (Technology for Integrating Everything) and Flower Inegration (#2523) * added TIE * add license text * fix fstr * support cli applet * add tli applet * develop flower integration * added license text * generate cli cmd by applet * integrate with flower * fix format * fix fl ctx * fix get_command * run hello-flwr-pt job (#7) * run hello-flwr-pt job * remove print outs * abort grpc gracefully * fix example * graceful shutdown of flower * fix msg release * fix formatting * fix formatting * fix formatting * check applet stop * update flwr server commands (#8) * test superlink ready before starting server app * improve log file handling * remove unused import * fixed _superlink_process var bug * change namespace for flower proto; log flower msgs to file and console * add license text * consolidate process mgr * improve docstrings * address pr review issues * address additional pr comments * changed to use flwr proto directly * use PyApplet for running py code * added PyApplet * support server app args; address pr issues * move ccreate_channel to grpc_utils * fix flower output formatting * reformat --------- Co-authored-by: Holger Roth <[email protected]> Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Add MetricsSender docstring (#2745) * Update MONAI example README (#2724) * fix clone to keep original (#2755) * Bump up 
the version of monai-nvflare package to 0.2.8 (#2749) Also update its nvflare version to ~=2.5.0rc1, monai to >=1.3.1 * Update getting_started.rst (#2737) * Update getting_started.rst * No need to mkdir With mkdir, the copied folder has structure simulator-example/hello-pt/jobs, while without mkdir, the copied folder has structure simulator-example/jobs * Update getting_started.rst * Add hello-pt to the folder structure --------- Co-authored-by: Sean Yang <[email protected]> * Add CIFAR 10 examples for Tensorflow-based FedAvg & FedOpt (#2704) * add alpha splitting * run experiments * add tensorboard writers; increase model size * fedopt version * add fedprox loss and callback * Update ModerateTFNet to match CIFAR10 torch implementation. * Fix multiprocessing GPU init error. Handle no alpha split case. * Add preprocessing to match torch CIFAR10 result. * Unify executor script for different algos. * Remove unused codes. * Add preprocessing steps to make TF results on par with torch examples. * Fix script executor args. * Add script to run all experiments. * Add README. * Fix graphs in README. * Modify TF FedOpt controller. * Update README and FedOpt result. * Remove duplicated flare init. * Fix result graph for centralized vs FedAvg. * Fix README re. alpha value for centralized training. * Improve README. * Add workspace arg. Change min_clients to num_clients. * Add warning on TF GPU vRAM allocation. * Clean up TB summary logs. * Remove FedProx which will be implemented in another PR. * Update notebook & README, re-add missing file. * Update license header. * Re-include missing script. * Remove change in torch example script. * Fix flake8, black and isort format issues. 
--------- Co-authored-by: Holger Roth <[email protected]> Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Update setup_poc.ipynb (#2752) Add job templates arg to avoid "Unable to handle command: config due to: job_templates_dir='None', it is not a directory" error Use full name Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Added id to the jobAPI swarm_script_executor_cifar10 component deploy (#2678) * Added id to the swarm_script_executor_cifar10 component deploy. * codestyle fix. * Changed to use job.as_id(). * codestyle fix. * changed to use job.as_id(shareable_generator) for shareable_generator_id. * removed the un-necessary job.to() calls. --------- Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Sean Yang <[email protected]> * XGBoost plugin with new API (#2725) * Updated FOBS readme to add DatumManager, added agrpcs as secure scheme * Implemented LocalPlugin * Refactoring plugin * Fixed formats * Fixed horizontal secure isses with mismatching algather-v sizes * Added padding to the buffer so it's big enough for histograms * Format fix * Changed log level for tenseal exceptions * Fixed a typo * Added debug statements * Fixed LocalPlugin horizontal bug * Added #include <chrono> * Added docstring to BasePlugin --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Moved the simulator server logger init earlier. (#2753) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * [2.4] Secure XGBoost Documentation (#2671) (#2759) * add 2.4.2 documentation * update plugin configuration section * address comments * address comments 2 * change default plugin to cuda_paillier --------- Co-authored-by: Chester Chen <[email protected]> * Getting started readmes (#2757) * add readmes * add note --------- Co-authored-by: Chester Chen <[email protected]> * fixed the CrossSiteEvalClientController in swarm_script_executor_cifar10 example. 
(#2762) * Cherry pick fixes from 2.4 (#2768) * Cherry pick launcher log fix (#2766) * Add flush=True to print in subprocess Output from print is usually buffered and may not appear on PIPE soon enough. * Replace logfile with the logging facility --------- Co-authored-by: Isaac Yang <[email protected]> * Update xgboost user guide (#2750) * Update xgboost user guide * add xgboost version --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Honor optional flag at streaming level (#2771) * Updated FOBS readme to add DatumManager, added agrpcs as secure scheme * Added optional flag support in streaming layer * Removed the app_opt scan. (#2758) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Add job API to support additional external dir in the custom dir (#2748) * Add job API to support additional external dir in the custom dir. * changed the behavior to copy external dir contents to job custom folder flat. --------- Co-authored-by: Sean Yang <[email protected]> * Moved the hello-pt example initialization to START_RUN, and store the downloaded dataset to each individual site. (#2735) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Fix config file name in doc (#2772) * Fix loading cli history in admin console (#2777) * Port 2.4 xgb changes (#2773) * Port 2.4 xgb changes * [2.4] Add xgboost metrics tracking cb (#2381) * add back changes * Fix unit test * implementation of Scaffold and FedProx for TensorFlow (#2727) * add alpha splitting * run experiments * add tensorboard writers; increase model size * fedopt version * add fedprox loss and callback * Update ModerateTFNet to match CIFAR10 torch implementation. * Fix multiprocessing GPU init error. Handle no alpha split case. * Add preprocessing to match torch CIFAR10 result. * Unify executor script for different algos. * Remove unused codes. * Add preprocessing steps to make TF results on par with torch examples. * Fix script executor args. * Add script to run all experiments. 
* Add README. * Fix graphs in README. * Modify TF FedOpt controller. * Update README and FedOpt result. * change the code in the NVFlare/examples/getting_started/tf for fedprox * change the code in the NVFlare/examples/getting_started/tf for fedprox * add the condition if fedprox_mu <0 NVFlare/examples/getting_started/tf for fedprox * Added Scaffold as an algorithm option * Providing a dedicated script for the usage of scaffold due to necessary code adjustments when using it * Helper for the usage of Scaffold as an algorithm * Main workflow for using scaffold * Changed path to scaffold workflow * Delete nvflare/app_opt/tf/scaffold_workflow.py * Update scaffold.py * Added clipnorm to the optimizer to handle empty tensors after long training of several epochs. * Update scaffold.py * Update cifar10_tf_fl_alpha_split_scaffold.py * Update scaffold.py * Create scaffold1.py * Update scaffold1.py * Update scaffold1.py * Update scaffold.py * Update scaffold1.py * Delete nvflare/app_opt/tf/scaffold1.py Not needed * Add files via upload * Added SCAFFOLD to the readme. * Update scaffold.py remove clipnorm from scaffold.py * Update cifar10_tf_fl_alpha_split_scaffold.py add clip_norm as an args to main function * Developer Certificate of Origin Version 1.1 Copyright (C) 2004, 2006 The Linux Foundation and its contributors. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. 
Developer's Certificate of Origin 1.1 By making a contribution to this project, I certify that: (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it. (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved. 
* Delete examples/getting_started/tf/figs/fedavg-diff-algos-new.png * Add files via upload * Delete examples/getting_started/tf/figs/fedavg-diff-algos-new.png * Add files via upload * changed accuracy values and plot * Update tf_fl_script_executor_cifar10.py Change the min_clients to num_clients based on the new changes * update doc * added support for TF 2.17 * Update scaffold.py to get just trainable layer names * delete the docs * add running scaffold job * Fix for models with non-trainable variables * Fixed the path of utils.py caused due to a typo * prepared scaffold for TF2.17 * prepared scaffold for TF2.17 * style changes due to failes tests * update the style * Update scaffold.py * update the style * update run_job.sh for scaffold and fedprox * update the style * revert the Style fix of unrelated files * revert the style changes * remove apidoc * remove apidoc --------- Co-authored-by: Holger Roth <[email protected]> Co-authored-by: Zhijin Li <[email protected]> Co-authored-by: falibabaei <[email protected]> Co-authored-by: LeoDuda <[email protected]> Co-authored-by: [email protected] <[email protected]> Co-authored-by: Chester Chen <[email protected]> Co-authored-by: LeoDuda <[email protected]> Co-authored-by: khadijeh.alibabaei <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * app_opt scan changes. 
(#2781) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Added error handling for XGB_CONFIGURED event (#2780) * Updated FOBS readme to add DatumManager, added agrpcs as secure scheme * Added error handling in XGB_CONFIGURED event handler * Fixed formatting errors * Removed some redundant log entries * Addressed PR comments --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * fix for if torch and tensorflow are both installed (#2775) * Add FedJobAPI documentation (#2718) * add JobAPI docs * 2.5 misc doc updates * address comments --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> Co-authored-by: Chester Chen <[email protected]> * fixed the cross validation wrong config for swarm_script_executor_cifar10. (#2778) * Fixed the mgpu simulator workspace change error (#2770) * Fixed the mgpu simulator workspace change error. * codestyle fix. * Changed back the workspace.get_client_custom_dir(), fixed the sub_worker_process app_custom_folder. * Add the app_custom_folder in a proper way. * Update Secure XGBoost example w.r.t. 
XGBoost's code changes (#2686) * Initial commit for xgboost-secure * Initial commit for xgboost-secure * Change model output path * Change data mode * Most basic xgboost process for coding * Most basic xgboost process for coding * Most basic xgboost process for coding * Most basic xgboost process for coding * First prototype for secure vertical pipeline * Phase 1 concludes * add seal pipeline in C++ * experiment will more tree depth to ensure correct node behavior * experiment will more tree depth to ensure correct node behavior * update secureboost eval bench * set header to none for sample alignment * config processor interface from python * simplify data preparation, add horizontal testing codes * remove redundants * horizontal exps * update scripts * update test scripts * add feature tests * update to align all outputs' format * remove conflict * reorganize * format * add flare jobs * add readme and experiment results * update secure xgboost example to align with new xgboost branch * update secure xgboost example to align with new xgboost branch * add gpu scripts * modify split for gpu exp * modify split for gpu exp * refine readme with Yuanting's inputs * update gpu scripts * update gpu scripts * update gpu script * data preparation minor update * consolidate all testing scripts * update readme and standalone scripts * format update * format update * minor refinements * Improve the kill children processes (#2789) * use process.kill() to kill the children processes. * removed the sig argument. * removed no use import. 
* Add back metric callback and fix examples based on new xgboost version (#2787) * add docstring and cmd_data check (#2782) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Add docstring to reliable message (#2788) * Pre-trained Model and training_mode changes (#2793) * Updated FOBS readme to add DatumManager, added agrpcs as secure scheme * Added support for pre-trained model * Changed training_mode to split_mode + secure_training * split_mode => data_split_mode * Format error * Fixed a format error * Addressed PR comments * Fixed format * Changed all xgboost controller/executor to use new XGBoost --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Update xgboost example and ci (#2794) * [2.5] Update flower CLI (#2792) * update flower cli * update flwr hello-world job (#9) * update flwr hello-world job * add license header * update readme --------- Co-authored-by: Holger Roth <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * replace comet with tensorboard (#2798) * more app_opt scan example changes. 
(#2797) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Add first version of release notes (#2800) * add first version of release notes * revise release notes * FIX hard-coded sp_end_point in POC (#2795) * Add hello examples with new APIs (#2785) * add hello examples with new APIs * move and reorganize hello-examples to keep old ones in CI * remove prepare data for hello tf * update wording * update dates * update wording * update wording * remove note * add information about dataset --------- Co-authored-by: Chester Chen <[email protected]> * Update autofedrl example (#2801) * update autofedrl example to make it run correctly * remove redundant import * Refactor XGBDataLoader (#2804) * Fix docstring typo (#2802) Co-authored-by: Chester Chen <[email protected]> * re-arrange getting started examples (#2805) * re-arrange getting started examples * re-arrange getting started examples * fix README.md --------- Co-authored-by: Sean Yang <[email protected]> * Update secure xgboost examples (#2803) * Update secure xgboost examples * Update readme --------- Co-authored-by: Chester Chen <[email protected]> * XGBoost user interface change and XGBoost version check (#2808) * Updated FOBS readme to add DatumManager, added agrpcs as secure scheme * Changed split_mode to data_split_mode and added version check * Fixed format errors * Added lock in ReliableMessage (#2811) * Fixed 2 PTFileModelLocator config errors. 
(#2807) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Update xgboost example (#2813) * Update xgboost example * Add feedback --------- Co-authored-by: Chester Chen <[email protected]> * Refactor Job API (#2799) * refactor fed job api * improve docstrings * refactor fed job api * improve docstrings * polish changes * fix bugs; check args * fix getattr * refactor * fixes and cleanup * added ccwf and flower jobs * added optional model selector for fed-avg * update hello-world examples * formatting and updates * address feedback * remove unnecessary stuff * add license text * detect duplicate executor --------- Co-authored-by: Yan Cheng <[email protected]> * Add CUDA plugin code (#2814) * Add CUDA plugin code * Remove test file * Use CGBN submodule and moved shared codes out --------- Co-authored-by: Chester Chen <[email protected]> * Fix jenkins CI (#2812) Co-authored-by: Chester Chen <[email protected]> * Remove the module class scan (#2790) * remove the module classes scan, only add limit number of classes to use name search. * rename variables. * Added popular PTFileModelPersistor and PTFileModelLocator in the class_tables. --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Change all name to path (#2817) Co-authored-by: Sean Yang <[email protected]> * Add back hello-numpy-sag and update references (#2816) * add back hello-numpy-sag and update references * reformat notebook * Revert "Remove the module class scan (#2790)" (#2819) This reverts commit b96dc326c46305a076d5acd59a3f10fa0faf408c. 
* Fix config typos (#2818) * Relaxed grpcio/protobuf versions (#2822) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * ScriptExecutor improvements (#2820) * script executor improvements * move ScriptExecutor to job_config * rename ScriptExecutor to ScriptRunner, add TF versions of in process and ex process executors * fix dead links --------- Co-authored-by: Chester Chen <[email protected]> * fix job api examples (#2823) * Support ScriptRunner in ccwf_job (#2825) * support ScriptRunner in ccwf_job * remove unused import * added object type check * ScriptRunner framework option in examples (#2827) * use framework option in examples * rename files * Use pre module scan to create classes table (#2824) * pre-scan the module to create classes table. * reformat. * changed the module pre-scan result to json file. * removed the no use import. * Improved create_classes_table_static() error message logging. * Add one entry in MANIFEST.in (#2826) Co-authored-by: Sean Yang <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * add nvflare day banner, auto hide highlights (#2829) * Fix existing xgboost examples (#2830) * Remove unused code and update README (#2828) * Fixed the config changes error. (#2834) * Minor fixes to xgboost example (#2832) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> Co-authored-by: Ziyue Xu <[email protected]> * fix notebook errors (#2835) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Update requirements versions (#2831) * update requirements versions * update requirements versions --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * add NPModelPersistor to hello-fedavg-numpy (#2837) * improve the class_utils to handle the duplicate class name case (#2833) * improve the class_utils to handle the duplicate class name case. * changed error messages. 
--------- Co-authored-by: Chester Chen <[email protected]> * Add migration guide (#2806) * add migration guide * update * update * update --------- Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * fix hello-pt, empty metrics (#2840) * Update ml-to-fl examples with new APIs (#2836) * update ml-to-fl examples with new apis * address comments * add export config option * rename to launch_process --------- Co-authored-by: Chester Chen <[email protected]> * Add example notebook for docker (#2767) * add example notebook for docker * remove unneeded file * update base image to NVIDIA PyTorch container * update --------- Co-authored-by: Chester Chen <[email protected]> * hello-pt-mlflow job api example (#2839) * WIP * hello-pt-mlflow job api example. * Extracted the BaseFedJob for FedAvgJob and SAGMLFlowJob. * refactoried. * reformat --------- Co-authored-by: Chester Chen <[email protected]> * Credit Card Fraud detection end-to-end with XGBoost (#2738) * wip end-to-end examples for enrich, process and xgboost restore changed file add readme.md update readme.md update readme.md update readme.md update readme.md update readme.md update code Update xgb notebook, readme, requirements.txt, as well the new version XGBoost, data loader style/import license headers restore XGBDataLoader 1. Anonymize the BIC code and bank names 2. update the changes to split_mode and secure_training_mode * address PR comments * fix the code due to the XGBoost and Job API changes * 1) clean up output 2) remove unused import --------- Co-authored-by: Ziyue Xu <[email protected]> * rolled back the job api custom_file copy destination change. (#2848) * remove basename script conversion in ScriptRunner (#2849) * Update site code blocks and links (#2847) * update site code blocks and links * rename executor to runner * fix dtype error (#2852) * Convert step-by-step stats examples to use new Job API (#2842) * 1. add getting start notebook 2. 
convert df_stats from job template to job API * 1) add StatsJob to simplify the user experience 2) update both higgs and cifar10 stats using the new StatsJob to streamline the notebooks * format/style * update based on comments * update based on comments * switch the id prefix to real id * format style * Convert tree-based Fed XGBoost with Job API (#2843) * Convert XGBoost to Job API * format style * update based on comments * clean up * rename the executor to runner * Convert Scikit-Learn examples (SVM, Kmeans, Linear) to use Job API (#2845) * convert scikit-learn (Linear, SVM and Kmeans) to use Job API * format style * remove duplicate file * update based on comments * rename executor to runner * fix typo * tweaks * Added Debug in ReliableMessage and Ignore XGB errors after shutdown (#2851) * Added debug_info in ReliableMessage and ignore error after XGB shutdown * Removed redundant code * Added return in _handle_error --------- Co-authored-by: Chester Chen <[email protected]> * Update arg name for MLflowReceiver (#2850) Co-authored-by: Chester Chen <[email protected]> * Update step by step examples to use Job API (#2841) * Update client api to use same task as CSE and update step-by-step CSE (#2844) * Update client api to use same task as CSE and update step-by-step CSE * Update based on latest changes * Update swarm script runner * use runner * Autofedrl fix for updated locator behavior (#2856) * update model locator behavior * remove unnecessary changes * Convert CCWF examples to use Job API (#2846) * convert cyclic_ccwf swarm sbs and hello_ccwf examples * update based on job api change * change to numpy * uncomment export job * fix ci --------- Co-authored-by: Chester Chen <[email protected]> * Fixed the SubprocessLauncher missing app_custom_folder in the PythonPath. 
(#2857) * Add FLModel parameter checks (#2859) * clarify default persistor_id (#2861) * Added check for duplicate RM request (#2858) * Added check for duplicate RM request * Addressed PR comment * Add support of just doing metrics streaming with client api (#2763) * Add support of just doing metrics streaming with client api * Address review comments * Add flower metrics streaming example (#2764) * Add flower metrics streaing example * Fix format * Use context and RecordSet * Undo stuff * Update to new style * Update hello-flwr-pt_tb_streaming * Remove debug msgs * Update readme * Use flower job * Add missing code * Make client api type an arg * add docs for Flower integration (#2862) * update simulator folder path (#2865) * Removed the len() call which causes training failures (#2863) * Fed Stats Notebooks and Read ME: fix fed stats output directory due to simulator output structure changes (#2864) * fix fed stats output directory due to simulator output structure changes * cleanup output * rollback xgboost version changes * Fixed the wrong dh_psi_task_handler path. 
(#2866) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * improve race condition handling (#2867) * improve race condition handling * changed to use warning for late reply --------- Co-authored-by: Zhihong Zhang <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * support passing custom env vars for flower client (#2870) * fix cross-validation path (#2869) * Update Job API docs after redesign (#2873) * update job api documentation * change path to object * Updated xgboost user guide (#2872) * Updated xgboost user guide * Pinged the xgboost releaas to a specific version --------- Co-authored-by: Chester Chen <[email protected]> * Add pipe docstring (#2868) Co-authored-by: Chester Chen <[email protected]> * Update flower examples (#2871) * Clean up getting started installation docs (#2874) * clean up getting started installation docs * fix links and clean up top of page * reorganize getting started to primarily be in examples getting_stared README and update quickstart to contain installation * update README and notebook * Make the Launcher extends FLComponent. (#2875) * fix docs (#2877) * Fix heartbeat timeout config (#2878) * fix heartbeat timeout config * use TaskExchanger variable * Added more handling for the source file import handling. (#2876) Co-authored-by: Chester Chen <[email protected]> * Update the generated component classes table (#2879) * Update the generated component classes table. * Added back the MLflowReceiver. --------- Co-authored-by: Chester Chen <[email protected]> * Fix for last index of module path (#2881) * Update the generated component classes table. * Added back the MLflowReceiver. * Change to match the last module_path from source_file. --------- Co-authored-by: Chester Chen <[email protected]> * Fix hierarchical stats documentation (#2882) This patch fixes few typos in the hierarchical stats documentation and fixes the prepare_data python script. 
* update the tb path in fedbn example (#2883) * fix path due to simulator output structure changes (#2885) * Add note on installing nvflare in requirements (#2884) * add note on installing nvflare in requirements * fix typo --------- Co-authored-by: Chester Chen <[email protected]> * fix sbs notebooks (#2887) * Re-factor hello-numpy-cse example (#2880) * Update CrossSiteEval (#2886) * Update CrossSiteEval * Update base class * Undo no-need change --------- Co-authored-by: Chester Chen <[email protected]> * Add printing of tb logdir (#2888) Co-authored-by: Chester Chen <[email protected]> * Update getting_started cifar notebook (#2889) * add TB streaming to notebook * remove unnecessary changes * Deprecate decorator pattern (#2891) * Deprecate client api decorator pattern * Update doc * Added instructions to run horizontal secure XGBoost in simulator (#2890) * Added simulator instructions for tenseal context * Fixed reference * Fixing target * Renamed provisioning target to xgb_provisioning --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Updated plugin build doc (#2892) * fix PSI and Vertical learning paths (#2893) * fix fed stats output directory due to simulator output structure changes * cleanup output * rollback xgboost version changes * fix path issues * restore some of old values * Fix ci test configs format issue (#2896) * remove bionemo from new (#2897) * update random forest and vertical xgb examples (#2895) * site, docs, example updates (#2894) * Update xgboost requirements (#2898) Co-authored-by: Ziyue Xu <[email protected]> * Update flare simulator tutorial (#2899) * use correct tf model weights filename (#2901) * Add log info for flower executor (#2900) * Add log info for flower executor * Fix format * Fix hello-pt-cse job (#2905) * Undo remove bionemo from new (#2902) * undo remove bionemo from new * update --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Add vertical xgboost gpu instructions (#2903) * Add 
vertical xgboost gpu instructions * Update xgb gpu --------- Co-authored-by: Ziyue Xu <[email protected]> * Fix bionemo examples (#2904) * run task fitting * update sys info * fix SCL data split, use 1 gpu for ESM2 fine-tuning * restore run scripts * Fixed README format (#2906) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * update xgboost doc (#2907) * Added debug info for memoryview error (#2908) * Added debug info for memoryview errors * Fixed formatting issues * Change job simulator run to use Popen (#2909) * Changed the job API simulator_run to use Popen. * reformat. * Remove the no use import. --------- Co-authored-by: Chester Chen <[email protected]> * fix hello_world tf result printing (#2910) * Fixed XGBoost Example README (#2913) * Changed split_mode to data_split_mode * Fixed a merging error * change to num_clients (#2914) * Fix data save path (#2917) * trim the whitespace of the clients and gpu from the job simulator_run (#2912) * trim the whitespace of the clients and gpu from the job simulator_run. * Added unit test. --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Add CSE with job api with client api (#2918) * change getting started examples to use BaseFedJob (#2919) * Added warning for mixed plugin use (#2920) * BugFix: Hierarchical Fed Stats, prepare data: replace os.rename() function (#2921) * replace os.rename() to shutil.move() to avoid os.error when destination and src are in different mnt or devices * remove redundant move * remove unused import * Note about Simulator in XGBoost Doc (#2911) * Added a note about simulator not supporting resources.json * Rephrased the sentence --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * add params_transfer_type to ScriptRunner (#2922) * Fix nemo examples (#2923) * fix prompt_learning * fix peft * Added the current-round info the fl_ctx for BaseModelController (#2916) * Added the current-round info the fl_ctx for BaseModelController. * reformat. 
* codestyle fix. * Moved the self.set_fl_context(data) call to broadcast_model(). * Changed broadcast_model() so that it must send an FLModel, not None. * Changed the BaseModelController broadcast_model data default value, and a warning message to debug. * refactored. * Updated docstring. --------- Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Fix ci path (#2927) * Fix path * Fix path * Remove invalid validator --------- Co-authored-by: Sean Yang <[email protected]> * Fix xgb standalone fed (#2924) Co-authored-by: Chester Chen <[email protected]> * Fixing the memoryview issues (#2926) * Added handling for buffer overrun * Added task_lock to read() and ignore duplicate chunks * Simplified the wait loop * Fixed a formatting error * Check EOS when appending data --------- Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Fixing memoryview error (#2929) * Fixed dup seq 0 bug * Formatting errors * update higgs data link (#2941) * update video links (#2937) * fix typo (#2939) * Add research examples to tutorial page (#2942) * add research examples to tutorial page * remove banner --------- Co-authored-by: Chester Chen <[email protected]> * Fix doc and docstring issues (#2931) * Fix doc and docstring issues * Address comments --------- Co-authored-by: Sean Yang <[email protected]> Co-authored-by: Chester Chen <[email protected]> * Add check for receive before send in client api (#2930) * Add flare series section, enhancements (#2948) * improvements, add series section * address comments * Bump tqdm from 4.66.1 to 4.66.3 in /research/condist-fl (#2557) Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.1 to 4.66.3. - [Release notes](https://github.com/tqdm/tqdm/releases) - [Commits](https://github.com/tqdm/tqdm/compare/v4.66.1...v4.66.3) --- updated-dependencies: - dependency-name: tqdm dependency-type: direct:production ...
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump micromatch from 4.0.5 to 4.0.8 in /web (#2838) Bumps [micromatch](https://github.com/micromatch/micromatch) from 4.0.5 to 4.0.8. - [Release notes](https://github.com/micromatch/micromatch/releases) - [Changelog](https://github.com/micromatch/micromatch/blob/4.0.8/CHANGELOG.md) - [Commits](https://github.com/micromatch/micromatch/compare/4.0.5...4.0.8) --- updated-dependencies: - dependency-name: micromatch dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump dset from 3.1.3 to 3.1.4 in /web (#2936) Bumps [dset](https://github.com/lukeed/dset) from 3.1.3 to 3.1.4. - [Release notes](https://github.com/lukeed/dset/releases) - [Commits](https://github.com/lukeed/dset/compare/v3.1.3...v3.1.4) --- updated-dependencies: - dependency-name: dset dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Add support to newer python version (#2951) * Upgrade formatter version to support higher versions of Python (#2957) * Upgrade formatter version to support higher versions of Python * Fix formatting issues * Fix unit tests * disable not working tests * add research links redirects (#2953) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Fix a bug in dashboard that server local resource file was not generated correctly (#2964) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * improved fobs register_folder to catch ValueError.
(#2958) * Add fedrag example with embedding training (#2915) * Add fedrag example with embedding training * fix link and format * fix link and format * fix link and format * keep rag folder structure, remove the retrieval placeholder * keep rag folder structure, remove the retrieval placeholder * remove template job preparation * remove template job preparation * update JobAPI script * update eval bash * update eval bash and result --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Bump rollup from 3.29.4 to 3.29.5 in /web (#2963) Bumps [rollup](https://github.com/rollup/rollup) from 3.29.4 to 3.29.5. - [Release notes](https://github.com/rollup/rollup/releases) - [Changelog](https://github.com/rollup/rollup/blob/master/CHANGELOG.md) - [Commits](https://github.com/rollup/rollup/compare/v3.29.4...v3.29.5) --- updated-dependencies: - dependency-name: rollup dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump path-to-regexp from 6.2.2 to 6.3.0 in /web (#2938) Bumps [path-to-regexp](https://github.com/pillarjs/path-to-regexp) from 6.2.2 to 6.3.0. - [Release notes](https://github.com/pillarjs/path-to-regexp/releases) - [Changelog](https://github.com/pillarjs/path-to-regexp/blob/master/History.md) - [Commits](https://github.com/pillarjs/path-to-regexp/compare/v6.2.2...v6.3.0) --- updated-dependencies: - dependency-name: path-to-regexp dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sean Yang <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Bump vite from 4.5.3 to 4.5.5 in /web (#2950) Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.5.3 to 4.5.5.
- [Release notes](https://github.com/vitejs/vite/releases) - [Changelog](https://github.com/vitejs/vite/blob/v4.5.5/packages/vite/CHANGELOG.md) - [Commits](https://github.com/vitejs/vite/commits/v4.5.5/packages/vite) --- updated-dependencies: - dependency-name: vite dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sean Yang <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Add the hello-pt-resnet example (#2954) * Add the hello-pt-resnet example. * Removed the no use SimpleNetwork. * codestyle fix for hello-pt-resnet example. * renamed the simple_network.py -> resnet_18.py. And the resnet18 link to ReadMe. * updated license year. * codestyle fix. * black codestyle fix. * codestyle fix. --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Update CONTRIBUTING.md (#2969) * update PSI to support python 3.11 (#2972) * update PSI requirements.txt to support openmind-psi==2.0.4 which support python 3.11 * add comments * add web versioning (#2974) Co-authored-by: Chester Chen <[email protected]> * [Main] Support object reuse (#2975) * support object reuse * fix formatting --------- Co-authored-by: Sean Yang <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * update openmind-psi to 2.0.5 for python 12 support (#2981) * Replace the distutils with shutil. 
(#2978) * Allow multiple workflows in CCWF (#2980) * support object reuse * fix formatting * allow multiple workflows in ccwf * allow multiple workflows in ccwf --------- Co-authored-by: Sean Yang <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * F3 Streaming Code Rewrite (#2960) * Ported F3 streaming rewrite code to main * Moved code reference class variables to RXTask class * Rollback changes to byte_streamer.py --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Pass components into script runner (#2983) * Fix tf model persistor and tf model (#2984) * add missing filter id arg in tf model persistor * Update TFModel * Address comment * Allow customization of BaseFedJob (#2985) * Add CommonComponentsJob * Fix format * Address comments * Fix issue * add umami analytics (#2987) * Update pt params converter (#2989) * update pt params converter * use exclude_vars * print warning * add return value * Bionemo demos (#2968) * updated bionemo demos to v1.8 * cleaned demos outputs for clarity * added licenses and fixed naming README * fixed license headers and readme hyperlink * black fixing code * isort and flake8 fixes * addressing PR changes * removed unrequired infer copy file * updated other runs files/configs and fixed path in downstream notebook * fixed fedavg max_epochs setting to 1, removed extra data in taps yamls, fixed column used for each site * changed fedavg* and local* yamls to have site specific data for tap. for sabdab fedavg changed to original ??? * also sabdab local changed dataset.train config to ???
* tap fix configurations * update nb, nvflare version * use strict false * use full data for central training of sabdab --------- Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Holger Roth <[email protected]> Co-authored-by: Holger Roth <[email protected]> * Add FLARE DAY page (#2992) * add flare day page * add slides * move link location * add web speaker (#2999) * Fix doc typo and VDR reported issues (#2994) * BioNeMo: use multi threading but reduce num workers (#2996) * use multi threading but reduce num workers * revert nbs * update links * Update integration test script and upgrade tenseal and psi version (#2995) * Update documentation for Dockerfile, add location of tbevents, fix link (#2993) * update documentation for Dockerfile, add location of tbevents, and fix link * add comment for Dockerfile to explain difference * fix the entry for getting started in the TOC (#3007) * Expose init in client lightning api (#3004) * enhance web responsive design for mobile (#3010) * Redid the branch to cleanup the commits (#2986) * Update flwr job object, client, server (#3008) Co-authored-by: Sean Yang <[email protected]> * Bump cookie, @astrojs/mdx and astro in /web (#3002) Bumps [cookie](https://github.com/jshttp/cookie) to 0.7.2 and updates ancestor dependencies [cookie](https://github.com/jshttp/cookie), [@astrojs/mdx](https://github.com/withastro/astro/tree/HEAD/packages/integrations/mdx) and [astro](https://github.com/withastro/astro/tree/HEAD/packages/astro). These dependencies need to be updated together.
Updates `cookie` from 0.5.0 to 0.7.2 - [Release notes](https://github.com/jshttp/cookie/releases) - [Commits](https://github.com/jshttp/cookie/compare/v0.5.0...v0.7.2) Updates `@astrojs/mdx` from 1.1.5 to 3.1.7 - [Release notes](https://github.com/withastro/astro/releases) - [Changelog](https://github.com/withastro/astro/blob/main/packages/integrations/mdx/CHANGELOG.md) - [Commits](https://github.com/withastro/astro/commits/@astrojs/mdx@3.1.7/packages/integrations/mdx) Updates `astro` from 3.6.5 to 4.15.12 - [Release notes](https://github.com/withastro/astro/releases) - [Changelog](https://github.com/withastro/astro/blob/main/packages/astro/CHANGELOG.md) - [Commits](https://github.com/withastro/astro/commits/astro@4.15.12/packages/astro) --- updated-dependencies: - dependency-name: cookie dependency-type: indirect - dependency-name: "@astrojs/mdx" dependency-type: direct:production - dependency-name: astro dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Add GNN encoder and xgb outputs for finance end-to-end example (#2970) * Readme notebook polish and cleanup * Reorganize folder structure and initial gnn * Complete the graph generate step with edgemap output * Format fix * Format fix * Add graph construction and training notebooks * Add full gnn functionality * Update wording for readme --------- Co-authored-by: Chester Chen <[email protected]> * Fix fobs issue (#3011) * Fix fobs doc (#3012) * Remove the need to create additional ports when running a job.
(#3017) * Fixed broken doc ref to 'helm_chart' (#3022) * add none default values (#3025) * FedBPT: Fix fedbpt cma version (#3029) * fix cma version; upgrade nvflare version * upgrade python to 3.12 * upgrade openmined-psi version (#3020) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Enhance POC notebook and docs (#3031) * 2.5 vdr enhancements * add table --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> * Support multiple host names for FLARE server (#3018) * support multiple host names for fl server * add connect_to check * fix server side overseer agent * add server identity to fed_client.json * fix format --------- Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Isaac Yang <[email protected]> * multi line table (#3034) --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: Holger Roth <[email protected]> Co-authored-by: Chester Chen <[email protected]> Co-authored-by: Zhihong Zhang <[email protected]> Co-authored-by: nvkevlu <[email protected]> Co-authored-by: Arun Patole <[email protected]> Co-authored-by: tonywjs <[email protected]> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <[email protected]> Co-authored-by: Zhijin <[email protected]> Co-authored-by: Sean Yang <[email protected]> Co-authored-by: Isaac Yang <[email protected]> Co-authored-by: Yuhong Wen <[email protected]> Co-authored-by: Hao-Wei Pang <[email protected]> Co-authored-by: Holger Roth <[email protected]> Co-authored-by: falibabaei <[email protected]> Co-authored-by: falibabaei <[email protected]> Co-authored-by: LeoDuda <[email protected]> Co-authored-by: [email protected] <[email protected]> Co-authored-by: LeoDuda <[email protected]> Co-authored-by: khadijeh.alibabaei <[email protected]> Co-authored-by: Ziyue Xu <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Naev <[email protected]> Co-authored-by: Alessandro Giusa <[email protected]>
Description
Types of changes