Skip to content

Conversation

@victorli2002
Copy link
Collaborator

@victorli2002 victorli2002 commented Jan 15, 2025

Merging the substitution-builder branch

  • adding top-level function to fetch all substitutions

Related Issues:

Linked Issues:

Issues closed by this PR:

  • Closes #

This change is Reviewable

@victorli2002 victorli2002 force-pushed the merge-substitution-builder branch from a0efceb to 3e4c357 Compare January 17, 2025 23:18
Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch coverage isn't great, that probably should get improved before merging (or at least investigated)

Reviewed 62 of 64 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @lockshaw and @victorli2002)


lib/substitutions/include/substitutions/operator_pattern/operator_attribute_constraint.h line 13 at r2 (raw file):

                                               OperatorAttributeValue const &);
OperatorAttributeConstraint op_attr_key_divisible_by(OperatorAttributeKey,
                                                     int denominator);

Suggestion:

                                                     nonnegative_int denominator);

lib/substitutions/src/substitutions/sub_parallel_computation_graph.cc line 191 at r2 (raw file):

std::string as_dot(SubParallelComputationGraph const &spcg) {
  NOT_IMPLEMENTED();

TODO


lib/substitutions/include/substitutions/tensor_pattern/tensor_attribute_pattern.h line 9 at r2 (raw file):

TensorAttributePattern tensor_attribute_pattern_match_all();
TensorAttributePattern tensor_attr_pattern_require_num_dims(int num_dims);

Suggestion:

TensorAttributePattern tensor_attr_pattern_require_num_dims(nonnegative_int num_dims);

lib/utils/include/utils/containers.h line 0 at r2 (raw file):
Delete this file


lib/substitutions/src/substitutions/substitution_builder.cc line 110 at r2 (raw file):

                            "OutputGraphExprValue holding a "
                            "OutputGraphExprNodeOutput, but received {}",
                            maybe_pattern_output));

Suggestion:

maybe_output_graph_expr_output

lib/substitutions/src/substitutions/substitution_builder.cc line 115 at r2 (raw file):

  assert(!this->output_mapping.contains_l(pattern_output));
  assert(!this->output_mapping.contains_r(output_graph_expr_output));

Throw an exception with an actual error message rather than using assert

Code quote:

  assert(!this->output_mapping.contains_l(pattern_output));
  assert(!this->output_mapping.contains_r(output_graph_expr_output));

lib/utils/src/utils/graph/open_dataflow_graph/algorithms/view_inputs_as_nodes.cc line 5 at r2 (raw file):

namespace FlexFlow {

ViewInputsAsNodes::ViewInputsAsNodes(

Is ViewInputsAsNodes actually used anywhere? If not it should be removed


lib/substitutions/src/substitutions/unity_substitution_set.cc line 14 at r2 (raw file):

  std::vector<Substitution> substitutions;
  for (int num_dims : range(MAX_TENSOR_DIM)) {
    for (int degree = 1; degree <= resources.num_nodes; degree *= 2) {

Suggestion:

    for (int degree = 1; degree <= get_num_gpus(resources); degree *= 2) {

lib/utils/src/utils/graph/dataflow_graph/algorithms/as_dot.cc line 0 at r2 (raw file):
What happened to the old dot rendering code? Does it need to be deleted?


lib/utils/include/utils/render_dot.h line 1 at r2 (raw file):

#ifndef _FLEXFLOW_LIB_UTILS_INCLUDE_UTILS_RENDER_DOT_H

This should probably be in utils/graph


lib/utils/include/utils/containers/replicate.h line 11 at r2 (raw file):

template <typename T>
std::vector<T> replicate(int n, T const &element) {

Suggestion:

std::vector<T> replicate(nonnegative_int n, T const &element) {

lib/utils/include/utils/containers/replicate.h line 14 at r2 (raw file):

  if (n < 0) {
    throw mk_runtime_error(
        fmt::format("replicate expected n > 0, but received n = {}", n));

Suggestion:

        fmt::format("replicate expected n >= 0, but received n = {}", n));

lib/substitutions/src/substitutions/substitution.cc line 32 at r2 (raw file):

  };

  auto l_from_r_output_attrs_assignment =

This code needs to get cleaned up


lib/utils/include/utils/containers/merge_maps.h line 18 at r2 (raw file):

    merge_maps(std::unordered_map<K, V> const &lhs,
               std::unordered_map<K, V> const &rhs,
               MergeMethod merge_method = MergeMethod::REQUIRE_DISJOINT) {

Does this actually need to be dynamically choosable, or can we just create three functions (merge_disjoint_maps, merge_maps_left_dominates, merge_maps_right_dominates)?

Code quote:

               MergeMethod merge_method = MergeMethod::REQUIRE_DISJOINT

lib/substitutions/test/src/substitutions/substitution.cc line 81 at r2 (raw file):

  }

  // TEST_CASE("is_valid_substitution") {

TODO

Copy link
Collaborator Author

@victorli2002 victorli2002 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @lockshaw)


lib/substitutions/src/substitutions/sub_parallel_computation_graph.cc line 191 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

TODO

Can you take this?


lib/substitutions/src/substitutions/substitution.cc line 32 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

This code needs to get cleaned up

Can you take this?


lib/utils/include/utils/containers/merge_maps.h line 18 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Does this actually need to be dynamically choosable, or can we just create three functions (merge_disjoint_maps, merge_maps_left_dominates, merge_maps_right_dominates)?

I think it should be fine to create three separate functions. Will do that.


lib/utils/src/utils/graph/dataflow_graph/algorithms/as_dot.cc line at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What happened to the old dot rendering code? Does it need to be deleted?

If I'm not mistaken then this is the old code?
flexflow-train/lib/utils/src/utils/graph/dataflow_graph/algorithms/as_dot.cc at master · flexflow/flexflow-train

The version from substitution-builder seemed more robust and nicer to me.


lib/utils/src/utils/graph/open_dataflow_graph/algorithms/view_inputs_as_nodes.cc line 5 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Is ViewInputsAsNodes actually used anywhere? If not it should be removed

It's not used anywhere atm. Will remove

@lockshaw
Copy link
Collaborator

lib/substitutions/src/substitutions/substitution.cc line 32 at r2 (raw file):

Previously, victorli2002 (Victor Li) wrote…

Can you take this?

Sure

@lockshaw
Copy link
Collaborator

lib/substitutions/src/substitutions/sub_parallel_computation_graph.cc line 191 at r2 (raw file):

Previously, victorli2002 (Victor Li) wrote…

Can you take this?

Sure

Copy link
Collaborator Author

@victorli2002 victorli2002 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @lockshaw)


lib/substitutions/src/substitutions/substitution_builder.cc line 115 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Throw an exception with an actual error message rather than using assert

Done.


lib/utils/include/utils/containers.h line at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Delete this file

Done.


lib/substitutions/include/substitutions/operator_pattern/operator_attribute_constraint.h line 13 at r2 (raw file):

                                               OperatorAttributeValue const &);
OperatorAttributeConstraint op_attr_key_divisible_by(OperatorAttributeKey,
                                                     int denominator);

Done.


lib/substitutions/src/substitutions/substitution_builder.cc line 110 at r2 (raw file):

                            "OutputGraphExprValue holding a "
                            "OutputGraphExprNodeOutput, but received {}",
                            maybe_pattern_output));

Done.


lib/substitutions/include/substitutions/tensor_pattern/tensor_attribute_pattern.h line 9 at r2 (raw file):

TensorAttributePattern tensor_attribute_pattern_match_all();
TensorAttributePattern tensor_attr_pattern_require_num_dims(int num_dims);

Done.


lib/substitutions/src/substitutions/unity_substitution_set.cc line 14 at r2 (raw file):

  std::vector<Substitution> substitutions;
  for (int num_dims : range(MAX_TENSOR_DIM)) {
    for (int degree = 1; degree <= resources.num_nodes; degree *= 2) {

Done.

@victorli2002 victorli2002 force-pushed the merge-substitution-builder branch from f66979b to 47cc58a Compare January 25, 2025 03:26
Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 30 of 31 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @victorli2002)

Copy link
Collaborator Author

@victorli2002 victorli2002 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 66 of 75 files reviewed, 3 unresolved discussions (waiting on @lockshaw)


lib/utils/include/utils/containers/merge_maps.h line 18 at r2 (raw file):

Previously, victorli2002 (Victor Li) wrote…

I think it should be fine to create three separate functions. Will do that.

the left_dominates form is currently unusued, but it feels weird removing it while keeping right_dominates (which is used in one place)?


lib/utils/include/utils/render_dot.h line 1 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

This should probably be in utils/graph

Done.


lib/utils/include/utils/containers/replicate.h line 11 at r2 (raw file):

template <typename T>
std::vector<T> replicate(int n, T const &element) {

Done.


lib/utils/include/utils/containers/replicate.h line 14 at r2 (raw file):

  if (n < 0) {
    throw mk_runtime_error(
        fmt::format("replicate expected n > 0, but received n = {}", n));

Done.

@lockshaw lockshaw self-requested a review February 1, 2025 20:08
@lockshaw lockshaw enabled auto-merge (squash) February 1, 2025 20:10
@lockshaw lockshaw merged commit 2b71235 into flexflow:master Feb 1, 2025
3 of 5 checks passed
@lockshaw lockshaw mentioned this pull request Feb 1, 2025
1 task
dylanllim pushed a commit to dylanllim/FlexFlow that referenced this pull request Feb 5, 2025
* Start on pcg builder

* Add tests and some implementation for pcg builder

* Add pcg tests, make dtgen constructors explicit to fix bug

* Add remainder of PCG tests

* Fix build issues in local-execution

* Format

* Address Reyna comments, add topological_order function for PCG

* Pre multidigraph refactor

* Removing visitable from sp code

* Add open dataflow graph, start to replace pcg dataflow graph

* Start refactoring substitutions

* Add utility functions to support pattern matching

* Pre-refactor inputs

* Fix proj url

* Get back to substitutions, now with unordered graph inputs

* Get substitutions building

* substitutions-tests now builds

* Fix bug in filter, pass some initial substitution tests

* Add tests for fmt::to_string, fix some substitutions bugs

* Pass initial unit tests for find_pattern_matches

* Start on unit tests for pcg pattern

* Pass initial test for find_pattern_matches

* Fix small build issue in tests

* Format

* Sync tests in CI with tests in proj

* Fix minor build errors in kernels and local-execution

* Format

* Remove outdated code

* More outdated code removal

* More cleanup, add test for sp decomposition

* Pull apart containers.h

* More sp testing and fixes

* Break up graph algorithms.h

* Pre- full SP algo commit

* Add initial implementation and tests for cbc decomposition and inverse line graph

* Pass test for get_inverse_line_graph

* Add new multidigraph

* Fix get_inverse_line_graph to return a MultiDiGraph instead of a DiGraph

* Add tests for parallel and series reduction finding

* Add really rough implementation of valdez sp decomposition

* Fix local-execution build

* Add implementations and tests for applying series/parallel reductions

* Format

* Clean up sp decomposition interface and tests

* Format

* Add comments for top-level substitutions functions, add proj doxygen support

* Start sketching out substitutions code

* Fix build errors

* Add ability to permute node ids

* Cleanup and start to test new substitutions code

* Add test case for evaluate_substitution_output

* Add naive isomorphism detection code

* Add graph inputs to open dataflow graph isomorphism

* Add input permutation to evaluate_substitution_output

* Fix permute_node_ids

* Add test for permute_input_ids

* Migrate over to mutable implementation of apply_substitution

* Add fast isomorphism checking and an initial implementation of full substitution logic

* Pass initial full substitutions test

* Cleanup old isomorphism checking code

* Fix post-merge bugs

* Fix broken pcg builder test

* Format

* Reorganize code and remove some outdated code pre-code-review

* Format

* Restarting work on this after working on export-model-arch

* Adding in some a simple function to get the currently available substritutions

* nonnegative_int additions, code cleanup, etc.

* A bunch more moving over to nonnegative_int

* Even more nonnegative_int updating

* Fix build

* Fix failing tests

* Format

* Format

---------

Co-authored-by: Colin Unger <[email protected]>
Co-authored-by: Victor Li <[email protected]>
@victorli2002 victorli2002 mentioned this pull request Mar 12, 2025
@victorli2002 victorli2002 mentioned this pull request Apr 21, 2025
lockshaw added a commit that referenced this pull request May 2, 2025
* test_utils refactor, local_cpu_allocator

* test utils modification, cast, reverse, and replicate cpu kernels

* combine kernel

* combine kernels .h file

* Implementations for methods for machine_views and associated modules  (#1429)

* initial commit for machine view adjacent modules

* Formatting

* Tests for new machine_view.cc functions

* formatting

* Minor Test correction

* formatting

* PR fixes

* PR Fixes

---------

Co-authored-by: Pietro Max Marsella <[email protected]>

* test utils logic cleanup, reverse cpu_kernel pedagogical implmentation, other minor fixes

* cpu_kernel's refactor, generic tensor accessor indexing

* accessor.h formatting

* mk_runtime_error formatting

* reverse_kernels include

* test_utils refactor and clarity

* formatting

* comment removal reverse_kernels

* Issue #1435, tests for managed stream and handle

* #1435 formatting

* #1409 issue, change datatype for linear kernels away from void *

* R & W accessor changes, minimize code bloat

* code formatting and refactor

* issue #1502 & issue #1540

* format check

* branch merge and test fixes

* build issues

* Add AWS linux AMI to runs-on for testing (#1589)

* Pin runs-on images (#1590)

* GPU CI Fix (Pin runs-on GPU image) (#1588)

* Debug

* Change to base DL AMI

* Print disk usage

* Run nvidia-smi

* Remove excess cuda installs in base ami

* Re-enable freeing space in GPU CI

* Try updating nix-develop version

* Check what happens if you just enter the non-nixGL environment

* Try switching AMIs

* Try to remove the module stuff

* Move to lockshaw/develop-action

* Try pointing at a fixed commit

* Update nix-develop action

* Update nix-develop action to use BASH_FUNC filtering

* Remove all the /usr/local/cuda entries

* Switch back to gpu-ci env

* Update the cuda arch

* Try out the new runs-on gpu image

* Move over to pinned runs-on image

* Remove a bunch more unnecessary stuff in image to get back disk space

* Try using an emphemeral store

* Try mounting

* Fix bug

* Try sudo

* Move nix into _work

* Rollback all unnecessary changes

* Re-enable waiting on cpu-ci

* Merge substitution-builder (#1575)

* Start on pcg builder

* Add tests and some implementation for pcg builder

* Add pcg tests, make dtgen constructors explicit to fix bug

* Add remainder of PCG tests

* Fix build issues in local-execution

* Format

* Address Reyna comments, add topological_order function for PCG

* Pre multidigraph refactor

* Removing visitable from sp code

* Add open dataflow graph, start to replace pcg dataflow graph

* Start refactoring substitutions

* Add utility functions to support pattern matching

* Pre-refactor inputs

* Fix proj url

* Get back to substitutions, now with unordered graph inputs

* Get substitutions building

* substitutions-tests now builds

* Fix bug in filter, pass some initial substitution tests

* Add tests for fmt::to_string, fix some substitutions bugs

* Pass initial unit tests for find_pattern_matches

* Start on unit tests for pcg pattern

* Pass initial test for find_pattern_matches

* Fix small build issue in tests

* Format

* Sync tests in CI with tests in proj

* Fix minor build errors in kernels and local-execution

* Format

* Remove outdated code

* More outdated code removal

* More cleanup, add test for sp decomposition

* Pull apart containers.h

* More sp testing and fixes

* Break up graph algorithms.h

* Pre- full SP algo commit

* Add initial implementation and tests for cbc decomposition and inverse line graph

* Pass test for get_inverse_line_graph

* Add new multidigraph

* Fix get_inverse_line_graph to return a MultiDiGraph instead of a DiGraph

* Add tests for parallel and series reduction finding

* Add really rough implementation of valdez sp decomposition

* Fix local-execution build

* Add implementations and tests for applying series/parallel reductions

* Format

* Clean up sp decomposition interface and tests

* Format

* Add comments for top-level substitutions functions, add proj doxygen support

* Start sketching out substitutions code

* Fix build errors

* Add ability to permute node ids

* Cleanup and start to test new substitutions code

* Add test case for evaluate_substitution_output

* Add naive isomorphism detection code

* Add graph inputs to open dataflow graph isomorphism

* Add input permutation to evaluate_substitution_output

* Fix permute_node_ids

* Add test for permute_input_ids

* Migrate over to mutable implementation of apply_substitution

* Add fast isomorphism checking and an initial implementation of full substitution logic

* Pass initial full substitutions test

* Cleanup old isomorphism checking code

* Fix post-merge bugs

* Fix broken pcg builder test

* Format

* Reorganize code and remove some outdated code pre-code-review

* Format

* Restarting work on this after working on export-model-arch

* Adding in some a simple function to get the currently available substritutions

* nonnegative_int additions, code cleanup, etc.

* A bunch more moving over to nonnegative_int

* Even more nonnegative_int updating

* Fix build

* Fix failing tests

* Format

* Format

---------

Co-authored-by: Colin Unger <[email protected]>
Co-authored-by: Victor Li <[email protected]>

* test_utils refactor, local_cpu_allocator

* test utils modification, cast, reverse, and replicate cpu kernels

* combine kernel

* test utils logic cleanup, reverse cpu_kernel pedagogical implmentation, other minor fixes

* cpu_kernel's refactor, generic tensor accessor indexing

* test_utils refactor and clarity

* R & W accessor changes, minimize code bloat

* issue #1502 & issue #1540

* branch merge and test fixes

* merge

* build after merge

* kernel issues

* managed stream / handle test case fix

* test_utils update, kernel/ops refactor

* Review fixes

* Update doctest includes in kernels

* More PR review

* Try using rhel package-based nixgl

* Format

* Update proj with test command fixes

* Attempt to fix gpu CI

* Use custom AMI in GPU CI

* Fix proj bug in cpu-ci

* Try including run id

* Temporarily allow gpu ci to run regardless for testing purposes

* Try using official ubuntu ami in gpu ci

* Try out new ami

* Change to use new flexflow-gpu-ci AMI

* Fix bugs in GPU tests and restore GPU CI gating

* Format

* Fix bug in accessor formatting test cases

* Bugfixes and updated proj

* Fix all cpu tests

* Format

* Add improved test failure output for replicate cpu vs gpu tests

* Continue debugging replicate cuda testcases

* Format

* Fix incorrect tensor size in replicate kernel tests

* Transpose replicate backward cpu kernel

* Try flipping output dimensions in replica cuda kernel test

* Update proj

---------

Co-authored-by: Marsella8 <[email protected]>
Co-authored-by: Pietro Max Marsella <[email protected]>
Co-authored-by: Colin Unger <[email protected]>
Co-authored-by: Victor Li <[email protected]>
Co-authored-by: Victor Li <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants