Change datatype for linear kernels away from void * in .cc #1409
Closed
dylanllim added a commit to dylanllim/FlexFlow that referenced this pull request on Oct 16, 2024
dylanllim added a commit that referenced this pull request on Jan 28, 2025
dylanllim added a commit to dylanllim/FlexFlow that referenced this pull request on Jan 28, 2025
lockshaw added a commit that referenced this pull request on May 2, 2025
* test_utils refactor, local_cpu_allocator * test utils modification, cast, reverse, and replicate cpu kernels * combine kernel * combine kernels .h file * Implementations for methods for machine_views and associated modules (#1429) * initial commit for machine view adjacent modules * Formatting * Tests for new machine_view.cc functions * formatting * Minor Test correction * formatting * PR fixes * PR Fixes --------- Co-authored-by: Pietro Max Marsella <[email protected]> * test utils logic cleanup, reverse cpu_kernel pedagogical implmentation, other minor fixes * cpu_kernel's refactor, generic tensor accessor indexing * accessor.h formatting * mk_runtime_error formatting * reverse_kernels include * test_utils refactor and clarity * formatting * comment removal reverse_kernels * Issue #1435, tests for managed stream and handle * #1435 formatting * #1409 issue, change datatype for linear kernels away from void * * R & W accessor changes, minimize code bloat * code formatting and refactor * issue #1502 & issue #1540 * format check * branch merge and test fixes * build issues * Add AWS linux AMI to runs-on for testing (#1589) * Pin runs-on images (#1590) * GPU CI Fix (Pin runs-on GPU image) (#1588) * Debug * Change to base DL AMI * Print disk usage * Run nvidia-smi * Remove excess cuda installs in base ami * Re-enable freeing space in GPU CI * Try updating nix-develop version * Check what happens if you just enter the non-nixGL environment * Try switching AMIs * Try to remove the module stuff * Move to lockshaw/develop-action * Try pointing at a fixed commit * Update nix-develop action * Update nix-develop action to use BASH_FUNC filtering * Remove all the /usr/local/cuda entries * Switch back to gpu-ci env * Update the cuda arch * Try out the new runs-on gpu image * Move over to pinned runs-on image * Remove a bunch more unnecessary stuff in image to get back disk space * Try using an emphemeral store * Try mounting * Fix bug * Try sudo * Move nix into _work * 
Rollback all unnecessary changes * Re-enable waiting on cpu-ci * Merge substitution-builder (#1575) * Start on pcg builder * Add tests and some implementation for pcg builder * Add pcg tests, make dtgen constructors explicit to fix bug * Add remainder of PCG tests * Fix build issues in local-execution * Format * Address Reyna comments, add topological_order function for PCG * Pre multidigraph refactor * Removing visitable from sp code * Add open dataflow graph, start to replace pcg dataflow graph * Start refactoring substitutions * Add utility functions to support pattern matching * Pre-refactor inputs * Fix proj url * Get back to substitutions, now with unordered graph inputs * Get substitutions building * substitutions-tests now builds * Fix bug in filter, pass some initial substitution tests * Add tests for fmt::to_string, fix some substitutions bugs * Pass initial unit tests for find_pattern_matches * Start on unit tests for pcg pattern * Pass initial test for find_pattern_matches * Fix small build issue in tests * Format * Sync tests in CI with tests in proj * Fix minor build errors in kernels and local-execution * Format * Remove outdated code * More outdated code removal * More cleanup, add test for sp decomposition * Pull apart containers.h * More sp testing and fixes * Break up graph algorithms.h * Pre- full SP algo commit * Add initial implementation and tests for cbc decomposition and inverse line graph * Pass test for get_inverse_line_graph * Add new multidigraph * Fix get_inverse_line_graph to return a MultiDiGraph instead of a DiGraph * Add tests for parallel and series reduction finding * Add really rough implementation of valdez sp decomposition * Fix local-execution build * Add implementations and tests for applying series/parallel reductions * Format * Clean up sp decomposition interface and tests * Format * Add comments for top-level substitutions functions, add proj doxygen support * Start sketching out substitutions code * Fix build errors * 
Add ability to permute node ids * Cleanup and start to test new substitutions code * Add test case for evaluate_substitution_output * Add naive isomorphism detection code * Add graph inputs to open dataflow graph isomorphism * Add input permutation to evaluate_substitution_output * Fix permute_node_ids * Add test for permute_input_ids * Migrate over to mutable implementation of apply_substitution * Add fast isomorphism checking and an initial implementation of full substitution logic * Pass initial full substitutions test * Cleanup old isomorphism checking code * Fix post-merge bugs * Fix broken pcg builder test * Format * Reorganize code and remove some outdated code pre-code-review * Format * Restarting work on this after working on export-model-arch * Adding in some a simple function to get the currently available substritutions * nonnegative_int additions, code cleanup, etc. * A bunch more moving over to nonnegative_int * Even more nonnegative_int updating * Fix build * Fix failing tests * Format * Format --------- Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Victor Li <[email protected]> * test_utils refactor, local_cpu_allocator * test utils modification, cast, reverse, and replicate cpu kernels * combine kernel * test utils logic cleanup, reverse cpu_kernel pedagogical implmentation, other minor fixes * cpu_kernel's refactor, generic tensor accessor indexing * test_utils refactor and clarity * R & W accessor changes, minimize code bloat * issue #1502 & issue #1540 * branch merge and test fixes * merge * build after merge * kernel issues * managed stream / handle test case fix * test_utils update, kernel/ops refactor * Review fixes * Update doctest includes in kernels * More PR review * Try using rhel package-based nixgl * Format * Update proj with test command fixes * Attempt to fix gpu CI * Use custom AMI in GPU CI * Fix proj bug in cpu-ci * Try including run id * Temporarily allow gpu ci to run regardless for testing purposes * Try using 
official ubuntu ami in gpu ci * Try out new ami * Change to use new flexflow-gpu-ci AMI * Fix bugs in GPU tests and restore GPU CI gating * Format * Fix bug in accessor formatting test cases * Bugfixes and updated proj * Fix all cpu tests * Format * Add improved test failure output for replicate cpu vs gpu tests * Continue debugging replicate cuda testcases * Format * Fix incorrect tensor size in replicate kernel tests * Transpose replicate backward cpu kernel * Try flipping output dimensions in replica cuda kernel test * Update proj --------- Co-authored-by: Marsella8 <[email protected]> Co-authored-by: Pietro Max Marsella <[email protected]> Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Victor Li <[email protected]> Co-authored-by: Victor Li <[email protected]>
cublasGemmEx takes each matrix operand as a void * together with an argument naming its actual datatype. This PR moves the cast from the actual datatype to void * into the kernel code.
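The pattern described above can be sketched in plain C++ without CUDA. This is only an illustration under stated assumptions: `gemm_type_erased` is a hypothetical stand-in for a type-erased entry point such as cublasGemmEx, and `linear_forward`, `DType`, and the row-major layout are invented for the example and are not taken from this PR. The point is that the caller-facing signature uses typed pointers, and the cast to void * happens only at the boundary with the type-erased call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tag for the element type behind a void * operand
// (plays the role of the datatype arguments that accompany each pointer).
enum class DType { Float, Double };

// Reference GEMM on typed pointers: C = A * B, row-major,
// A is m x k, B is k x n, C is m x n.
template <typename T>
void gemm_typed(T const *a, T const *b, T *c, int m, int n, int k) {
  for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) {
      T acc = 0;
      for (int p = 0; p < k; p++) {
        acc += a[i * k + p] * b[p * n + j];
      }
      c[i * n + j] = acc;
    }
  }
}

// Hypothetical stand-in for a type-erased library call: operands are
// void * and the tag says how to interpret them.
void gemm_type_erased(void const *a, void const *b, void *c, DType dtype,
                      int m, int n, int k) {
  switch (dtype) {
    case DType::Float:
      gemm_typed(static_cast<float const *>(a), static_cast<float const *>(b),
                 static_cast<float *>(c), m, n, k);
      break;
    case DType::Double:
      gemm_typed(static_cast<double const *>(a),
                 static_cast<double const *>(b), static_cast<double *>(c), m,
                 n, k);
      break;
  }
}

// Hypothetical kernel wrapper: callers pass float pointers, so a mismatched
// pointer type is a compile error. The cast to void * lives only here.
void linear_forward(float const *input, float const *weight, float *output,
                    int batch, int out_dim, int in_dim) {
  gemm_type_erased(static_cast<void const *>(input),
                   static_cast<void const *>(weight),
                   static_cast<void *>(output), DType::Float, batch, out_dim,
                   in_dim);
}
```

Calling `linear_forward(in.data(), w.data(), out.data(), 2, 2, 2)` on `std::vector<float>` buffers compiles, while passing a `double *` does not, which is the benefit of keeping void * out of the caller-facing signature.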
Linked Issues:
void * in .cc #1397