Change datatype for linear kernels away from void * in .cc #1409

dylanllim · 2024-06-07T19:17:48Z

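cublasGemmEx takes its matrix arguments as void * pointers, plus separate arguments specifying the actual datatype. This PR moves the cast from the actual datatype to void * into the kernel code, so the linear kernel interface used from the .cc files no longer takes void *.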
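
As a rough illustration of the pattern described above (not the code from this PR), the sketch below keeps typed pointers in the interface and casts to void * only at the call into a type-erased GEMM routine. The names gemm_ex_like, DType, and linear_forward are hypothetical: gemm_ex_like is a CPU stand-in for cublasGemmEx so the example compiles without CUDA, and it does not mirror the real cuBLAS signature.

```cpp
#include <cassert>

// Hypothetical stand-in for a datatype tag such as the one cublasGemmEx expects.
enum class DType { Float, Double };

// Typed reference GEMM: C = A * B with row-major A (m x k), B (k x n), C (m x n).
template <typename T>
static void gemm_typed(int m, int n, int k, T const *a, T const *b, T *c) {
  for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) {
      T acc = 0;
      for (int p = 0; p < k; p++) {
        acc += a[i * k + p] * b[p * n + j];
      }
      c[i * n + j] = acc;
    }
  }
}

// Hypothetical type-erased entry point (assumption: a simplified analogue of
// cublasGemmEx). It receives void * buffers and a tag naming the element type.
void gemm_ex_like(int m, int n, int k,
                  void const *a, void const *b, void *c, DType dtype) {
  assert(a != nullptr && b != nullptr && c != nullptr);
  switch (dtype) {
    case DType::Float:
      gemm_typed(m, n, k,
                 static_cast<float const *>(a),
                 static_cast<float const *>(b),
                 static_cast<float *>(c));
      break;
    case DType::Double:
      gemm_typed(m, n, k,
                 static_cast<double const *>(a),
                 static_cast<double const *>(b),
                 static_cast<double *>(c));
      break;
  }
}

// Typed kernel-side wrapper (hypothetical name). Callers pass float pointers;
// the cast to void * happens only here, next to the type-erased call.
void linear_forward(float const *input, float const *weight, float *output,
                    int batch, int in_dim, int out_dim) {
  gemm_ex_like(batch, out_dim, in_dim,
               static_cast<void const *>(input),
               static_cast<void const *>(weight),
               static_cast<void *>(output),
               DType::Float);
}
```

With this shape, a caller that passes a buffer of the wrong element type to linear_forward gets a compile error, whereas a void * parameter would accept it silently.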

Linked Issues:

Issue Change datatype for linear kernels away from void * in .cc #1397

…o local-op-refactor

* test_utils refactor, local_cpu_allocator * test utils modification, cast, reverse, and replicate cpu kernels * combine kernel * combine kernels .h file * Implementations for methods for machine_views and associated modules (#1429) * initial commit for machine view adjacent modules * Formatting * Tests for new machine_view.cc functions * formatting * Minor Test correction * formatting * PR fixes * PR Fixes --------- Co-authored-by: Pietro Max Marsella <[email protected]> * test utils logic cleanup, reverse cpu_kernel pedagogical implmentation, other minor fixes * cpu_kernel's refactor, generic tensor accessor indexing * accessor.h formatting * mk_runtime_error formatting * reverse_kernels include * test_utils refactor and clarity * formatting * comment removal reverse_kernels * Issue #1435, tests for managed stream and handle * #1435 formatting * #1409 issue, change datatype for linear kernels away from void * * R & W accessor changes, minimize code bloat * code formatting and refactor * issue #1502 & issue #1540 * format check * branch merge and test fixes * build issues * Add AWS linux AMI to runs-on for testing (#1589) * Pin runs-on images (#1590) * GPU CI Fix (Pin runs-on GPU image) (#1588) * Debug * Change to base DL AMI * Print disk usage * Run nvidia-smi * Remove excess cuda installs in base ami * Re-enable freeing space in GPU CI * Try updating nix-develop version * Check what happens if you just enter the non-nixGL environment * Try switching AMIs * Try to remove the module stuff * Move to lockshaw/develop-action * Try pointing at a fixed commit * Update nix-develop action * Update nix-develop action to use BASH_FUNC filtering * Remove all the /usr/local/cuda entries * Switch back to gpu-ci env * Update the cuda arch * Try out the new runs-on gpu image * Move over to pinned runs-on image * Remove a bunch more unnecessary stuff in image to get back disk space * Try using an emphemeral store * Try mounting * Fix bug * Try sudo * Move nix into _work * 
Rollback all unnecessary changes * Re-enable waiting on cpu-ci * Merge substitution-builder (#1575) * Start on pcg builder * Add tests and some implementation for pcg builder * Add pcg tests, make dtgen constructors explicit to fix bug * Add remainder of PCG tests * Fix build issues in local-execution * Format * Address Reyna comments, add topological_order function for PCG * Pre multidigraph refactor * Removing visitable from sp code * Add open dataflow graph, start to replace pcg dataflow graph * Start refactoring substitutions * Add utility functions to support pattern matching * Pre-refactor inputs * Fix proj url * Get back to substitutions, now with unordered graph inputs * Get substitutions building * substitutions-tests now builds * Fix bug in filter, pass some initial substitution tests * Add tests for fmt::to_string, fix some substitutions bugs * Pass initial unit tests for find_pattern_matches * Start on unit tests for pcg pattern * Pass initial test for find_pattern_matches * Fix small build issue in tests * Format * Sync tests in CI with tests in proj * Fix minor build errors in kernels and local-execution * Format * Remove outdated code * More outdated code removal * More cleanup, add test for sp decomposition * Pull apart containers.h * More sp testing and fixes * Break up graph algorithms.h * Pre- full SP algo commit * Add initial implementation and tests for cbc decomposition and inverse line graph * Pass test for get_inverse_line_graph * Add new multidigraph * Fix get_inverse_line_graph to return a MultiDiGraph instead of a DiGraph * Add tests for parallel and series reduction finding * Add really rough implementation of valdez sp decomposition * Fix local-execution build * Add implementations and tests for applying series/parallel reductions * Format * Clean up sp decomposition interface and tests * Format * Add comments for top-level substitutions functions, add proj doxygen support * Start sketching out substitutions code * Fix build errors * 
Add ability to permute node ids * Cleanup and start to test new substitutions code * Add test case for evaluate_substitution_output * Add naive isomorphism detection code * Add graph inputs to open dataflow graph isomorphism * Add input permutation to evaluate_substitution_output * Fix permute_node_ids * Add test for permute_input_ids * Migrate over to mutable implementation of apply_substitution * Add fast isomorphism checking and an initial implementation of full substitution logic * Pass initial full substitutions test * Cleanup old isomorphism checking code * Fix post-merge bugs * Fix broken pcg builder test * Format * Reorganize code and remove some outdated code pre-code-review * Format * Restarting work on this after working on export-model-arch * Adding in some a simple function to get the currently available substritutions * nonnegative_int additions, code cleanup, etc. * A bunch more moving over to nonnegative_int * Even more nonnegative_int updating * Fix build * Fix failing tests * Format * Format --------- Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Victor Li <[email protected]> * test_utils refactor, local_cpu_allocator * test utils modification, cast, reverse, and replicate cpu kernels * combine kernel * test utils logic cleanup, reverse cpu_kernel pedagogical implmentation, other minor fixes * cpu_kernel's refactor, generic tensor accessor indexing * test_utils refactor and clarity * R & W accessor changes, minimize code bloat * issue #1502 & issue #1540 * branch merge and test fixes * merge * build after merge * kernel issues * managed stream / handle test case fix * test_utils update, kernel/ops refactor * Review fixes * Update doctest includes in kernels * More PR review * Try using rhel package-based nixgl * Format * Update proj with test command fixes * Attempt to fix gpu CI * Use custom AMI in GPU CI * Fix proj bug in cpu-ci * Try including run id * Temporarily allow gpu ci to run regardless for testing purposes * Try using 
official ubuntu ami in gpu ci * Try out new ami * Change to use new flexflow-gpu-ci AMI * Fix bugs in GPU tests and restore GPU CI gating * Format * Fix bug in accessor formatting test cases * Bugfixes and updated proj * Fix all cpu tests * Format * Add improved test failure output for replicate cpu vs gpu tests * Continue debugging replicate cuda testcases * Format * Fix incorrect tensor size in replicate kernel tests * Transpose replicate backward cpu kernel * Try flipping output dimensions in replica cuda kernel test * Update proj --------- Co-authored-by: Marsella8 <[email protected]> Co-authored-by: Pietro Max Marsella <[email protected]> Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Victor Li <[email protected]> Co-authored-by: Victor Li <[email protected]>

reyna-abhyankar and others added 30 commits May 10, 2024 10:44

Add allocators

2dc4c60

Computation Graph and Builder

2488514

Shift ops and remove legion names

9a59f34

Format

931b47c

Format

8a66ed9

Fix tracked allocator

3ffe239

Fix comp graph

da10906

Merge branch 'repo-refactor' into comp-graph

ae864ae

Merge branch 'repo-refactor' into local-allocator

30330b7

Merge branch 'repo-refactor' into op-refactor

da701bf

Merge branch 'local-allocator' into op-refactor

15fbcc8

Merge branch 'comp-graph' into op-refactor

db6e3ec

Add task spec

784742c

Merge branch 'comp-graph' into op-refactor

036dbf6

Merge branch 'local-allocator' into op-refactor

5fbb6a3

Minor build issues

905bdd1

Merge branch 'op-refactor' of github.com:reyna-abhyankar/FlexFlow into local-op-refactor

13e6ce2

Build op task spec

3a3684e

Build ops and op task spec

a4dd9d4

Simplify edge set obtain

5bc719f

Merge branch 'comp-graph' into op-refactor

c8bb9ad

Format

583b2d3

Merge branch 'repo-refactor' into op-refactor

e0e5fe2

Fixes

269557e

Merge branch 'repo-refactor' into op-refactor

be791ad

Fix conflicts, some renaming

269770a

Merge branch 'repo-refactor' into op-refactor

5093acb

Fix gather kernels

2fbf291

Finish gather operator

a2a7e0a

Format

e0b259c

reyna-abhyankar and others added 9 commits May 30, 2024 20:11

Fix substitutions

55971f2

Merge branch 'repo-refactor' into op-refactor

89afe2c

Fix legion dim in gather

da38f0a

Merge branch 'repo-refactor' into op-refactor

286c8ae

Format string fixes

5f539a3

Fix include

26ddf7f

Gather backward time

1dfc24e

Format

c60efd9

Change datatype for linear kernels away from void *

c7b48dd

dylanllim closed this Jun 7, 2024

dylanllim deleted the linear_kernel_update branch June 7, 2024 19:18

dylanllim added a commit to dylanllim/FlexFlow that referenced this pull request Oct 16, 2024

flexflow#1409 issue, change datatype for linear kernels away from void *

7106dec

dylanllim added a commit that referenced this pull request Jan 28, 2025

#1409 issue, change datatype for linear kernels away from void *

54b3888

dylanllim added a commit to dylanllim/FlexFlow that referenced this pull request Jan 28, 2025

flexflow#1409 issue, change datatype for linear kernels away from void *

a182865

dylanllim commented Jun 7, 2024 • edited by wmdi

4 participants
