@shraiysh
Contributor

In later stages of optimization, the parameter of the while body can end up wrapped in a copy fusion. To handle this, we need to allow inlining of fusions when computing the induction variable index; otherwise we cannot deduce the tuple index.
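
To illustrate why a fusion between the body parameter and its use hides the tuple index, here is a minimal toy sketch in Python. It is not XLA code: the `Instr` class, the `"fused-parameter"` opcode, and `gte_dependence_index` are hypothetical names invented for this illustration, and the real analysis in `while_loop_analysis.cc` works on extracted HLO modules rather than on a structure like this.

```python
from dataclasses import dataclass, field
from typing import Optional

# Sentinel meaning "dependence could not be traced" (toy assumption).
OPAQUE = object()


@dataclass
class Instr:
    """Toy stand-in for an HLO instruction (hypothetical, not the XLA class)."""
    opcode: str
    operands: list = field(default_factory=list)
    tuple_index: Optional[int] = None      # only for get-tuple-element
    fused_root: Optional["Instr"] = None   # only for fusion
    param_number: Optional[int] = None     # only for fused-parameter


def gte_dependence_index(instr, inline_fusions):
    """Return the single tuple index `instr` depends on, or None if unknown."""

    def walk(node, fusion_operands=None):
        if node.opcode == "get-tuple-element":
            return {node.tuple_index}
        if node.opcode == "fused-parameter":
            # Map the fused parameter back to the fusion's operand in the caller.
            return walk(fusion_operands[node.param_number])
        if node.opcode == "fusion":
            if not inline_fusions:
                return {OPAQUE}  # the fusion is a black box
            return walk(node.fused_root, node.operands)
        found = set()
        for operand in node.operands:
            found |= walk(operand, fusion_operands)
        return found

    indices = walk(instr)
    if OPAQUE in indices or len(indices) != 1:
        return None
    return next(iter(indices))


# While-body pattern: param -> get-tuple-element(0) -> copy fusion -> add(.., 1)
param = Instr("parameter")
gte0 = Instr("get-tuple-element", [param], tuple_index=0)
copy_root = Instr("copy", [Instr("fused-parameter", param_number=0)])
copy_fusion = Instr("fusion", [gte0], fused_root=copy_root)
next_value = Instr("add", [copy_fusion, Instr("constant")])

without_inlining = gte_dependence_index(next_value, inline_fusions=False)
with_inlining = gte_dependence_index(next_value, inline_fusions=True)
```

In this toy model, `without_inlining` is `None` because the walk stops at the opaque fusion, while `with_inlining` recovers tuple index `0` by looking through the fused copy.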

@shraiysh shraiysh requested a review from bchetioui January 13, 2025 18:17
@shraiysh shraiysh self-assigned this Jan 13, 2025
@shraiysh shraiysh added the kokoro:force-run Forces CI to rerun label Jan 13, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Jan 13, 2025
copybara-service bot pushed a commit that referenced this pull request Jan 13, 2025
Imported from GitHub PR #21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
ae85690 by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

Merging this change closes #21375

FUTURE_COPYBARA_INTEGRATE_REVIEW=#21375 from shraiysh:while_loop_analysis ae85690
PiperOrigin-RevId: 715047991
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 13, 2025
Imported from GitHub PR openxla/xla#21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
ae85690876a106c4d74715fed299779e29e8e641 by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

Merging this change closes #21375

Reverts 49b2a87

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis ae85690876a106c4d74715fed299779e29e8e641
PiperOrigin-RevId: 715047991
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 13, 2025
Imported from GitHub PR openxla/xla#21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
ae85690876a106c4d74715fed299779e29e8e641 by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

Merging this change closes #21375

PiperOrigin-RevId: 715080653
@fhoushmand
Member

Rolling back due to internal test breakage.

@shraiysh
Contributor Author

Do we have a reduced testcase or the reason for this breakage? I can investigate and try to submit again.

@shraiysh shraiysh reopened this Jan 13, 2025
@github-actions

This PR was rolled back in 3aa5d48!

@shraiysh shraiysh force-pushed the while_loop_analysis branch from ae85690 to 0a7551b Compare January 14, 2025 06:16
@shraiysh shraiysh requested a review from fhoushmand January 14, 2025 06:16
@shraiysh shraiysh added the kokoro:force-run Forces CI to rerun label Jan 14, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Jan 14, 2025
copybara-service bot pushed a commit that referenced this pull request Jan 14, 2025
Imported from GitHub PR #21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
0a7551b by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

Merging this change closes #21375

FUTURE_COPYBARA_INTEGRATE_REVIEW=#21375 from shraiysh:while_loop_analysis 0a7551b
PiperOrigin-RevId: 715237085
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 14, 2025
Imported from GitHub PR openxla/xla#21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
0a7551bbba67f5900eb0bb5fc4d193f37d5b63ee by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

Merging this change closes #21375

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis 0a7551bbba67f5900eb0bb5fc4d193f37d5b63ee
PiperOrigin-RevId: 715237085
@dimitar-asenov
Member

I'm working on getting a minimal reproducer. In the meantime, here's the failing stack trace:

Check failed at xla/hlo/ir/hlo_computation.cc:1413: DCHECK(changed) 

xla/hlo/ir/hlo_computation.cc:1413 xla::HloComputation::ReplaceInstruction()
xla/hlo/ir/hlo_computation.cc:1381 xla::HloComputation::ReplaceWithNewInstruction()
xla/tools/hlo_extractor.cc:332 xla::(anonymous namespace)::Inline()
xla/tools/hlo_extractor.cc:369 xla::ExtractModule()
xla/hlo/analysis/while_loop_analysis.cc:173 xla::GetUniqueGTEDependenceIndex()
xla/hlo/analysis/while_loop_analysis.cc:432 xla::GetLoopInductionVarTupleIdx()
xla/hlo/analysis/while_loop_analysis.cc:487 xla::MatchTrivialLoopRange()

Enabling extra logging reveals that the CHECK failure happens because the replacement fails with:

hlo_computation.cc:1432 Skipping replacement because old instruction has control dependencies

In this particular case the old instruction is a fusion and the new one is a call.

Will share a reproducer once I get a more minimal example.
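
The failure mode described above (a replacement skipped because the old instruction has control dependencies) can be illustrated with a small toy model in Python. This is a sketch under stated assumptions, not the XLA implementation: `Instr`, `add_control_dependency`, `replace_instruction`, and the `relay_control_deps` flag are hypothetical names made up for this example.

```python
class Instr:
    """Toy instruction that only tracks control-dependency edges."""

    def __init__(self, name):
        self.name = name
        self.control_predecessors = []
        self.control_successors = []


def add_control_dependency(pred, succ):
    """Record that `succ` must run after `pred`."""
    pred.control_successors.append(succ)
    succ.control_predecessors.append(pred)


def replace_instruction(computation, old, new, relay_control_deps=False):
    """Replace `old` with `new` in `computation` (a list); return True if changed."""
    has_deps = bool(old.control_predecessors or old.control_successors)
    if has_deps and not relay_control_deps:
        # Analogous to the logged "Skipping replacement ..." path: nothing changes.
        return False
    # Move every control edge from `old` onto `new` before swapping.
    for pred in old.control_predecessors:
        pred.control_successors.remove(old)
        add_control_dependency(pred, new)
    for succ in old.control_successors:
        succ.control_predecessors.remove(old)
        add_control_dependency(new, succ)
    old.control_predecessors.clear()
    old.control_successors.clear()
    computation[computation.index(old)] = new
    return True


# A fusion with a control predecessor, to be replaced by a call.
before = Instr("before")
fusion = Instr("fusion")
call = Instr("call")
add_control_dependency(before, fusion)
computation = [before, fusion]

skipped = replace_instruction(computation, fusion, call)
changed = replace_instruction(computation, fusion, call, relay_control_deps=True)
```

In this model the first call reports no change, which is the situation a `DCHECK(changed)` would trip on, while the second call moves the control edge onto the call instruction and then performs the swap.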

@shraiysh shraiysh force-pushed the while_loop_analysis branch from 0a7551b to 2d67160 Compare January 14, 2025 18:07
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 28, 2025
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis a435fbd2eadc17269d7bccbe141dcf7a21cc20e8
PiperOrigin-RevId: 720702570
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 28, 2025
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis a435fbd2eadc17269d7bccbe141dcf7a21cc20e8
PiperOrigin-RevId: 720696248
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 28, 2025
…ks using XLA's FFI.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis a435fbd2eadc17269d7bccbe141dcf7a21cc20e8
PiperOrigin-RevId: 720649341
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 28, 2025
…ions.

This is in preparation for a larger change, so that `_check_arrays` can be called before Array creation in XLA and the user gets more helpful JAX error messages instead of XLA errors.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis a435fbd2eadc17269d7bccbe141dcf7a21cc20e8
PiperOrigin-RevId: 720692284
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 28, 2025
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis a435fbd2eadc17269d7bccbe141dcf7a21cc20e8
PiperOrigin-RevId: 718029356
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 28, 2025
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis a435fbd2eadc17269d7bccbe141dcf7a21cc20e8
PiperOrigin-RevId: 716201082
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 28, 2025
Imported from GitHub PR openxla/xla#21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
3147ec926aa1c6fdfa2f4376668434c9a2fbeb87 by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

--
a435fbd2eadc17269d7bccbe141dcf7a21cc20e8 by Shraiysh Vaishay <[email protected]>:

Relay control dependencies while converting fusion to call (extractor)

Merging this change closes #21375

PiperOrigin-RevId: 720691268
@github-actions

This PR was rolled back in 95c45c6!

@akuegel
Member

akuegel commented Jan 29, 2025

@shraiysh FYI, this triggered a bug in AlgebraicSimplifier. I now have a fix ready, but we reverted first to unbreak the user as soon as possible. After my fix has landed, we will attempt to reland.

@dimitar-asenov
Member

The fix has landed and this PR was re-merged. Let's hope there are no more breakages.

@shraiysh
Contributor Author

Thanks for helping land this @akuegel @dimitar-asenov! Let me know if I can help if there are future breakages.

@dimitar-asenov
Member

Unfortunately, we'll have to roll this back again as it caused another preexisting issue with the Algebraic Simplifier to surface. We'll look into it next week; I don't have a reproducer at the moment.

@shraiysh
Contributor Author

> Unfortunately, we'll have to roll this back again as it caused another preexisting issue with the Algebraic Simplifier to surface. We'll look into it next week; I don't have a reproducer at the moment.

Sure. No problem. Please let me know how I can help.

@shraiysh shraiysh reopened this Feb 3, 2025
copybara-service bot pushed a commit that referenced this pull request Feb 3, 2025
Imported from GitHub PR #21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
3147ec9 by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

--
a435fbd by Shraiysh Vaishay <[email protected]>:

Relay control dependencies while converting fusion to call (extractor)

Merging this change closes #21375

FUTURE_COPYBARA_INTEGRATE_REVIEW=#21375 from shraiysh:while_loop_analysis a435fbd
PiperOrigin-RevId: 722709283
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Feb 3, 2025
Imported from GitHub PR openxla/xla#21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
3147ec926aa1c6fdfa2f4376668434c9a2fbeb87 by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

--
a435fbd2eadc17269d7bccbe141dcf7a21cc20e8 by Shraiysh Vaishay <[email protected]>:

Relay control dependencies while converting fusion to call (extractor)

Merging this change closes #21375

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis a435fbd2eadc17269d7bccbe141dcf7a21cc20e8
PiperOrigin-RevId: 722709283
copybara-service bot pushed a commit that referenced this pull request Feb 5, 2025
Imported from GitHub PR #21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
3147ec9 by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

--
a435fbd by Shraiysh Vaishay <[email protected]>:

Relay control dependencies while converting fusion to call (extractor)

Merging this change closes #21375

FUTURE_COPYBARA_INTEGRATE_REVIEW=#21375 from shraiysh:while_loop_analysis a435fbd
PiperOrigin-RevId: 722709283
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Feb 5, 2025
Imported from GitHub PR openxla/xla#21375

In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
Copybara import of the project:

--
3147ec926aa1c6fdfa2f4376668434c9a2fbeb87 by Shraiysh Vaishay <[email protected]>:

[ds-fusion] Get While loop analysis with copy fusion

In later stages of optimization, there are instances of copy fusion on
the parameter of the while body. With this, we need to allow inlining of
fusions while getting the induction variable index, otherwise we cannot
deduce the tuple index.

--
a435fbd2eadc17269d7bccbe141dcf7a21cc20e8 by Shraiysh Vaishay <[email protected]>:

Relay control dependencies while converting fusion to call (extractor)

Merging this change closes #21375

Reverts changelist 723246423

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21375 from shraiysh:while_loop_analysis a435fbd2eadc17269d7bccbe141dcf7a21cc20e8
PiperOrigin-RevId: 722709283
@dimitar-asenov
Member

This change was re-merged. Let's hope that was the last issue.

@copybara-service copybara-service bot closed this in 581538c Feb 5, 2025
MichaelHudgins pushed a commit to tensorflow/tensorflow that referenced this pull request Feb 5, 2025
723477680  by A. Unique TensorFlower<[email protected]>:

    [XLA] Tag timeout tests as `not_run:arm`

    Similarly to cl/722883015 tagging also:

    * //third_party/tensorflow/compiler/xla/python/transfer:socket_bulk_transport_test
    * //third_party/tensorflow/compiler/xla/python/transfer:socket-server_test
    * //third_party/tensorflow/compiler/xla/python/transfer:event_loop_test

--
723476384  by A. Unique TensorFlower<[email protected]>:

    Parse XLA_FLAGS environment variable every time, conditionally on xla_flags_reset flag.

--
723471749  by A. Unique TensorFlower<[email protected]>:

    [XLA:GPU] Rename `IsSyncCollective` and move to a GPU specific file.

    The implementation is specific to the GPU backend.

--
723470593  by A. Unique TensorFlower<[email protected]>:

    [XLA:GPU] move DotDecompose out of simplification pipeline

    That seems to be a better approach than moving TransposeFold to simplification-2 in 961e5c25fbd4082a1ac4f2e0865ad28163d12f7d:

    1. There is a report that the previous change resulted in a perf degradation: openxla/xla#22081

    2. I have found another case where DotDecompose competes with algsimp. Added a test for that.

    Overall, running a pass that expands operations together with passes that try to simplify them invites such infinite loops.

    ---

    For archeologists:

    The passes DotDimensionSorter and DotDecomposer were added along with GpuAlgebraicSimplifier because the simplifier could previously add multiple contracting dimensions to a dot. But cudnn does not support dots with 2+ dimensions, forcing us to use a less efficient loop emitter. That is what the "// AlgebraicSimplifier may add contracting dimensions to a dot." comment was about.

    After a while, the simplifier started to use supports_non_canonical_dots to guard against this case, so it should be safe to remove the dot decomposer and friends.

--
723469960  by A. Unique TensorFlower<[email protected]>:

    PR #22334: [ROCm] Fix flaky gpu compiler test when building with rocm

    Imported from GitHub PR openxla/xla#22334

    This change fixes the flaky gpu compiler test used to run on rocm CI pipeline gate.
    Triton pipeline was wrongly using the TritonGPUAccelerateMatmul pass which supports cuda only.
    In rocm there is a different pass which is now used in the rocm pipeline.

    https://github.com/triton-lang/triton/blob/main/third_party/amd/lib/TritonAMDGPUTransforms/AccelerateAMDMatmul.cpp
    Copybara import of the project:

    --
    c5f600f03aa87d155bb624bedb0584e635af190e by Alexandros Theodoridis <[email protected]>:

    Fix flaky gpu compiler test when building with rocm

    Merging this change closes #22334

--
723453199  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723445422  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723443292  by A. Unique TensorFlower<[email protected]>:

    [pjrt] Removed deprecated `PjRtBuffer::CopyToDevice`

--
723434255  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723430683  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723426786  by A. Unique TensorFlower<[email protected]>:

    PR #22258: [GPU][NFC] Avoid always printing complete PGLE profiles.

    Imported from GitHub PR openxla/xla#22258

    Copybara import of the project:

    --
    025352635a155e447559d83c471369559aad5981 by Ilia Sergachev <[email protected]>:

    [GPU][NFC] Avoid always printing complete PGLE profiles.

    Merging this change closes #22258

--
723426773  by A. Unique TensorFlower<[email protected]>:

    PR #21375: [ds-fusion] Get While loop analysis with copy fusion

    Imported from GitHub PR openxla/xla#21375

    In later stages of optimization, there are instances of copy fusion on the parameter of the while body. With this, we need to allow inlining of fusions while getting the induction variable index, otherwise we cannot deduce the tuple index.
    Copybara import of the project:

    --
    3147ec926aa1c6fdfa2f4376668434c9a2fbeb87 by Shraiysh Vaishay <[email protected]>:

    [ds-fusion] Get While loop analysis with copy fusion

    In later stages of optimization, there are instances of copy fusion on
    the parameter of the while body. With this, we need to allow inlining of
    fusions while getting the induction variable index, otherwise we cannot
    deduce the tuple index.

    --
    a435fbd2eadc17269d7bccbe141dcf7a21cc20e8 by Shraiysh Vaishay <[email protected]>:

    Relay control dependencies while converting fusion to call (extractor)

    Merging this change closes #21375

--
723425710  by A. Unique TensorFlower<[email protected]>:

    [XLA] Add const reference versions of `ForEachInstructionWithPred` and `ForEachInstructionWithOpcode`.

    These are more permissive and semantically equivalent.

--
723425622  by A. Unique TensorFlower<[email protected]>:

    Remove dead code (NFC)

    We compute the total number of tiles in a variable `num_tiles` but then never
    use it. So remove it.

--
723419822  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723402058  by A. Unique TensorFlower<[email protected]>:

    compat: Update forward compatibility horizon to 2025-02-05

--
723401869  by A. Unique TensorFlower<[email protected]>:

    Update GraphDef version to 2129.

--
723396271  by A. Unique TensorFlower<[email protected]>:

    [XLA] Support different operand and result types in AlgebraicSimplifierVisitor::HandlePad.

    I checked that none of the other cases in HandlePad require any adjustments.

--
723389764  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723370461  by A. Unique TensorFlower<[email protected]>:

    Use matchers_oss in vendor code

--
723367856  by A. Unique TensorFlower<[email protected]>:

    Update users of TSL headers and targets to new location in XLA

    Updating:
     - `env.h`
     - `env_time.h`
     - `errors.h`
     - `file_statistics.h`
     - `file_system.h`
     - `file_system_helper.h`
     - `logging.h`
     - `macros.h`
     - `status.h`
     - `status_matchers.h`
     - `status_to_from_proto.h`
     - `statusor.h`
     - `test.h`
     - `test_benchmark.h`
     - `threadpool.h`
     - `threadpool_async_executor.h`
     - `threadpool_interface.h`
     - `threadpool_options.h`
     - `types.h`

    and associated targets.

--
723349025  by A. Unique TensorFlower<[email protected]>:

    Fix inference request analysis aggregated on batch size by aggregating only the requests included in a single batch, as a large request split into multiple batches will introduce confusing results (e.g. the device time will be the sum of the processing times of the 2 batches).

--
723344172  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723340771  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723337100  by A. Unique TensorFlower<[email protected]>:

    Automated Code Change

--
723321370  by A. Unique TensorFlower<[email protected]>:

    Stop modifying the TraceEventsContainer in DoStoreAsLevelDbTable. This behavior
    is not intuitive (modifying a const value that was passed in) and unnecessary.

--
723307829  by A. Unique TensorFlower<[email protected]>:

    Automated rollback of changelist 723246423.

--
723278167  by A. Unique TensorFlower<[email protected]>:

    Update users of TSL headers and targets to new location in XLA

    Updating:
     - `env.h`
     - `env_time.h`
     - `errors.h`
     - `file_statistics.h`
     - `file_system.h`
     - `file_system_helper.h`
     - `logging.h`
     - `macros.h`
     - `status.h`
     - `status_matchers.h`
     - `status_to_from_proto.h`
     - `statusor.h`
     - `test.h`
     - `test_benchmark.h`
     - `threadpool.h`
     - `threadpool_async_executor.h`
     - `threadpool_interface.h`
     - `threadpool_options.h`
     - `types.h`

    and associated targets.

--
723265881  by A. Unique TensorFlower<[email protected]>:

    Add the list of Qualcomm SoCs supporting NPU.

--
723248792  by A. Unique TensorFlower<[email protected]>:

    Add Q/DQ annotation lowering support.

    LowerQuantAnnotationsPass now supports quant.quantize and quant.dequantize composite lowering. These patterns make adjustments to the function signatures if necessary.

--
723246423  by A. Unique TensorFlower<[email protected]>:

    PR #85476: Support Qnn Wrappers for LiteRt

    Imported from GitHub PR #85476

    # WHAT
    - Basic wrapper for QNN types, handle dynamic resources along with wrapper instances.
    - Make these wrappers independent of LiteRT/tflite
    - Only depend on QNN and STL

    ### `ScalarParamWrapper`
    - Wrap `Qnn_Param_t` with `QNN_PARAMTYPE_SCALAR` for `paramType`
    - Choose correct `QNN_DATATYPE` based on the data type

    ### `TensorParamWrapper`
    - Wrap `Qnn_Param_t` with `QNN_PARAMTYPE_TENSOR` for `paramType`

    ### `UndefinedQuantizeParamsWrapper`
    - Wrap `Qnn_QuantizeParams_t`
    - Default for quantization parameter

    ### `ScaleOffsetQuantizeParamsWrapper`
    - Wrap `Qnn_QuantizeParams_t` for per-tensor quantization

    ### `AxisScaleOffsetQuantizeParamsWrapper`
    - Wrap `Qnn_QuantizeParams_t`  for per-axis quantization

    ### `TensorWrapper`
    - Wrap `Qnn_TensorType_t`
    - Handle dynamic resource, e.g. name, dimensions, weight data.

    ### `OpWrapper`
    - Wrap `Qnn_OpConfig_t`
    - Handle dynamic resource, e.g. name, input output tensors, params
    Copybara import of the project:

    --
    4833a20 by weilhuan-quic <[email protected]>:

    LiteRt Qualcomm wrappers

    --
    725f571 by weilhuan-quic <[email protected]>:

    TensorWrapper GetDataTypeSize() return bytes instead of bits

    --
    dd3f251 by weilhuan-quic <[email protected]>:

    comment qnn_lib_headers

    --
    06e0616 by weilhuan-quic <[email protected]>:

    Change license

    Merging this change closes #85476

--

PiperOrigin-RevId: 723477680