@dtcxzyw dtcxzyw commented Mar 11, 2025

Link: llvm/llvm-project#130742
Requested by: @dtcxzyw

@github-actions github-actions bot mentioned this pull request Mar 11, 2025

dtcxzyw commented Mar 11, 2025

Diff mode

runner: ariselab-64c-v2
baseline: llvm/llvm-project@1ddf180
patch: llvm/llvm-project#130742
sha256: c728f3057414efa0b34de706eebab4e0f83853f8c3ab2f5f699c1c4b27bd1c37
commit: 1b4f77c

2392 files changed, 2485570 insertions(+), 2513022 deletions(-)

Improvements:
  dse.NumRedundantStores 33619 -> 39537 +17.60%
  dse.NumCFGSuccess 7270 -> 8018 +10.29%
  dse.NumCompletePartials 24672 -> 25813 +4.62%
  dse.NumFastOther 138299 -> 142146 +2.78%
  dse.NumGetDomMemoryDefPassed 1149882 -> 1153261 +0.29%
  dse.NumFastStores 948277 -> 949384 +0.12%
  instcombine.NegatorTotalNegationsAttempted 19285826 -> 19298131 +0.06%
  instcombine.NegatorNumValuesVisited 20045639 -> 20058134 +0.06%
  instcombine.NumTwoIterations 16929904 -> 16937621 +0.05%
  instcombine.NumSunkInst 2880923 -> 2882099 +0.04%
Regressions:
  dse.NumCFGTries 47260 -> 45552 -3.61%
  dse.NumCFGChecks 495246 -> 480287 -3.02%
  correlated-value-propagation.NumAShrsConverted 4458 -> 4373 -1.91%
  correlated-value-propagation.NumAShrsRemoved 193 -> 190 -1.55%
  gvn.NumGVNEqProp 355605 -> 352119 -0.98%
  local.NumPHICSEs 157194 -> 156829 -0.23%
  bdce.NumRemoved 334828 -> 334542 -0.09%
  gvn.NumGVNSimpl 4091238 -> 4088041 -0.08%
  memcpyopt.NumMoveToCpy 14828 -> 14817 -0.07%
  instcombine.NumPHICSEs 2190265 -> 2188788 -0.07%
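The percentages in the table above can be re-derived from the two counter values. The following is a minimal sketch, assuming each line has the shape `name before -> after +x.xx%`; the helper names `parse_stat` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
import re

# Assumed line shape: "<pass>.<counter> <before> -> <after> <signed percent>%"
STAT_RE = re.compile(r"^\s*(\S+)\s+(\d+)\s*->\s*(\d+)\s+([+-]\d+\.\d+)%\s*$")


def parse_stat(line):
    """Split one statistics line into (name, before, after, reported_percent)."""
    match = STAT_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized statistics line: {line!r}")
    name, before, after, reported = match.groups()
    return name, int(before), int(after), float(reported)


def relative_change(before, after):
    """Relative change of a counter, in percent of the baseline value."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    name, before, after, reported = parse_stat(
        "  dse.NumRedundantStores 33619 -> 39537 +17.60%"
    )
    # Compare the recomputed value with the percentage printed in the report.
    print(name, round(relative_change(before, after), 2), reported)
```

The sign of the change is what places a counter under Improvements or Regressions in this report; whether a lower counter such as dse.NumCFGTries is actually worse depends on the pass.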

8 15 bench/abc/optimized/giaTtopt.ll
4 3 bench/abseil-cpp/optimized/parse.ll
11 10 bench/abseil-cpp/optimized/time_zone_info.ll
15 21 bench/arrow/optimized/codegen_internal.ll
6 5 bench/arrow/optimized/datetime.ll
10 16 bench/arrow/optimized/diff.ll
2 3 bench/arrow/optimized/vector_selection.ll
3 7 bench/assimp/optimized/LWSLoader.ll
4 7 bench/assimp/optimized/clipper.ll
7 10 bench/boost/optimized/shared_work.ll
2 3 bench/casadi/optimized/c_api_usage.ll
3 4 bench/casadi/optimized/dae_builder.ll
24 30 bench/ceres/optimized/c_api.ll
2 5 bench/ceres/optimized/schur_complement_solver.ll
2 3 bench/cmake/optimized/cmCTestMemCheckHandler.ll
2 3 bench/cmake/optimized/cmStateSnapshot.ll
1 4 bench/cvc5/optimized/conjecture_generator.ll
14 28 bench/cvc5/optimized/node_manager.ll
4 7 bench/darktable/optimized/Camera.ll
29 35 bench/darktable/optimized/TiffEntry.ll
3 4 bench/draco/optimized/point_attribute.ll
3 2 bench/duckdb/optimized/filtered_re2.ll
2 3 bench/eastl/optimized/TestAlgorithm.ll
3 6 bench/entt/optimized/meta_context.ll
8 6 bench/entt/optimized/storage_entity.ll
3 6 bench/faiss/optimized/IndexBinaryHash.ll
8 14 bench/folly/optimized/AsyncSocket.ll
24 26 bench/folly/optimized/TimeoutManager.ll
2 6 bench/gromacs/optimized/colvar.ll
2 6 bench/gromacs/optimized/colvarvalue.ll
3 4 bench/gromacs/optimized/densityfittingforce.ll
3 9 bench/gromacs/optimized/integrator.ll
2 3 bench/gromacs/optimized/manager.ll
4 6 bench/grpc/optimized/evaluate_args.ll
4 3 bench/grpc/optimized/hpack_parser.ll
1 2 bench/hermes/optimized/FileCheck.ll
2 1 bench/hermes/optimized/ISel.ll
1 6 bench/hermes/optimized/escape.ll
2 3 bench/hyperscan/optimized/ComponentRepeat.ll
9 14 bench/hyperscan/optimized/limex_64.ll
3 6 bench/hyperscan/optimized/rose_build_anchored.ll
22 25 bench/icu/optimized/icuexportdata.ll
4 6 bench/jsonnet/optimized/desugarer.ll
4 10 bench/libquic/optimized/ip_address.ll
2 6 bench/libquic/optimized/quic_protocol.ll
17 23 bench/libzmq/optimized/options.ll
2 5 bench/lief/optimized/DynamicEntryArray.ll
11 12 bench/lief/optimized/internal_utils.ll
3 6 bench/lightgbm/optimized/boosting.ll
24 29 bench/llama.cpp/optimized/sampling.ll
25 24 bench/llvm/optimized/Compilation.ll
23 38 bench/llvm/optimized/MachineLICM.ll
7 22 bench/llvm/optimized/ReachingDefAnalysis.ll
5 4 bench/llvm/optimized/RegisterUsageInfo.ll
18 33 bench/llvm/optimized/VarLenCodeEmitterGen.ll
4 7 bench/meshlab/optimized/baseio.ll
1 5 bench/minetest/optimized/CGUIFont.ll
2 8 bench/minetest/optimized/mapblock_mesh.ll
1 9 bench/minetest/optimized/server.ll
1 4 bench/minetest/optimized/serverenvironment.ll
7 8 bench/mold/optimized/lto-unix.cc.X86_64.ll
3 6 bench/ncnn/optimized/benchncnn.ll
5 7 bench/ninja/optimized/build_test.ll
4 6 bench/nix/optimized/lock.ll
2 6 bench/nlohmann_json/optimized/unit-testsuites.ll
1 5 bench/nlohmann_json/optimized/use_v3_10_5.ll
8 9 bench/node/optimized/libnode.Protocol.ll
3 2 bench/node/optimized/libnode.crypto_tls.ll
7 10 bench/node/optimized/libnode.histogram.ll
2 12 bench/node/optimized/libnode.node_contextify.ll
2 5 bench/node/optimized/libnode.node_messaging.ll
5 14 bench/node/optimized/libnode.node_trace_buffer.ll
20 23 bench/ocio/optimized/NoOps.ll
3 4 bench/opencv/optimized/block_mean_hash.ll
5 7 bench/opencv/optimized/objectnessBING.ll
3 5 bench/opencv/optimized/pct_sampler.ll
1 1 bench/openjdk/optimized/os_linux.ll
2 3 bench/openspiel/optimized/cfr_br_test.ll
11 7 bench/openusd/optimized/patchMap.ll
9 11 bench/openusd/optimized/textureUtils.ll
3 7 bench/openusd/optimized/valueTypeRegistry.ll
3 9 bench/openvdb/optimized/Archive.ll
0 4 bench/openvdb/optimized/Queue.ll
4 9 bench/ozz-animation/optimized/gltf2ozz.ll
29 31 bench/pbrt-v4/optimized/lights.ll
4 7 bench/pocketpy/optimized/vm.ll
0 3 bench/quantlib/optimized/barrieroption.ll
3 6 bench/re2/optimized/set.ll
2 3 bench/rocksdb/optimized/backup_engine.ll
6 10 bench/rocksdb/optimized/tiered_secondary_cache.ll
12 20 bench/rocksdb/optimized/trace_record.ll
3 9 bench/sentencepiece/optimized/bpe_model_trainer.ll
3 9 bench/spdlog/optimized/async.ll
10 16 bench/spdlog/optimized/color_sinks.ll
3 9 bench/spdlog/optimized/spdlog.ll
2 6 bench/stb/optimized/stb_sprintf.ll
4 6 bench/stockfish/optimized/uci.ll
1 5 bench/taskflow/optimized/subflow.ll
5 7 bench/tomlplusplus/optimized/toml.ll
2 3 bench/vcpkg/optimized/commands.build.ll
26 39 bench/vcpkg/optimized/commands.list.ll
10 16 bench/vcpkg/optimized/vcpkgpaths.ll
31 43 bench/wasmedge/optimized/canon.ll
24 22 bench/wasmedge/optimized/compiler.ll
9 11 bench/wasmedge/optimized/environ.ll
16 10 bench/xgboost/optimized/gradient_index_format.ll
38 41 bench/xgboost/optimized/quantile_dmatrix.ll
25 37 bench/yalantinglibs/optimized/benchmark.ll
1 7 bench/yalantinglibs/optimized/channel.ll
6 12 bench/yalantinglibs/optimized/data_gen.ll
2 3 bench/yaml-cpp/optimized/regex_yaml.ll
11 14 bench/yoga/optimized/YGNodeStyle.ll
14 17 bench/yosys/optimized/json11.ll
40 46 bench/yosys/optimized/ql_dsp_io_regs.ll
20 18 bench/z3/optimized/nla_core.ll
2 6 bench/zxing/optimized/PDFModulusGF.ll
3 4 bench/zxing/optimized/QRVersion.ll
4 3 bench/zxing/optimized/ReedSolomonDecoder.ll
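The per-file lines above look like `git diff --numstat` output (added lines, deleted lines, path). That reading is an assumption, as are the helper names below. Under that assumption, a short sketch for totaling the changes per benchmark project:

```python
from collections import defaultdict


def parse_numstat(line):
    """Parse '<added> <deleted> <path>' into (added, deleted, path)."""
    added, deleted, path = line.split(maxsplit=2)
    return int(added), int(deleted), path


def totals_by_project(lines):
    """Sum added/deleted counts per project, taken as the second path component."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between entries
        added, deleted, path = parse_numstat(line)
        project = path.split("/")[1]  # e.g. "bench/llvm/optimized/..." -> "llvm"
        totals[project][0] += added
        totals[project][1] += deleted
    return {project: tuple(counts) for project, counts in totals.items()}


if __name__ == "__main__":
    sample = [
        "23 38 bench/llvm/optimized/MachineLICM.ll",
        "7 22 bench/llvm/optimized/ReachingDefAnalysis.ll",
        "8 15 bench/abc/optimized/giaTtopt.ll",
    ]
    print(totals_by_project(sample))
```

Grouping by project makes it easier to see which code bases are most affected by the patch.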

@github-actions
Contributor

Summary of Major Changes in the LLVM IR Patch

The five major changes below were observed across multiple files in the patch:

  1. Optimization of llvm.memset Calls:

    • Several llvm.memset calls now cover larger memory regions (e.g., 24 bytes instead of 16). A single call replaces what used to be several separate initializing instructions, so fewer instructions are emitted for the same initialization.
    • Example: In giaTtopt.ll, one llvm.memset call now initializes 24 bytes, replacing separate 16-byte and 8-byte initializations.
  2. Removal of Redundant getelementptr Instructions:

    • Many redundant getelementptr instructions that computed offsets for null-pointer stores have been eliminated. The affected bytes are instead covered directly by llvm.memset or other memory operations.
    • Example: In parse.ll, the getelementptr for a null pointer is removed, and the llvm.memset call handles the entire initialization.
  3. Simplification of PHI Nodes:

    • PHI nodes have been simplified by replacing them with direct values where possible. For example, instead of using a PHI node to select between a computed value and null, the code now directly uses null when appropriate.
    • Example: In colvarvalue.ll, the PHI node %69 is replaced with a direct null value in one branch.
  4. Consolidation of Memory Management Logic:

    • Multiple store operations are combined into a single llvm.memset call, which reduces the instruction count of the generated IR.
    • Example: In darktable/optimized/TiffEntry.ll, a store of a null pointer and an llvm.memset call are combined into a single 24-byte llvm.memset call.
  5. Alignment of Pointer Arithmetic:

    • Some getelementptr instructions now carry the nuw (no unsigned wrap) flag, which tells the optimizer that the address computation does not wrap in the unsigned sense.
    • Example: In node/optimized/libnode.node_contextify.ll, the getelementptr instructions are updated to include the nuw flag.
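Items 1 and 4 describe adjacent zero-initializations being folded into one larger llvm.memset. The toy model below illustrates only that idea on (offset, size) byte ranges; it is not LLVM's implementation, and the function name `merge_zero_ranges` is hypothetical.

```python
def merge_zero_ranges(ranges):
    """Merge adjacent or overlapping zero-initialized (offset, size) byte ranges."""
    merged = []  # list of [start, end) intervals, kept sorted by start
    for offset, size in sorted(ranges):
        end = offset + size
        if merged and offset <= merged[-1][1]:
            # Touches or overlaps the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


if __name__ == "__main__":
    # A 16-byte memset at offset 0 followed by an 8-byte zero store at offset 16
    # collapses into one 24-byte range, matching the giaTtopt.ll example.
    print(merge_zero_ranges([(0, 16), (16, 8)]))  # [(0, 24)]
```

Ranges separated by a gap stay separate, which mirrors why only contiguous initializations can be served by a single memset.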

High-Level Overview

The patch primarily affects memory initialization in code generated from C++ standard library types such as std::vector and std::unique_ptr. By consolidating llvm.memset calls, removing redundant instructions, and simplifying PHI nodes, it reduces the size of the generated LLVM IR, which is consistent with the diff totals above (more deletions than insertions), and may improve performance. In addition, some pointer arithmetic gains no-wrap flags.

model: qwen-plus-latest
CompletionUsage(completion_tokens=624, prompt_tokens=104518, total_tokens=105142, completion_tokens_details=None, prompt_tokens_details=None)

@dtcxzyw dtcxzyw closed this Mar 12, 2025
@dtcxzyw dtcxzyw deleted the test-run13785452877 branch March 12, 2025 09:45