Is there any way to accelerate the compilation? #3122

Open
zhuzhzh opened this issue Jul 3, 2024 · 3 comments

@zhuzhzh
Contributor

zhuzhzh commented Jul 3, 2024

% time make
g++ -g example.cpp -I/home/public/spdlog/include -L/home/public/spdlog/lib64 -lspdlog -DSPDLOG_COMPILED_LIB -I/home/public/spdlog/include -lpthread -o logger
make  3.43s user 0.26s system 97% cpu 3.782 total

example.cpp is the example that ships with spdlog.

I precompiled spdlog into a shared library.

Here is the profile.

% g++ -g example.cpp -I/home/public/spdlog/include -L/home/public/spdlog/lib64 -lspdlog -DSPDLOG_COMPILED_LIB -I/home/public/spdlog/include -lpthread -o logger -ftime-report

Time variable                                   usr           sys          wall           GGC
 phase setup                        :   0.00 (  0%)   0.01 (  0%)   0.00 (  0%)  1562k (  0%)
 phase parsing                      :   1.13 ( 26%)   1.06 ( 52%)   2.20 ( 34%)   160M ( 40%)
 phase lang. deferred               :   0.80 ( 18%)   0.35 ( 17%)   1.15 ( 18%)    85M ( 21%)
 phase opt and generate             :   2.28 ( 53%)   0.62 ( 30%)   2.90 ( 45%)   150M ( 38%)
 phase last asm                     :   0.12 (  3%)   0.00 (  0%)   0.13 (  2%)  1725k (  0%)
 phase finalize                     :   0.01 (  0%)   0.01 (  0%)   0.00 (  0%)     0  (  0%)
 |name lookup                       :   0.35 (  8%)   0.26 ( 13%)   0.65 ( 10%)  6882k (  2%)
 |overload resolution               :   0.52 ( 12%)   0.24 ( 12%)   0.88 ( 14%)    59M ( 15%)
 garbage collection                 :   0.31 (  7%)   0.00 (  0%)   0.30 (  5%)     0  (  0%)
 dump files                         :   0.15 (  3%)   0.07 (  3%)   0.22 (  3%)     0  (  0%)
 callgraph construction             :   0.19 (  4%)   0.02 (  1%)   0.22 (  3%)    23M (  6%)
 callgraph optimization             :   0.04 (  1%)   0.05 (  2%)   0.05 (  1%)  4728  (  0%)
 callgraph ipa passes               :   0.16 (  4%)   0.20 ( 10%)   0.35 (  5%)    12M (  3%)
 ipa function summary               :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)   242k (  0%)
 ipa inheritance graph              :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)    14k (  0%)
 ipa inlining heuristics            :   0.02 (  0%)   0.00 (  0%)   0.05 (  1%)   384  (  0%)
 ipa pure const                     :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 cfg construction                   :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)   598k (  0%)
 cfg cleanup                        :   0.03 (  1%)   0.00 (  0%)   0.02 (  0%)    20k (  0%)
 trivially dead code                :   0.01 (  0%)   0.01 (  0%)   0.02 (  0%)     0  (  0%)
 df scan insns                      :   0.09 (  2%)   0.01 (  0%)   0.15 (  2%)   114k (  0%)
 df live regs                       :   0.04 (  1%)   0.01 (  0%)   0.05 (  1%)     0  (  0%)
 df reg dead/unused notes           :   0.03 (  1%)   0.02 (  1%)   0.03 (  0%)  1178k (  0%)
 register information               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 alias analysis                     :   0.03 (  1%)   0.00 (  0%)   0.00 (  0%)   517k (  0%)
 register scan                      :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)  1280  (  0%)
 rebuild jump labels                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)   240  (  0%)
 preprocessing                      :   0.09 (  2%)   0.20 ( 10%)   0.39 (  6%)  3079k (  1%)
 parser (global)                    :   0.19 (  4%)   0.30 ( 15%)   0.37 (  6%)    31M (  8%)
 parser struct body                 :   0.07 (  2%)   0.08 (  4%)   0.20 (  3%)    20M (  5%)
 parser enumerator list             :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)   268k (  0%)
 parser function body               :   0.08 (  2%)   0.09 (  4%)   0.20 (  3%)  5444k (  1%)
 parser inl. func. body             :   0.05 (  1%)   0.04 (  2%)   0.06 (  1%)  6614k (  2%)
 parser inl. meth. body             :   0.20 (  5%)   0.10 (  5%)   0.18 (  3%)    13M (  3%)
 template instantiation             :   1.15 ( 26%)   0.54 ( 26%)   1.69 ( 26%)   132M ( 33%)
 constant expression evaluation     :   0.02 (  0%)   0.03 (  1%)   0.07 (  1%)   984k (  0%)
 inline parameters                  :   0.03 (  1%)   0.05 (  2%)   0.00 (  0%)   997k (  0%)
 integration                        :   0.00 (  0%)   0.02 (  1%)   0.04 (  1%)  1699k (  0%)
 tree gimplify                      :   0.04 (  1%)   0.03 (  1%)   0.06 (  1%)  9578k (  2%)
 tree eh                            :   0.02 (  0%)   0.01 (  0%)   0.02 (  0%)  2360k (  1%)
 tree CFG construction              :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)  4430k (  1%)
 tree CFG cleanup                   :   0.04 (  1%)   0.01 (  0%)   0.04 (  1%) 10072  (  0%)
 tree PHI insertion                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   490k (  0%)
 tree SSA rewrite                   :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)  2066k (  1%)
 tree SSA other                     :   0.01 (  0%)   0.02 (  1%)   0.06 (  1%)   379k (  0%)
 tree SSA incremental               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    32k (  0%)
 tree operand scan                  :   0.00 (  0%)   0.04 (  2%)   0.03 (  0%)  4396k (  1%)
 tree FRE                           :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   170k (  0%)
 tree forward propagate             :   0.00 (  0%)   0.01 (  0%)   0.00 (  0%)  5864  (  0%)
 PHI merge                          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)  2016  (  0%)
 dominance frontiers                :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)     0  (  0%)
 dominance computation              :   0.02 (  0%)   0.03 (  1%)   0.02 (  0%)     0  (  0%)
 out of ssa                         :   0.04 (  1%)   0.00 (  0%)   0.00 (  0%)   231k (  0%)
 expand vars                        :   0.01 (  0%)   0.01 (  0%)   0.01 (  0%)   968k (  0%)
 expand                             :   0.12 (  3%)   0.02 (  1%)   0.11 (  2%)    13M (  3%)
 post expand cleanups               :   0.01 (  0%)   0.01 (  0%)   0.03 (  0%)  1490k (  0%)
 varconst                           :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)  7032  (  0%)
 jump                               :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)     0  (  0%)
 forward prop                       :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)  2632  (  0%)
 CSE                                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)  7744  (  0%)
 loop init                          :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)  2188k (  1%)
 loop fini                          :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 branch prediction                  :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    62k (  0%)
 combiner                           :   0.01 (  0%)   0.01 (  0%)   0.01 (  0%)   109k (  0%)
 integrated RA                      :   0.24 (  6%)   0.05 (  2%)   0.40 (  6%)    59M ( 15%)
 LRA non-specific                   :   0.10 (  2%)   0.04 (  2%)   0.15 (  2%)   519k (  0%)
 LRA virtuals elimination           :   0.01 (  0%)   0.00 (  0%)   0.03 (  0%)  1179k (  0%)
 LRA reload inheritance             :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)  4368  (  0%)
 LRA create live ranges             :   0.03 (  1%)   0.00 (  0%)   0.04 (  1%)    15k (  0%)
 reload                             :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)    57k (  0%)
 thread pro- & epilogue             :   0.06 (  1%)   0.00 (  0%)   0.04 (  1%)  4127k (  1%)
 shorten branches                   :   0.06 (  1%)   0.00 (  0%)   0.03 (  0%)   240  (  0%)
 reg stack                          :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)  3504  (  0%)
 final                              :   0.08 (  2%)   0.03 (  1%)   0.14 (  2%)  5623k (  1%)
 symout                             :   0.22 (  5%)   0.04 (  2%)   0.32 (  5%)    36M (  9%)
 initialize rtl                     :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    12k (  0%)
 early local passes                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 rest of compilation                :   0.21 (  5%)   0.03 (  1%)   0.26 (  4%)  6528k (  2%)
 unaccounted late compilation       :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 repair loop structures             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 TOTAL                              :   4.34          2.05          6.38          399M
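
A report this long is easier to read once it is ranked by wall time. The following is a minimal sketch, not part of spdlog or GCC, that parses rows in the layout shown above and returns the largest ones. The names `TimeEntry`, `parse_time_report`, and `top_by_wall` are hypothetical, and the column layout is assumed to match this report; other GCC versions may print it differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One row of the report: the entry name and its wall-clock seconds.
// (Hypothetical helper type, introduced only for this sketch.)
struct TimeEntry {
    std::string name;
    double wall_seconds;
};

// Strip leading and trailing spaces and tabs.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Parse rows shaped like
//   " phase parsing : 1.13 ( 26%) 1.06 ( 52%) 2.20 ( 34%) 160M ( 40%)"
// Lines without a ':' (the header, the command line) and the TOTAL row are skipped.
std::vector<TimeEntry> parse_time_report(const std::string& report) {
    std::vector<TimeEntry> entries;
    std::istringstream lines(report);
    std::string line;
    while (std::getline(lines, line)) {
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;
        const std::string name = trim(line.substr(0, colon));
        if (name.empty() || name == "TOTAL") continue;

        // Blank out '(', ')' and '%' so the columns read as plain numbers.
        std::string cols_text = line.substr(colon + 1);
        for (char& c : cols_text) {
            if (c == '(' || c == ')' || c == '%') c = ' ';
        }
        std::istringstream cols(cols_text);
        double usr = 0, usr_pct = 0, sys = 0, sys_pct = 0, wall = 0;
        // Column order assumed: usr, usr%, sys, sys%, wall.
        if (cols >> usr >> usr_pct >> sys >> sys_pct >> wall) {
            entries.push_back({name, wall});
        }
    }
    return entries;
}

// Return the n entries with the largest wall time, largest first.
std::vector<TimeEntry> top_by_wall(std::vector<TimeEntry> entries, std::size_t n) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const TimeEntry& a, const TimeEntry& b) {
                         return a.wall_seconds > b.wall_seconds;
                     });
    if (entries.size() > n) entries.resize(n);
    return entries;
}
```

Applied to the report above, the largest wall-time rows are phase opt and generate (2.90), phase parsing (2.20), and template instantiation (1.69). The `phase` rows are aggregates, so they overlap with the individual entries below them.
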
@tt4g
Contributor

tt4g commented Jul 3, 2024

Using the latest version of the fmt library instead of the fmt library bundled with spdlog may reduce the compile time.

@zhuzhzh
Contributor Author

zhuzhzh commented Jul 3, 2024

I used the latest external fmt library, but I didn't see any improvement.

The first line of example.cpp:

#define SPDLOG_FMT_EXTERNAL

I commented out user_defined_example(), which otherwise causes a compilation error.

Then I recompiled the example.

g++ -g example.cpp -I/home/public/fmt/include -L/home/public/fmt/lib64 -lfmt -I/home/public/spdlog/include -L/home/public/spdlog/lib64 -lspdlog -DSPDLOG_COMPILED_LIB    -I/home/public/spdlog/include -lpthread -o logger
make  3.16s user 0.18s system 97% cpu 3.451 total

@tt4g
Contributor

tt4g commented Jul 3, 2024

In that case, the compile time probably cannot be reduced any further.
