Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Profiling is unecessary slow #438

Open
guibou opened this issue Sep 10, 2018 · 12 comments
Open

Profiling is unecessary slow #438

guibou opened this issue Sep 10, 2018 · 12 comments
Assignees
Labels
P3 minor: not priorized type: bug

Comments

@guibou
Copy link
Contributor

guibou commented Sep 10, 2018

This is an updated description which summarize the discussion

  • Building with -c dbg is really slow when template haskell is used because we use -fexternal-interpreter and this GHC bug: https://ghc.haskell.org/trac/ghc/ticket/16256#ticket . In summary, profiling build is 15x times slower than normal build.
  • There is two solutions to build haskell with profiling if template haskell is involved. -fexternal-interpreter, which is now used in rules_haskell, and another solution which build two times, once without profiling (so template haskell can execute non profiled code) and one with profiling.
  • rules_haskell was initially using the two build solution, but the switch to -fexternal-interpreter was accidentally done in be5cf67#diff-33a8c46955e2a9917090b5258346aee2R205. Since Drop non-profiling inputs to profiling build #892 the dead code of the two build solution was removed.

We have two solution here:

  • Keep the approach with -fexternal-interpreter and wait for an upstream fix for GHC
  • Get back to the approach without -fexternal-interpreter and reimplement the two steps build.
@guibou guibou self-assigned this Sep 10, 2018
@guibou
Copy link
Contributor Author

guibou commented Jan 25, 2019

I'm still not able to generate a small reproducible example, but I now have a lot of question related to the implementation of -prof in rules_haskell.

So, rules_haskell fails building some targets in the following context:

  • TemplateHaskell is activated
  • More than one haskell file in the src package
  • Dependencies to other haskell libraries which have cbit

When experimenting, I found that removing the -fexternal-interpreter flag from

args.add("-prof", "-fexternal-interpreter")
leads to progress, i.e. targets which cannot be built previously can now be built.

I've tried the same change in rules_haskell codebase and the result is surprising. bazel test -c dbg //... fails all tests, but bazel test -c dbg //a_specific_test works for all tests. I'll try to understand that later.

There are two ways to build with profiling using ghc and TemplateHaskell, -fexternal-interpreter and a two phases build where you first build without -prof and then with. This is detailed in https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#using-template-haskell-with-profiling

For example, this experimentation (with only one module, in this case the two phases build is not really necessary, but will be for more modules).

Foo.hs:

{-# LANGUAGE TemplateHaskell #-}

main = do
  print $([| 10 |])

Build without profiling:

[guillaume@paddle:/tmp]$ ghc Foo.hs
[1 of 1] Compiling Main             ( Foo.hs, Foo.o )
Linking Foo ...
[guillaume@paddle:/tmp]$ ./Foo 
10
[guillaume@paddle:/tmp]$ ./Foo +RTS -p
Foo: the flag -p requires the program to be built with -prof
[...]

Failure with just -prof:

[guillaume@paddle:/tmp]$ ghc Foo.hs -prof
[1 of 1] Compiling Main             ( Foo.hs, Foo.o )
Foo.hs:4:9: fatal:
    Cannot load -prof objects when GHC is built with -dynamic
    To fix this, either:
      (1) Use -fexternal-interpreter, or
      (2) Build the program twice: once with -dynamic, and then
          with -prof using -osuf to set a different object file suffix.

(Note how it discusses the two solutions)

First solution with -fexternal-interpreter:

[guillaume@paddle:/tmp]$ ghc Foo.hs -prof -fexternal-interpreter
[1 of 1] Compiling Main             ( Foo.hs, Foo.o )
Linking Foo ...

[guillaume@paddle:/tmp]$ ./Foo +RTS -p
10

(This build command takes 30s on my computer)

Second solution, with the two builds:

[guillaume@paddle:/tmp]$ ghc Foo.hs
[1 of 1] Compiling Main             ( Foo.hs, Foo.o )
Linking Foo ...

[guillaume@paddle:/tmp]$ ghc Foo.hs -prof -osuf curryforpresident
[1 of 1] Compiling Main             ( Foo.hs, Foo.curryforpresident )
Linking Foo ...

[guillaume@paddle:/tmp]$ ./Foo +RTS -p
10

This solution takes 2s to build.

-fexternal-interpreter is discussed here: https://ghc.haskell.org/trac/ghc/wiki/RemoteGHCi

Now I have many discussions points to direct my work. Currently, rules_haskell uses -fexternal-interpreter AND the two build solution at the same time, see:

args.add("-prof", "-fexternal-interpreter")

c = hs.toolchain.actions.compile_binary(
hs,
cc,
java,
dep_info,
srcs = srcs_files,
ls_modules = ctx.executable._ls_modules,
import_dir_map = import_dir_map,
extra_srcs = depset(ctx.files.extra_srcs),
compiler_flags = ctx.attr.compiler_flags,
dynamic = not ctx.attr.linkstatic,
with_profiling = False,
main_function = ctx.attr.main_function,
version = ctx.attr.version,
)
c_p = None
if with_profiling:
c_p = hs.toolchain.actions.compile_binary(
hs,
cc,
java,
dep_info,
srcs = srcs_files,
ls_modules = ctx.executable._ls_modules,
import_dir_map = import_dir_map,
# NOTE We must make the object files compiled without profiling
# available to this step for TH to work, presumably because GHC is
# linked against RTS without profiling.
extra_srcs = depset(transitive = [
depset(ctx.files.extra_srcs),
depset([c.objects_dir]),
]),
compiler_flags = ctx.attr.compiler_flags,
# NOTE We can't have profiling and dynamic code at the
# same time, see:
# https://ghc.haskell.org/trac/ghc/ticket/15394
dynamic = False,
with_profiling = True,
main_function = ctx.attr.main_function,
version = ctx.attr.version,

The -fexternal-interpreter flag was added in the middle of a pull request which seems totally unrelated:

be5cf67#diff-33a8c46955e2a9917090b5258346aee2R205

@mrkkrp could you comment please?

There is two possible directions:

  • remove -fexternal-interpreter and fix the two steps build.
  • remove the two steps build and fix the -fexternal-interpreter

Both solutions have pro and cons, mostly:

  • -fexternal-interpreter is slow! 30s for a one line file with TemplateHaskell. However I don't know if that's a fixed cost which won't grow with the source code complexity or if it is not.
  • The two step build is building two times, so may be slow for complex builds
  • -fexternal-interpreter is listed as more "secure" and versatile on its design page AND may lead to a simpler implementation in rules haskell.
  • -fexternal-interpreter is new (ghc 8.0) and not broadly tested because that's not the default solution, but it "make become in the future".

cc @mboes

@guibou
Copy link
Contributor Author

guibou commented Jan 25, 2019

#338 is related because @mrkkrp proposes to uses -fexternal-interpreter in rules_haskell. But that's already the case apparently.

@guibou guibou changed the title Haskell profiling does not link Profiling is unecessary slow Jan 25, 2019
@guibou
Copy link
Contributor Author

guibou commented Jan 25, 2019

Note that cabal uses the "two step" build solution.

I changed title's issue to reflect the current discussion. I should have created this discussion in a new issue.

@guibou
Copy link
Contributor Author

guibou commented Jan 30, 2019

Progress notes:

-fexternal-interpreter approach

Keeping -fexternal-interpreter is slow and leads to a lot of error. See the following bug report I've opened:

I'm able to build a client codebase with -c dbg, -fexternal-interpreter and a lot of custom hacks. Most of the libraries used by the client needs to be added to LD_PRELOAD before calling GHC. A few of the client executables built with -c dbg and -fexternal-interpreter are failing with a segmentation fault inside the stg_noDuplicatezh symbol. I don't know yet if that's due to the building process or if the client code wrong in any way. But that code works well without -c dbg.

I'm waiting a bit for activities on the upstream bug reports, but I'm tempted to drop -fexternal-interpreter support for now.

removing -fexternal-interpreter

I'm trying to remove -fexternal-interpreter and fix the two step build process. However it fails during the second build step when ghc is looking for objects file built during the first step.

  • During the first step, we are writing all objects in a obj/target_name directory using the -odir argument.
  • During the second step, we are writing all objects in an obj_prof/target_name directory using the -odir argument. We are also using the -osuf argument to set object extension to o_p.

During the second phase, ghc looks for o_dyn object files for its template haskell phase. However it look at them in the directory specified by -odir in the second phase, so the obj_prof/target_name directory, when they are in the obj/target_name directory. This leads to build time error similar to:

cannot find object file obj_prof/target_name/ModuleName.o_dyn

(This message is hand crafted, but it looks similar to that). Observe the o_dyn and obj_prof.

How to fix that, a few solutions:

  • Find if ghc does not have a flag similar to -odir but only for the dynamic objects of the template haskell phase. I cannot find this flag. I've asked a GHC developer yesterday about this on IRC and he told me that they are using something similar in tests, but cannot find it finally.
  • Put all objects file in the same directory. This is the simplest solution, but I don't know how to do that with bazel because directories are declared with ctx.actions.declare_directory for both the first and second build phase and bazel complains that one directory cannot be output of two distinct actions.
  • Avoid declare_directory and replace it by declare_file and declare a file for each object file generated by the build process. This solution is complex because we don't easily know the list of generated file beforehand. I think all the srcs arguments with hs, lhs, hsc generate an equivalent .o object file. I'm now going this way.

@mboes
Copy link
Member

mboes commented Jan 30, 2019

Nice writeup, but I'm failing to understand what the objective of this ticket is. The title of the ticket says profiling is "slow". But the description says there's some build error somewhere (but no repro). Which problem are you trying to solve?

@guibou
Copy link
Contributor Author

guibou commented Jan 30, 2019

@mboes

  • I've opened this ticket 4 months ago when I was unable to build using -c dbg on my client codebase.
  • I started documenting my recent work and observed things about -fexternal-interpreter and the fact that we are building two times, this lead to slow build time (-fexternal-interpreter or the double build, one of them is unnecessary).
  • I realized that this issue had more content / discussion about the refactoring of profiling build to reduce build time, so I renamed the ticket instead of creating a new one pointing to that one.

What I'm trying to solve:

  • Refactor the profiling build code to reduce the build time
  • Fix the bug in my codebase. Most of the issues are due to an upstream GHC bug which seems to only appears with nix, see https://ghc.haskell.org/trac/ghc/ticket/16257#ticket . Once that bug is fixed, I'll be able to build my client codebase with -c dbg. However it will still be "slow".
  • "slow" means impressively slow. I showed evidences in the upstream ticket https://ghc.haskell.org/trac/ghc/ticket/16256#ticket where a one liner haskell file takes 15x more time when built with -fexternal-interpreter. On my client codebase this means that I have some haskell module which takes a few hours to build with -c dbg.

Considering that the GHC linking bug won't be fixed soon (in a released GHC version), the slowdown of -fexternal-interpreter and the fact that this may impact all users of rules_haskell, I'm convinced that the solution is to fix the profiling build with -fexternal-interpreter`. Especially when I think that this option was added by mistake in the codebase.

@mboes
Copy link
Member

mboes commented May 24, 2019

There's a lot of good discussion in the comments here, but they read as a journal of incremental discoveries, so it's hard to follow. Meanwhile the ticket description is still... very minimal. @guibou could you summarize your findings in an updated ticket description?

@Profpatsch
Copy link
Contributor

ping @guibou

@guibou
Copy link
Contributor Author

guibou commented Oct 7, 2019

I've updated the ticket description and I'm closing as won't fix.

Rational: there is two way to fix this issue:

@guibou
Copy link
Contributor Author

guibou commented May 8, 2020

Actually, I'm reopening it. Profiling is still slow, even if we did choose to keep the slow implementation, the problem is still there.

@guibou
Copy link
Contributor Author

guibou commented May 12, 2020

There is no more details in the upstream ticket, but the problem seems to be fixed starting ghc 8.8. https://gitlab.haskell.org/ghc/ghc/issues/16256

Now the issue is linked to #1323 .

@guibou
Copy link
Contributor Author

guibou commented May 15, 2020

https://gitlab.haskell.org/ghc/ghc/issues/14335 ghc plugins are not working with -fexternal-interpreter is another argument against using external-interpreter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 minor: not priorized type: bug
Projects
None yet
Development

No branches or pull requests

4 participants