Skip to content
This repository has been archived by the owner on Jan 2, 2021. It is now read-only.

Opentelemetry traces and heapsize memory analysis #922

Merged
merged 63 commits into from
Dec 5, 2020

Conversation

pepeiborra
Copy link
Collaborator

@pepeiborra pepeiborra commented Nov 25, 2020

This PR builds on the work of @mpardalos and supersedes #826

A new facility to log open telemetry events and spans to the GHC event log is added. The following are logged:

  • LSP events (all of them)
  • Shake action spans

Additionally, when the flag --ot-memory-profiling is given, the following are logged periodically:

  • Count of items in the ShakeExtras "state" (Metric is called "values map size_bytes")
  • Aggregate size of items per key, with sharing, so that shared costs are not reported twice.

For the non LSP driver, the flag --ot-memory-profiling produces a textual version of the above, instead of an eventlog.

There is documentation on how to visualise the eventlogs using Tracy. The Nix script takes care of installing all the dependencies

TODOs:

Future TODOs (not for this PR):

  • Use spans to link LSP events and Shake actions, as well as Shake dependencies
  • --ot-memory-profiling should be always enabled and log only on SIGHUP(or some other signal) instead of periodically
  • Output mechanism for traces should be configurable, so that we can log to stdout, the cloud, etc.

mpardalos and others added 30 commits September 22, 2020 19:58
Instead of doing it in `HoverDefinition`, do it in
with{Response,Notification,...}. These wrap all handlers, so this should
cover everything. It also means that the span covers the entire
processing time for the request, where before we missed the setup
happening in the with* functions.
Run GC regularly with --ot-profiling
I renamed the fork to distringuish from the original.
It is still being pulled from git using stack. This will be addressed
once I can push the fork to hackage.
to avoid circular module dependencies
samuela and others added 4 commits November 29, 2020 23:09
When using ghcide as a library, it may be desirable to host the hie.yaml file
in a location other than the project root, or even avoid the file system altogether
* Favor `lookupPathToId` over `pathToId`

* Fix `typecheckParentsAction`

* Fix `needsCompilationRule`
* Use the real client capabilities on completions

* Return completion snippets only when supported by the client

Restored from haskell#900
@jneira
Copy link
Member

jneira commented Nov 30, 2020

@pepeiborra will try to take a look asap, but the error does not look good, we could disable temporaly the windows build if you want to go forward and activate it afterwards.
In hls there is no stack build for windows so will no suffer it.

# > To load a package foo, GHCi can load its libHSfoo.a library directly, but it can also load a package in the form of a single HSfoo.o file that has been pre-linked. Loading the .o file is slightly quicker, but at the expense of having another copy of the compiled package. The rule of thumb is that if the modules of the package were compiled with -split-sections then building the HSfoo.o is worthwhile because it saves time when loading the package into GHCi. Without -split-sections, there is not much difference in load time between the .o and .a libraries, so it is better to save the disk space and only keep the .a around. In a GHC distribution we provide .o files for most packages except the GHC package itself.
# > The HSfoo.o file is built by Cabal automatically; use --disable-library-for-ghci to disable it. To build one manually, the following GNU ld command can be used:

# > ld -r --whole-archive -o HSfoo.o libHSfoo.a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG 🤦

Thanks for the detailed report of the root cause

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but the fix is not enough. There is another stack issue after it.

@pepeiborra
Copy link
Collaborator Author

@jneira @ndmitchell unless one of you can get to the bottom of the problem here, I will drop the Stack Windows build as we lack the maintainer power to keep it working.

@jneira
Copy link
Member

jneira commented Nov 30, 2020

The error seems to be of the same kind:

ghcide> ghc.exe:  | D:\a\1\s\.stack-work\dist\a3a5fe88\build\HSghcide-0.5.0-GfeKdUCOB448C8Yqgu22oa.o: unknown symbol `heapsizzezm0zi3zi0zmG6oW1FfbuDgJS3obQNcUGF_HeapSizze_runHeapsizze1_closure'

Reproduced in local

pepeiborra added a commit to pepeiborra/ghcide that referenced this pull request Dec 3, 2020
The Stack Windows build is problematic:
  haskell#922

Stack is already covered by the Azure CI
Windows is already covered by the Github Actions CI
@pepeiborra pepeiborra mentioned this pull request Dec 3, 2020
@pepeiborra
Copy link
Collaborator Author

#934 opened to unblock this PR

pepeiborra added a commit to pepeiborra/ghcide that referenced this pull request Dec 5, 2020
The Stack Windows build is problematic:
  haskell#922

Stack is already covered by the Azure CI
Windows is already covered by the Github Actions CI
pepeiborra added a commit that referenced this pull request Dec 5, 2020
The Stack Windows build is problematic:
  #922

Stack is already covered by the Azure CI
Windows is already covered by the Github Actions CI
@pepeiborra pepeiborra merged commit e24a744 into haskell:master Dec 5, 2020
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
The Stack Windows build is problematic:
  haskell/ghcide#922

Stack is already covered by the Azure CI
Windows is already covered by the Github Actions CI
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
* Move tracing functions to own module

* Bump opentelemetry to 0.6.0

* Write Values map size to OpenTelemetry metric

* Trace all requests and notifications

Instead of doing it in `HoverDefinition`, do it in
with{Response,Notification,...}. These wrap all handlers, so this should
cover everything. It also means that the span covers the entire
processing time for the request, where before we missed the setup
happening in the with* functions.

* Add flag for OpenTelemetry profiling

Run GC regularly with --ot-profiling

* Add flag to enable OT profiling in benchmark

* Use heapsize instead of ghc-datasize

I renamed the fork to distringuish from the original.
It is still being pulled from git using stack. This will be addressed
once I can push the fork to hackage.

* Bump opentelemetry to 0.6.1 - fixes 8.6 build

* Use heapsize from hackage

* Address HLint messages

* Record size of each key independently

* Refactor `startTelemetry` function

* Remove delay between measuring memory loops

* Each key in values map gets own OT instrument

* Measure values map length more rarely

* Rename --ot-profiling to --ot-memory-profiling

* Add docs for how to use the opentelemetry output

* Add instructions to build release version of tracy

* Clarify dependencies in opentelemetry instructions

* Fix LSP traces

* otTraced: delete unused

* Extract types out of D.IDE.Core.Shake

to avoid circular module dependencies

* Extract startTelemetry out of D.IDE.Shake and upgrade to 0.2

No more segfaults

* [nix] install opentelemetry

* [nix] install tracy

* Fix merge wibble

* Measure recursive sizes with sharing

* Sort keys for cost attribution

* Remove debug traces

* Allocate less, group keys, clean up hlints

* Add -A4G to the flags used for --ot-memory-profiling

* Modularize D.IDE.Core.Tracing

I want to reuse this code more directly in the non lsp driver

* Direct driver: report closure sizes when --ot-memory-profiling

An eventlog memory analysis doesnt' seem so relevant since this mode is not
interactive, but we could easily produce both if wanted to

* Everything is reachable from GhcSessionIO, so compute it last

I suspect the ShakeExtras record is reachable from GhcSessionIO

* bound recursion and use logger

* hlint suggestions

* Fix 8.6 build

* Format imports

* Do the memory analysis with full sharing. GhcSessionIO last

* Fail fast in the memory analysis

* error handling

* runHeapsize now takes initSize as an input argument

* Trace Shake sessions

* Reduced frequency for sampling values length

* Drop the -fexternal-interpreter flag in the Windows stack build

* Produce more benchmark artifacts

* Fix stack descriptors to use heapsize-0.2 from Hackage

* Bump to heapsize-0.3.0

* Record completions snippets (haskell/ghcide#900)

* Add field for RecordSnippets to CachcedCompletion

* Initial version of local record snippets

* Supprt record snippet completion for non local declarations.

* Better integration of local completions with current implementation

* Clean up non-local completions.

* Remove commented code.

* Switch from String to Text

* Remove ununsed definition

* Treat only Records and leave other defintions as is.

* Differentiate Records from Data constructors for external declaration

* Update test to include snippet in local record completions expected list.

* Update completionTest to also compare insertText.

* Add test for record snippet completion for imported records.

* Hlint fixes

* Hlint fixes

* Hlint suggestions.

* Update type.

* Consolidate imports

* Unpack tuple with explicit names

* Idiomatic changes

* Remove unused variable

* Better variable name

* Hlint suggestions

* Handle exhaustive pattern warning

* Add _ to snippet field name suggestions

* Remove type information passed around but not used

* Update to list comprehension style

* Eliminate intermediate function

* HLint suggestions.

* Idiomatic list comprehension

Co-authored-by: Pepe Iborra <[email protected]>

* [nix] use gitignore.nix (haskell/ghcide#920)

* Ignore import list while producing completions (haskell/ghcide#919)

* Drop any items in explicit import list

* Test if imports not included in explicit list show up in completions

* Update README.md (haskell/ghcide#924)

* Custom cradle loading (haskell/ghcide#928)

When using ghcide as a library, it may be desirable to host the hie.yaml file
in a location other than the project root, or even avoid the file system altogether

* Favor `lookupPathToId` over `pathToId` (haskell/ghcide#926)

* Favor `lookupPathToId` over `pathToId`

* Fix `typecheckParentsAction`

* Fix `needsCompilationRule`

* Return completion snippets only when client supports it (haskell/ghcide#929)

* Use the real client capabilities on completions

* Return completion snippets only when supported by the client

Restored from haskell/ghcide#900

* Redundant import

* Fix stack windows build

Co-authored-by: Michalis Pardalos <[email protected]>
Co-authored-by: Michalis Pardalos <[email protected]>
Co-authored-by: Guru Devanla <[email protected]>
Co-authored-by: Samuel Ainsworth <[email protected]>
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
The Stack Windows build is problematic:
  haskell/ghcide#922

Stack is already covered by the Azure CI
Windows is already covered by the Github Actions CI
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
* Move tracing functions to own module

* Bump opentelemetry to 0.6.0

* Write Values map size to OpenTelemetry metric

* Trace all requests and notifications

Instead of doing it in `HoverDefinition`, do it in
with{Response,Notification,...}. These wrap all handlers, so this should
cover everything. It also means that the span covers the entire
processing time for the request, where before we missed the setup
happening in the with* functions.

* Add flag for OpenTelemetry profiling

Run GC regularly with --ot-profiling

* Add flag to enable OT profiling in benchmark

* Use heapsize instead of ghc-datasize

I renamed the fork to distringuish from the original.
It is still being pulled from git using stack. This will be addressed
once I can push the fork to hackage.

* Bump opentelemetry to 0.6.1 - fixes 8.6 build

* Use heapsize from hackage

* Address HLint messages

* Record size of each key independently

* Refactor `startTelemetry` function

* Remove delay between measuring memory loops

* Each key in values map gets own OT instrument

* Measure values map length more rarely

* Rename --ot-profiling to --ot-memory-profiling

* Add docs for how to use the opentelemetry output

* Add instructions to build release version of tracy

* Clarify dependencies in opentelemetry instructions

* Fix LSP traces

* otTraced: delete unused

* Extract types out of D.IDE.Core.Shake

to avoid circular module dependencies

* Extract startTelemetry out of D.IDE.Shake and upgrade to 0.2

No more segfaults

* [nix] install opentelemetry

* [nix] install tracy

* Fix merge wibble

* Measure recursive sizes with sharing

* Sort keys for cost attribution

* Remove debug traces

* Allocate less, group keys, clean up hlints

* Add -A4G to the flags used for --ot-memory-profiling

* Modularize D.IDE.Core.Tracing

I want to reuse this code more directly in the non lsp driver

* Direct driver: report closure sizes when --ot-memory-profiling

An eventlog memory analysis doesnt' seem so relevant since this mode is not
interactive, but we could easily produce both if wanted to

* Everything is reachable from GhcSessionIO, so compute it last

I suspect the ShakeExtras record is reachable from GhcSessionIO

* bound recursion and use logger

* hlint suggestions

* Fix 8.6 build

* Format imports

* Do the memory analysis with full sharing. GhcSessionIO last

* Fail fast in the memory analysis

* error handling

* runHeapsize now takes initSize as an input argument

* Trace Shake sessions

* Reduced frequency for sampling values length

* Drop the -fexternal-interpreter flag in the Windows stack build

* Produce more benchmark artifacts

* Fix stack descriptors to use heapsize-0.2 from Hackage

* Bump to heapsize-0.3.0

* Record completions snippets (haskell/ghcide#900)

* Add field for RecordSnippets to CachcedCompletion

* Initial version of local record snippets

* Supprt record snippet completion for non local declarations.

* Better integration of local completions with current implementation

* Clean up non-local completions.

* Remove commented code.

* Switch from String to Text

* Remove ununsed definition

* Treat only Records and leave other defintions as is.

* Differentiate Records from Data constructors for external declaration

* Update test to include snippet in local record completions expected list.

* Update completionTest to also compare insertText.

* Add test for record snippet completion for imported records.

* Hlint fixes

* Hlint fixes

* Hlint suggestions.

* Update type.

* Consolidate imports

* Unpack tuple with explicit names

* Idiomatic changes

* Remove unused variable

* Better variable name

* Hlint suggestions

* Handle exhaustive pattern warning

* Add _ to snippet field name suggestions

* Remove type information passed around but not used

* Update to list comprehension style

* Eliminate intermediate function

* HLint suggestions.

* Idiomatic list comprehension

Co-authored-by: Pepe Iborra <[email protected]>

* [nix] use gitignore.nix (haskell/ghcide#920)

* Ignore import list while producing completions (haskell/ghcide#919)

* Drop any items in explicit import list

* Test if imports not included in explicit list show up in completions

* Update README.md (haskell/ghcide#924)

* Custom cradle loading (haskell/ghcide#928)

When using ghcide as a library, it may be desirable to host the hie.yaml file
in a location other than the project root, or even avoid the file system altogether

* Favor `lookupPathToId` over `pathToId` (haskell/ghcide#926)

* Favor `lookupPathToId` over `pathToId`

* Fix `typecheckParentsAction`

* Fix `needsCompilationRule`

* Return completion snippets only when client supports it (haskell/ghcide#929)

* Use the real client capabilities on completions

* Return completion snippets only when supported by the client

Restored from haskell/ghcide#900

* Redundant import

* Fix stack windows build

Co-authored-by: Michalis Pardalos <[email protected]>
Co-authored-by: Michalis Pardalos <[email protected]>
Co-authored-by: Guru Devanla <[email protected]>
Co-authored-by: Samuel Ainsworth <[email protected]>
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
The Stack Windows build is problematic:
  haskell/ghcide#922

Stack is already covered by the Azure CI
Windows is already covered by the Github Actions CI
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
* Move tracing functions to own module

* Bump opentelemetry to 0.6.0

* Write Values map size to OpenTelemetry metric

* Trace all requests and notifications

Instead of doing it in `HoverDefinition`, do it in
with{Response,Notification,...}. These wrap all handlers, so this should
cover everything. It also means that the span covers the entire
processing time for the request, where before we missed the setup
happening in the with* functions.

* Add flag for OpenTelemetry profiling

Run GC regularly with --ot-profiling

* Add flag to enable OT profiling in benchmark

* Use heapsize instead of ghc-datasize

I renamed the fork to distringuish from the original.
It is still being pulled from git using stack. This will be addressed
once I can push the fork to hackage.

* Bump opentelemetry to 0.6.1 - fixes 8.6 build

* Use heapsize from hackage

* Address HLint messages

* Record size of each key independently

* Refactor `startTelemetry` function

* Remove delay between measuring memory loops

* Each key in values map gets own OT instrument

* Measure values map length more rarely

* Rename --ot-profiling to --ot-memory-profiling

* Add docs for how to use the opentelemetry output

* Add instructions to build release version of tracy

* Clarify dependencies in opentelemetry instructions

* Fix LSP traces

* otTraced: delete unused

* Extract types out of D.IDE.Core.Shake

to avoid circular module dependencies

* Extract startTelemetry out of D.IDE.Shake and upgrade to 0.2

No more segfaults

* [nix] install opentelemetry

* [nix] install tracy

* Fix merge wibble

* Measure recursive sizes with sharing

* Sort keys for cost attribution

* Remove debug traces

* Allocate less, group keys, clean up hlints

* Add -A4G to the flags used for --ot-memory-profiling

* Modularize D.IDE.Core.Tracing

I want to reuse this code more directly in the non lsp driver

* Direct driver: report closure sizes when --ot-memory-profiling

An eventlog memory analysis doesnt' seem so relevant since this mode is not
interactive, but we could easily produce both if wanted to

* Everything is reachable from GhcSessionIO, so compute it last

I suspect the ShakeExtras record is reachable from GhcSessionIO

* bound recursion and use logger

* hlint suggestions

* Fix 8.6 build

* Format imports

* Do the memory analysis with full sharing. GhcSessionIO last

* Fail fast in the memory analysis

* error handling

* runHeapsize now takes initSize as an input argument

* Trace Shake sessions

* Reduced frequency for sampling values length

* Drop the -fexternal-interpreter flag in the Windows stack build

* Produce more benchmark artifacts

* Fix stack descriptors to use heapsize-0.2 from Hackage

* Bump to heapsize-0.3.0

* Record completions snippets (haskell/ghcide#900)

* Add field for RecordSnippets to CachcedCompletion

* Initial version of local record snippets

* Supprt record snippet completion for non local declarations.

* Better integration of local completions with current implementation

* Clean up non-local completions.

* Remove commented code.

* Switch from String to Text

* Remove ununsed definition

* Treat only Records and leave other defintions as is.

* Differentiate Records from Data constructors for external declaration

* Update test to include snippet in local record completions expected list.

* Update completionTest to also compare insertText.

* Add test for record snippet completion for imported records.

* Hlint fixes

* Hlint fixes

* Hlint suggestions.

* Update type.

* Consolidate imports

* Unpack tuple with explicit names

* Idiomatic changes

* Remove unused variable

* Better variable name

* Hlint suggestions

* Handle exhaustive pattern warning

* Add _ to snippet field name suggestions

* Remove type information passed around but not used

* Update to list comprehension style

* Eliminate intermediate function

* HLint suggestions.

* Idiomatic list comprehension

Co-authored-by: Pepe Iborra <[email protected]>

* [nix] use gitignore.nix (haskell/ghcide#920)

* Ignore import list while producing completions (haskell/ghcide#919)

* Drop any items in explicit import list

* Test if imports not included in explicit list show up in completions

* Update README.md (haskell/ghcide#924)

* Custom cradle loading (haskell/ghcide#928)

When using ghcide as a library, it may be desirable to host the hie.yaml file
in a location other than the project root, or even avoid the file system altogether

* Favor `lookupPathToId` over `pathToId` (haskell/ghcide#926)

* Favor `lookupPathToId` over `pathToId`

* Fix `typecheckParentsAction`

* Fix `needsCompilationRule`

* Return completion snippets only when client supports it (haskell/ghcide#929)

* Use the real client capabilities on completions

* Return completion snippets only when supported by the client

Restored from haskell/ghcide#900

* Redundant import

* Fix stack windows build

Co-authored-by: Michalis Pardalos <[email protected]>
Co-authored-by: Michalis Pardalos <[email protected]>
Co-authored-by: Guru Devanla <[email protected]>
Co-authored-by: Samuel Ainsworth <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants