Mi and save/load support of pid_tracker #8

Closed
@PSRCode PSRCode commented Apr 1, 2015

@jgalar @compudj
These commits add support for the new pid_tracker feature.

On the MI side, this PR adds support for the track and untrack commands. It also adds pid_tracker information to the output of the list command, under the domain node.

For the save/load of sessions, a pid_tracker node is saved under the domain node.

@PSRCode PSRCode closed this Apr 24, 2015
jgalar added a commit that referenced this pull request Oct 30, 2019
The following crash was reported when short-lived applications
are traced in a live session with per-pid buffering channels.

From the original report:

```
 Thread 1 (Thread 0x7f72b67fc700 (LWP 1912155)):
 #0  0x00005650b3f6ccbd in commit_one_metadata_packet (stream=0x7f729c010bf0) at ust-consumer.c:2537
 #1  0x00005650b3f6cf58 in lttng_ustconsumer_sync_metadata (ctx=0x5650b588ce60, metadata=0x7f729c010bf0) at ust-consumer.c:2608
 #2  0x00005650b3f4dba3 in do_sync_metadata (metadata=0x7f729c010bf0, ctx=0x5650b588ce60) at consumer-stream.c:471
 #3  0x00005650b3f4dd3c in consumer_stream_sync_metadata (ctx=0x5650b588ce60, session_id=0) at consumer-stream.c:548
 #4  0x00005650b3f6de78 in lttng_ustconsumer_read_subbuffer (stream=0x7f729c0058e0, ctx=0x5650b588ce60) at ust-consumer.c:2917
 #5  0x00005650b3f45196 in lttng_consumer_read_subbuffer (stream=0x7f729c0058e0, ctx=0x5650b588ce60) at consumer.c:3524
 #6  0x00005650b3f42da7 in consumer_thread_data_poll (data=0x5650b588ce60) at consumer.c:2894
 #7  0x00007f72bdc476db in start_thread (arg=0x7f72b67fc700) at pthread_create.c:463
 #8  0x00007f72bd97088f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The segfault happen on the access to 'stream->chan->metadata_cache->lock'
chan value here is zero.
```

The problem is easily reproducible if a sleep(1) is added just after
the call to lttng_ustconsumer_request_metadata(), before the metadata
stream lock is re-acquired.

During the execution of the "request_metadata", an application can
close. This will cause the session daemon to push any remaining
metadata to the consumer daemon and to close the metadata channel.

Closing the metadata channel closes the metadata stream's wait_fd,
which is an internal pipe. The closure of the metadata pipe is
detected by the metadata_poll thread, which will ensure that all
metadata has been consumed before issuing the deletion of the metadata
stream and channel.

During the deletion, the channel's "stream" attribute and the stream's
"chan" attribute are set to NULL as both are logically deleted and
should no longer be used.

Meanwhile, the thread executing commit_one_metadata_packet()
re-acquires the metadata stream lock and trips on the now-NULL "chan"
member.

The fix consists of checking whether the metadata stream has been
logically deleted after its lock is re-acquired. It is correct for the
sync_metadata operation to then complete successfully, as the metadata
is synced: the metadata_poll thread guarantees this before deleting
the stream/channel.

Since the metadata stream's lifetime is protected by its lock, there
may be other sites that need such a check. The lock and deletion check
could be combined into a single consumer_stream_lock() helper in
follow-up fixes.
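
As a rough illustration of the helper suggested above, the following
sketch combines the lock acquisition and the deletion check. The
`struct stream` and `struct channel` types and the names
`stream_lock_checked()` and `sync_metadata_sketch()` are hypothetical
stand-ins, not the lttng-tools definitions; only the idea (treat a NULL
"chan" as "logically deleted" once the lock is held) comes from the
description above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the consumer's channel and stream. */
struct channel {
	int id;
};

struct stream {
	pthread_mutex_t lock;
	struct channel *chan;	/* Set to NULL when the stream is logically deleted. */
};

/*
 * Acquire the stream's lock and report whether the stream is still usable.
 * The lock is held on return in both cases; the caller must unlock it.
 */
static bool stream_lock_checked(struct stream *stream)
{
	assert(stream);
	pthread_mutex_lock(&stream->lock);
	return stream->chan != NULL;
}

/*
 * Sketch of a sync operation. It returns 0 both when the channel was used
 * and when the stream was deleted in the meantime (nothing left to sync).
 */
static int sync_metadata_sketch(struct stream *stream, bool *committed)
{
	*committed = false;
	if (stream_lock_checked(stream)) {
		/* stream->chan may be dereferenced while the lock is held. */
		*committed = true;
	}
	pthread_mutex_unlock(&stream->lock);
	return 0;
}
```

Returning the "still alive" state from the locking function makes it
harder for a call site to forget the check, which is presumably the
appeal of a single helper.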

Reported-by: Jonathan Rajotte <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
jgalar added a commit that referenced this pull request Nov 5, 2019
jgalar added a commit that referenced this pull request Apr 3, 2020
Observed issue
==============

A NULL pointer dereference occurs during the creation of
a session that is associated with a peer older than 2.11.

The resulting backtrace follows:

 Program terminated with signal SIGSEGV, Segmentation fault.

 #0  0x0000564af45b755b in lttng_trace_chunk_set_as_owner (chunk=0x7f8ca8004730, session_output_directory=0x7f8ca8004680) at trace-chunk.c:1033
 1033		if (chunk->path[0] != '\0') {
 [Current thread is 1 (Thread 0x7f8cb808d700 (LWP 7300))]

 #0  0x0000564af45b755b in lttng_trace_chunk_set_as_owner (chunk=0x7f8ca8004730, session_output_directory=0x7f8ca8004680) at trace-chunk.c:1033
 #1  0x0000564af45a6a78 in session_set_anonymous_chunk (session=0x7f8ca8001380) at session.c:229
 #2  session_create (session_name=<optimized out>, hostname=<optimized out>, base_path=<optimized out>, live_timer=<optimized out>, snapshot=<optimized out>,
     sessiond_uuid=<optimized out>, id_sessiond=<optimized out>, current_chunk_id=<optimized out>, creation_time=<optimized out>, major=<optimized out>,
     minor=<optimized out>, session_name_contains_creation_time=<optimized out>) at session.c:416
 #3  0x0000564af459207e in relay_create_session (conn=0x7f8ca0000f60, payload=<optimized out>, recv_hdr=<optimized out>) at main.c:1428
 #4  0x0000564af4594f12 in relay_process_control_command (payload=0x7f8cb808c940, header=0x7f8ca0001000, conn=0x7f8ca0000f60) at main.c:3218
 #5  relay_process_control_receive_payload (conn=0x7f8ca0000f60) at main.c:3361
 #6  0x0000564af45980b0 in relay_process_control (conn=0x7f8ca0000f60) at main.c:3478
 #7  relay_thread_worker (data=<optimized out>) at main.c:3927
 #8  0x00007f8cbba9a46f in start_thread () from /usr/lib/libpthread.so.0
 #9  0x00007f8cbb9ca3d3 in clone () from /usr/lib/libc.so.6

Cause
=====

lttng_trace_chunk_set_as_owner() correctly handles the case
where a trace chunk has no output path, but expects the path
to be an empty string rather than being NULL.

This is not correct as an anonymous chunk, created in backward
compatibility mode when interacting with older peers, has no
path; the path is transmitted as part of the streams' attributes
upon their creation.

Solution
========

Simply check for a NULL pointer in the same place where the empty
chunk path string is created. The rest of the code in trace-chunk.c
doesn't assume that the chunk's path is non-NULL.
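
A minimal sketch of that kind of check, assuming a reduced chunk
structure: `struct chunk_sketch` and `chunk_has_path()` are hypothetical
names used for illustration and are not the actual trace-chunk.c code.
The point is only that the NULL test must come before the first
character is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced chunk: only the field relevant to the crash. */
struct chunk_sketch {
	const char *path;	/* NULL for an anonymous chunk. */
};

/* True only when a path is present and non-empty. */
static bool chunk_has_path(const struct chunk_sketch *chunk)
{
	assert(chunk);
	/* Short-circuit evaluation: path[0] is never read when path is NULL. */
	return chunk->path != NULL && chunk->path[0] != '\0';
}
```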

Note
====

The problem was introduced during the 2.12 release cycle (clear
feature); this doesn't need to be backported.

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Iaeb41e1648d61fbbe78d70b21191fd6d720900df
jgalar added a commit that referenced this pull request Apr 22, 2020
Observed issue
--------------

While running the out-of-tree java agent tests [1], the session daemon
and agent often end up in a deadlock.

Attaching gdb to the session daemon, we can see that two threads are
blocked in an intriguing state.

Thread 13 (Thread 0x7f89027fc700 (LWP 9636)):
 #0  0x00007f891e81a4cf in __lll_lock_wait () from /usr/lib/libpthread.so.0
 #1  0x00007f891e812e03 in pthread_mutex_lock () from /usr/lib/libpthread.so.0
 #2  0x000055637f1fbd92 in session_lock_list () at session.c:156
 #3  0x000055637f25dc47 in update_agent_app (app=0x7f88ec003480) at agent-thread.c:56
 #4  0x000055637f25ec0a in thread_agent_management (data=0x556380cd2400) at agent-thread.c:426
 #5  0x000055637f22fb3a in launch_thread (data=0x556380cd24a0) at thread.c:65
 #6  0x00007f891e81046f in start_thread () from /usr/lib/libpthread.so.0
 #7  0x00007f891e7203d3 in clone () from /usr/lib/libc.so.6

Thread 8 (Thread 0x7f8919309700 (LWP 9631)):
 #0  0x00007f891e81b44d in recvmsg () from /usr/lib/libpthread.so.0
 #1  0x000055637f267847 in lttcomm_recvmsg_inet_sock (sock=0x7f88ec0033c0, buf=0x7f89192f5d5c, len=4, flags=0) at inet.c:367
 #2  0x000055637f2146c6 in recv_reply (sock=0x7f88ec0033c0, buf=0x7f89192f5d5c, size=4) at agent.c:275
 #3  0x000055637f215202 in app_context_op (app=0x7f88ec003400, ctx=0x7f8908020900, cmd=AGENT_CMD_APP_CTX_DISABLE) at agent.c:552
 #4  0x000055637f215c2d in disable_context (ctx=0x7f8908020900, domain=LTTNG_DOMAIN_JUL) at agent.c:841
 #5  0x000055637f217480 in agent_destroy (agt=0x7f890801dc20) at agent.c:1326
 #6  0x000055637f243448 in trace_ust_destroy_session (session=0x7f8908004010) at trace-ust.c:1408
 #7  0x000055637f1fd775 in session_release (ref=0x7f8908001e70) at session.c:873
 #8  0x000055637f1fb9ac in urcu_ref_put (ref=0x7f8908001e70, release=0x55637f1fd62a <session_release>) at /usr/include/urcu/ref.h:68
 #9  0x000055637f1fdad2 in session_put (session=0x7f8908000d10) at session.c:942
 #10 0x000055637f2369e6 in process_client_msg (cmd_ctx=0x7f890800e6e0, sock=0x7f8919308560, sock_error=0x7f8919308564) at client.c:2102
 #11 0x000055637f2375ab in thread_manage_clients (data=0x556380cd1840) at client.c:2347
 #12 0x000055637f22fb3a in launch_thread (data=0x556380cd18b0) at thread.c:65
 #13 0x00007f891e81046f in start_thread () from /usr/lib/libpthread.so.0
 #14 0x00007f891e7203d3 in clone () from /usr/lib/libc.so.6

T8 is holding the session list lock while the cmd_destroy_session
command is being processed. More specifically, it is attempting
to destroy an "agent_context" by communicating with an "agent"
application.

Meanwhile, T13 is still registering that same "agent" application.

Cause
-----

The deadlock itself is pretty simple to understand.

The "agent thread" (T13) has the responsibility of accepting new agent
application connections. When such a connection occurs, the thread
creates a new `agent_app` instance and sends the current sessions'
configuration (i.e. their event rules and contexts) to the agent
application. When that "update" is complete, a "registration done"
message is sent to the new agent application.

From the stacktrace above, we can see that T13 is attempting to update
the agent application with its initial configuration, but it is
blocked on the acquisition of the session list lock. The application's
agent is also blocked since it is waiting for the "registration done"
message before allowing tracing to proceed (not shown here, but seen
in the test logs).

Meanwhile, T8 is holding the session list lock while destroying a
session. This is expected as all client commands are executed with
this lock held. It is, amongst other reasons, used to serialize
changes to the sessions' configuration and configuration updates sent
to the tracers (i.e. because new apps appear or to keep existing
tracers in sync with the users' session configuration).

The question becomes: why is T8 tearing down an application that is
not yet registered?

First, inspecting `agent_app` immediately shows that this structure
has no built-in synchronization mechanism. Therefore, the fact that
two threads are accessing it at the same time raises a big red flag.

Speculating on the intentions of the original design, my intuition is
that the "agent_management" thread's role is limited to instantiating
an `agent_app` and synchronizing it with the various sessions'
configuration. Once that synchronization is performed, the agent
application should be published and never accessed again by the "agent
thread".

Configuration updates (i.e. new event rules, contexts) are then sent
synchronously as they are requested by a client in the context of the
client thread. Those updates are performed while holding the session
list lock.

Hence, there is only one thread that should manipulate the agent
application at any given time, making an explicit `agent_app` lock
unnecessary.

Overall, this would echo what is done when a 'user space tracer'
application registers to the session daemon (see dispatch.c:368).

Evidently this isn't what is happening here.

The agent thread creates the `agent_app`, publishes it, and then
performs an "agent app update" (sending the configuration) while
holding the session list lock. This means that there is a window where
an agent application is visible to the other threads, yet has not been
properly registered.

Solution
--------

The acquisition of the session list lock is moved outside of
update_agent_app() to allow the "agent thread" to hold the session
list lock during the "configuration update" phase of the agent
application registration.

Essentially, the sequence of operations changes from:

- Agent tcp connection established
- call handle_registration()
  - agent version check
  - allocation of agent_app instance
  - new agent_app is published through the global agent_apps_ht_by_sock
    hashtable
    ***
    it is now reachable by all other threads without any form of
    exclusivity synchronization.
    ***
- update_agent_app
  - acquire session list lock
  - iterate over sessions
    - send configuration
  - release session list lock
- send registration done

to:

- Agent tcp connection established
- call accept_agent_registration()
  - agent version check
- allocation of agent_app instance
- acquire session list lock
- update_agent_app
  - iterate over sessions
    - send configuration
- send registration done
- new agent_app is published through the global agent_apps_ht_by_sock
  hashtable
- release session list lock
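
The new ordering can be sketched as follows. This is a simplified
model, not the session daemon's code: `struct agent_app_sketch`, the
`published_app` pointer (standing in for the agent_apps_ht_by_sock
hashtable) and the `*_sketch()` functions are all hypothetical. It only
shows the configuration update, the "registration done" step and the
publication happening under one acquisition of the session list lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an agent application. */
struct agent_app_sketch {
	bool configured;	/* Session configuration was sent. */
	bool registration_done;	/* "Registration done" was sent. */
};

static pthread_mutex_t session_list_lock = PTHREAD_MUTEX_INITIALIZER;
/* Stands in for the global hashtable of published agent applications. */
static struct agent_app_sketch *published_app;

/* Caller must hold session_list_lock. */
static void update_agent_app_sketch(struct agent_app_sketch *app)
{
	app->configured = true;
}

static void register_agent_app_sketch(struct agent_app_sketch *app)
{
	assert(app);
	pthread_mutex_lock(&session_list_lock);
	update_agent_app_sketch(app);
	app->registration_done = true;
	/* Publish last: other threads only ever see a fully registered app. */
	published_app = app;
	pthread_mutex_unlock(&session_list_lock);
}
```

In this model, a thread that looks up `published_app` while holding the
session list lock (as client commands do, per the description above)
can no longer observe an application that is only partially registered.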

Links
-----

[1] https://github.com/lttng/lttng-ust-java-tests

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ia34c5ad81ed3936acbca756b425423e0cb8dbddf
jgalar pushed a commit that referenced this pull request May 19, 2020
Observed issue
==============
Core dump:

 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x0000003eb4025548 in __GI_abort () at abort.c:79
 #2  0x0000003eb402542f in __assert_fail_base (fmt=0x3eb4184ae0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x4cdee0 "(trace_chunk->timestamp_close).is_set",
     file=0x4cde78 "../../../lttng-tools-2.11.3/src/common/trace-chunk.c", line=903, function=0x4cf4a0 <__PRETTY_FUNCTION__.6756> "lttng_trace_chunk_move_to_completed")
     at assert.c:92
 #3  0x0000003eb4033af2 in __GI___assert_fail (assertion=assertion@entry=0x4cdee0 "(trace_chunk->timestamp_close).is_set",
     file=file@entry=0x4cde78 "../../../lttng-tools-2.11.3/src/common/trace-chunk.c", line=line@entry=903,
     function=function@entry=0x4cf4a0 <__PRETTY_FUNCTION__.6756> "lttng_trace_chunk_move_to_completed") at assert.c:101
 #4  0x000000000047f37e in lttng_trace_chunk_move_to_completed (trace_chunk=0x7fcb5c00e570) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:903
 #5  0x0000000000480755 in lttng_trace_chunk_release (ref=0x7fcb5c00e598) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1117
 #6  urcu_ref_put (release=<optimized out>, ref=0x7fcb5c00e598) at /usr/include/urcu/ref.h:68
 #7  lttng_trace_chunk_put (chunk=0x7fcb5c00e570) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1150
 #8  0x0000000000429c22 in cmd_rotate_session (session=0x7fcb5c003ff0, rotate_return=rotate_return@entry=0x7fcb6b7ed470, quiet_rotation=quiet_rotation@entry=false)
     at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/cmd.c:5037
 #9  0x00000000004451d7 in process_client_msg (cmd_ctx=0x7fcb5c00e760, sock=sock@entry=0x7fcb6b7fd4c0, sock_error=sock_error@entry=0x7fcb6b7fd4c4)
     at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/client.c:1852
 #10 0x00000000004474c6 in thread_manage_clients (data=<optimized out>) at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/client.c:2199
 #11 0x00000000004422f2 in launch_thread (data=0x4f97a0) at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/thread.c:75
 #12 0x0000003eb4408ed4 in start_thread (arg=<optimized out>) at pthread_create.c:479
 #13 0x0000003eb40f8e6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Reproduction:

 Disable ntp/any time management mechanism.

 lttng create
 lttng enable-event -u 'lttng_ust_tracef:*'
 lttng start
 lttng rotate
 date --set="$(date --date='-1 hour')"
 lttng rotate auto-20200515-142503
    Waiting for rotation to complete
    Error: Failed to query the state of the rotation.

Logs:
 DEBUG1 - 12:25:28.570037987 [2660/2717]: Setting trace chunk close command to "move to completed chunk folder" (in lttng_trace_chunk_set_close_command() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1073)
 Error: Failed to set trace chunk close timestamp: close timestamp is before creation timestamp
 Error: Failed to set the close timestamp of the current trace chunk of session "auto-20200515-142503"
 lttng-sessiond: ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:903: lttng_trace_chunk_move_to_completed: Assertion `(trace_chunk->timestamp_close).is_set' failed.

 ...

 Aborted (core dumped)
 root@X10SDV-8C-TLN4F:~# DEBUG1 - 12:25:29.534263017 [2739/2739]: Releasing trace chunk registry to all trace chunks (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1414)
 DEBUG1 - 12:25:29.534317468 [2739/2739]: Releasing reference to trace chunk: session_id = 0chunk_id = 2, name = "20200515T122528+0000-2", status = closed (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1435)
 DEBUG1 - 12:25:29.534365653 [2739/2739]: Releasing reference to trace chunk: session_id = 0chunk_id = 1, name = "20200515T142520+0000-1", status = closed (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1435)
 DEBUG1 - 12:25:29.534400638 [2739/2739]: Released reference to 2 trace chunks in lttng_trace_chunk_registry_put_each_chunk() (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1447)
 Error: 2 trace chunks are leaked by lttng-consumerd. This can be caused by an internal error of the session daemon.

Cause
=====
The trace_chunk->timestamp_close is not set since the result from time()
is smaller than the creation timestamp.

The close timestamp is smaller because the calendar system time was
modified by an administrator.

time() offers no monotonicity guarantee and hence is exposed to
modifications of the system time.

The begin and close timestamps are strictly used in the name generation
of the chunk/archives. Given the current usage of these timestamps,
validating monotonicity should not be a fatal error. Name uniqueness is
provided by the chunk name suffix (auto increment).

Solution
========
Do not enforce monotonicity for the begin and close timestamps but warn
on unexpected return (begin > close).
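
A minimal sketch of the "warn but do not fail" behaviour, under the
assumption of a reduced chunk structure: `struct chunk_times_sketch`
and `set_close_timestamp_sketch()` are hypothetical names and do not
reproduce the trace-chunk.c implementation. The close timestamp is
always recorded, so a later check that it is set cannot fail because of
a clock change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical, reduced view of a trace chunk's timestamps. */
struct chunk_times_sketch {
	time_t creation;
	time_t close;
	bool close_is_set;
};

/* Returns true when a warning was emitted; the timestamp is always set. */
static bool set_close_timestamp_sketch(struct chunk_times_sketch *chunk,
		time_t close_ts)
{
	bool warned = false;

	assert(chunk);
	if (close_ts < chunk->creation) {
		/* Wall-clock time went backwards: report it, but carry on. */
		fprintf(stderr,
			"Warning: close timestamp is before creation timestamp\n");
		warned = true;
	}
	chunk->close = close_ts;
	chunk->close_is_set = true;
	return warned;
}
```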

Known drawbacks
===============
None.

Signed-off-by: Jonathan Rajotte <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ic4b17285d150358d1569d6821c451c243e64e9a1
jgalar pushed a commit that referenced this pull request May 19, 2020
Observed issue
==============
Core dump:

 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x0000003eb4025548 in __GI_abort () at abort.c:79
 #2  0x0000003eb402542f in __assert_fail_base (fmt=0x3eb4184ae0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x4cdee0 "(trace_chunk->timestamp_close).is_set",
     file=0x4cde78 "../../../lttng-tools-2.11.3/src/common/trace-chunk.c", line=903, function=0x4cf4a0 <__PRETTY_FUNCTION__.6756> "lttng_trace_chunk_move_to_completed")
     at assert.c:92
 #3  0x0000003eb4033af2 in __GI___assert_fail (assertion=assertion@entry=0x4cdee0 "(trace_chunk->timestamp_close).is_set",
     file=file@entry=0x4cde78 "../../../lttng-tools-2.11.3/src/common/trace-chunk.c", line=line@entry=903,
     function=function@entry=0x4cf4a0 <__PRETTY_FUNCTION__.6756> "lttng_trace_chunk_move_to_completed") at assert.c:101
 #4  0x000000000047f37e in lttng_trace_chunk_move_to_completed (trace_chunk=0x7fcb5c00e570) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:903
 #5  0x0000000000480755 in lttng_trace_chunk_release (ref=0x7fcb5c00e598) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1117
 #6  urcu_ref_put (release=<optimized out>, ref=0x7fcb5c00e598) at /usr/include/urcu/ref.h:68
 #7  lttng_trace_chunk_put (chunk=0x7fcb5c00e570) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1150
 #8  0x0000000000429c22 in cmd_rotate_session (session=0x7fcb5c003ff0, rotate_return=rotate_return@entry=0x7fcb6b7ed470, quiet_rotation=quiet_rotation@entry=false)
     at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/cmd.c:5037
 #9  0x00000000004451d7 in process_client_msg (cmd_ctx=0x7fcb5c00e760, sock=sock@entry=0x7fcb6b7fd4c0, sock_error=sock_error@entry=0x7fcb6b7fd4c4)
     at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/client.c:1852
 #10 0x00000000004474c6 in thread_manage_clients (data=<optimized out>) at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/client.c:2199
 #11 0x00000000004422f2 in launch_thread (data=0x4f97a0) at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/thread.c:75
 #12 0x0000003eb4408ed4 in start_thread (arg=<optimized out>) at pthread_create.c:479
 #13 0x0000003eb40f8e6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Reproduction:

 Disable ntp/any time management mechanism.

 lttng create
 lttng enable-event -u 'lttng_ust_tracef:*'
 lttng start
 lttng rotate
 date --set="$(date --date='-1 hour')"
 lttng rotate auto-20200515-142503
    Waiting for rotation to complete
    Error: Failed to query the state of the rotation.

Logs:
 DEBUG1 - 12:25:28.570037987 [2660/2717]: Setting trace chunk close command to "move to completed chunk folder" (in lttng_trace_chunk_set_close_command() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1073)
 Error: Failed to set trace chunk close timestamp: close timestamp is before creation timestamp
 Error: Failed to set the close timestamp of the current trace chunk of session "auto-20200515-142503"
 lttng-sessiond: ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:903: lttng_trace_chunk_move_to_completed: Assertion `(trace_chunk->timestamp_close).is_set' failed.

 ...

 Aborted (core dumped)
 root@X10SDV-8C-TLN4F:~# DEBUG1 - 12:25:29.534263017 [2739/2739]: Releasing trace chunk registry to all trace chunks (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1414)
 DEBUG1 - 12:25:29.534317468 [2739/2739]: Releasing reference to trace chunk: session_id = 0chunk_id = 2, name = "20200515T122528+0000-2", status = closed (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1435)
 DEBUG1 - 12:25:29.534365653 [2739/2739]: Releasing reference to trace chunk: session_id = 0chunk_id = 1, name = "20200515T142520+0000-1", status = closed (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1435)
 DEBUG1 - 12:25:29.534400638 [2739/2739]: Released reference to 2 trace chunks in lttng_trace_chunk_registry_put_each_chunk() (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1447)
 Error: 2 trace chunks are leaked by lttng-consumerd. This can be caused by an internal error of the session daemon.

Cause
=====
The trace_chunk->timestamp_close is not set since the result from time()
is smaller than the creation timestamp.

The close timestamp is smaller because the calendar system time is
modified by an administrator.

time() offers no monotonicity guarantee and hence is exposed to time
modification of the system.

The begin and close timestamps are strictly used in the name generation
of the chunk/archives. Given the current usage of these timestamps
validating monotonicity should not be a fatal error. Name uniqueness is
provided by the chunk name suffix (auto increment).

Solution
========
Do not enforce monotonicity for the begin and close timestamps but warn
on unexpected return (begin > close).

Known drawbacks
=========
None.

Signed-off-by: Jonathan Rajotte <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ic4b17285d150358d1569d6821c451c243e64e9a1
jgalar pushed a commit that referenced this pull request May 19, 2020
Observed issue
==============
Core dump:

 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x0000003eb4025548 in __GI_abort () at abort.c:79
 #2  0x0000003eb402542f in __assert_fail_base (fmt=0x3eb4184ae0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x4cdee0 "(trace_chunk->timestamp_close).is_set",
     file=0x4cde78 "../../../lttng-tools-2.11.3/src/common/trace-chunk.c", line=903, function=0x4cf4a0 <__PRETTY_FUNCTION__.6756> "lttng_trace_chunk_move_to_completed")
     at assert.c:92
 #3  0x0000003eb4033af2 in __GI___assert_fail (assertion=assertion@entry=0x4cdee0 "(trace_chunk->timestamp_close).is_set",
     file=file@entry=0x4cde78 "../../../lttng-tools-2.11.3/src/common/trace-chunk.c", line=line@entry=903,
     function=function@entry=0x4cf4a0 <__PRETTY_FUNCTION__.6756> "lttng_trace_chunk_move_to_completed") at assert.c:101
 #4  0x000000000047f37e in lttng_trace_chunk_move_to_completed (trace_chunk=0x7fcb5c00e570) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:903
 #5  0x0000000000480755 in lttng_trace_chunk_release (ref=0x7fcb5c00e598) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1117
 #6  urcu_ref_put (release=<optimized out>, ref=0x7fcb5c00e598) at /usr/include/urcu/ref.h:68
 #7  lttng_trace_chunk_put (chunk=0x7fcb5c00e570) at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1150
 #8  0x0000000000429c22 in cmd_rotate_session (session=0x7fcb5c003ff0, rotate_return=rotate_return@entry=0x7fcb6b7ed470, quiet_rotation=quiet_rotation@entry=false)
     at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/cmd.c:5037
 #9  0x00000000004451d7 in process_client_msg (cmd_ctx=0x7fcb5c00e760, sock=sock@entry=0x7fcb6b7fd4c0, sock_error=sock_error@entry=0x7fcb6b7fd4c4)
     at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/client.c:1852
 #10 0x00000000004474c6 in thread_manage_clients (data=<optimized out>) at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/client.c:2199
 #11 0x00000000004422f2 in launch_thread (data=0x4f97a0) at ../../../../lttng-tools-2.11.3/src/bin/lttng-sessiond/thread.c:75
 #12 0x0000003eb4408ed4 in start_thread (arg=<optimized out>) at pthread_create.c:479
 #13 0x0000003eb40f8e6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Reproduction:

 Disable ntp/any time management mechanism.

 lttng create
 lttng enable-event -u 'lttng_ust_tracef:*'
 lttng start
 lttng rotate
 date --set="$(date --date='-1 hour')"
 lttng rotate auto-20200515-142503
    Waiting for rotation to complete
    Error: Failed to query the state of the rotation.

Logs:
 DEBUG1 - 12:25:28.570037987 [2660/2717]: Setting trace chunk close command to "move to completed chunk folder" (in lttng_trace_chunk_set_close_command() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1073)
 Error: Failed to set trace chunk close timestamp: close timestamp is before creation timestamp
 Error: Failed to set the close timestamp of the current trace chunk of session "auto-20200515-142503"
 lttng-sessiond: ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:903: lttng_trace_chunk_move_to_completed: Assertion `(trace_chunk->timestamp_close).is_set' failed.

 ...

 Aborted (core dumped)
 root@X10SDV-8C-TLN4F:~# DEBUG1 - 12:25:29.534263017 [2739/2739]: Releasing trace chunk registry to all trace chunks (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1414)
 DEBUG1 - 12:25:29.534317468 [2739/2739]: Releasing reference to trace chunk: session_id = 0chunk_id = 2, name = "20200515T122528+0000-2", status = closed (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1435)
 DEBUG1 - 12:25:29.534365653 [2739/2739]: Releasing reference to trace chunk: session_id = 0chunk_id = 1, name = "20200515T142520+0000-1", status = closed (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1435)
 DEBUG1 - 12:25:29.534400638 [2739/2739]: Released reference to 2 trace chunks in lttng_trace_chunk_registry_put_each_chunk() (in lttng_trace_chunk_registry_put_each_chunk() at ../../../lttng-tools-2.11.3/src/common/trace-chunk.c:1447)
 Error: 2 trace chunks are leaked by lttng-consumerd. This can be caused by an internal error of the session daemon.

Cause
=====
trace_chunk->timestamp_close is not set because the value returned by
time() is smaller than the creation timestamp.

The close timestamp is smaller because the system's calendar time was
changed by an administrator.

time() offers no monotonicity guarantee and is therefore exposed to
modifications of the system time.

The begin and close timestamps are used only to generate the names of
the chunks/archives. Given this usage, a monotonicity violation should
not be a fatal error. Name uniqueness is provided by the chunk name
suffix (auto-incremented).

Solution
========
Do not enforce monotonicity for the begin and close timestamps but warn
on unexpected return (begin > close).
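
As a rough illustration of that behaviour, here is a minimal sketch. It is
not the code from trace-chunk.c: `struct chunk_sketch` and
`chunk_sketch_set_close_timestamp()` are hypothetical names, and the real
structure and logging macros differ. The point is only that the close
timestamp is always recorded and that begin > close produces a warning
rather than an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical, simplified stand-in for the trace chunk structure. */
struct chunk_sketch {
	time_t timestamp_creation;
	time_t timestamp_close;
	bool timestamp_close_is_set;
};

/*
 * Record the close timestamp unconditionally. A close time earlier than
 * the creation time is only reported; the function returns true when
 * that warning was emitted.
 */
static bool chunk_sketch_set_close_timestamp(struct chunk_sketch *chunk,
		time_t close_ts)
{
	bool warned = false;

	assert(chunk);
	if (close_ts < chunk->timestamp_creation) {
		fprintf(stderr,
			"Warning: close timestamp is before creation timestamp\n");
		warned = true;
	}

	/* The timestamps only feed name generation, so always store the value. */
	chunk->timestamp_close = close_ts;
	chunk->timestamp_close_is_set = true;
	return warned;
}
```

With this shape, a later check that the close timestamp is set (such as the
assertion that fired in the report above) can no longer fail just because
the clock went backwards.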

Known drawbacks
===============
None.

Signed-off-by: Jonathan Rajotte <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ic4b17285d150358d1569d6821c451c243e64e9a1
frdeso pushed a commit to frdeso/lttng-tools that referenced this pull request May 26, 2020
Observed issue
==============

Deadlock between the notification thread and the action executor thread.

Thread 5 holds cmd_queue.lock and requests the client lock.
Thread 6 holds the client lock and requests the cmd_queue lock.

Thread 5 gains little from holding the queue lock since what it does amounts to a "pop" of the cmd_queue.

Thread 9 is waiting on the cmd_queue lock but does not hold any other
locks. It is thus not part of the deadlock, but it is a casualty of it
and leaves a client "hanging".

Other threads are all in their respective waiting state.

Thread 9 (Thread 0x7f76f2ffd700 (LWP 240467)):
 #0  __lll_lock_wait (futex=futex@entry=0x1ad1308, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x1ad1308) at ../nptl/pthread_mutex_lock.c:80
 #2  0x00000000004611dd in run_command_wait (handle=0x1ad12f0, cmd=0x7f76f2fe31e0) at notification-thread-commands.c:31
 #3  0x000000000046143a in notification_thread_command_unregister_trigger (handle=0x1ad12f0, trigger=0x7f76e4000ef0) at notification-thread-commands.c:148
 #4  0x00000000004444af in cmd_unregister_trigger (cmd_ctx=0x7f76e4000d40, sock=68, notification_thread=0x1ad12f0) at cmd.c:4618
 #5  0x0000000000483d23 in process_client_msg (cmd_ctx=0x7f76e4000d40, sock=0x7f76f2ffcba4, sock_error=0x7f76f2ffcb90) at client.c:2001
 #6  0x000000000047f00b in thread_manage_clients (data=0x1ad1a80) at client.c:2402
 #7  0x000000000047b303 in launch_thread (data=0x1ad1af0) at thread.c:66
 #8  0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #9  0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7f7700fcf700 (LWP 240464)):
 #0  __lll_lock_wait (futex=futex@entry=0x1ad1308, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x1ad1308) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000461bf2 in run_command_no_wait (handle=0x1ad12f0, in_cmd=0x7f7700fce340) at notification-thread-commands.c:87
 #3  0x0000000000461b93 in notification_thread_client_communication_update (handle=0x1ad12f0, id=1, transmission_status=CLIENT_TRANSMISSION_STATUS_QUEUED) at notification-thread-commands.c:400
 #4  0x0000000000497658 in client_handle_transmission_status (client=0x7f76f8004e30, status=CLIENT_TRANSMISSION_STATUS_QUEUED, user_data=0x7f76f8004a00) at action-executor.c:154
 #5  0x0000000000467be7 in notification_client_list_send_evaluation (client_list=0x7f76f8004fe0, condition=0x7f76e40041a0, evaluation=0x7f76cc000cc0, trigger_creds=0x7f76e4004288, source_object_creds=0x0, client_report=0x4971a0 <client_handle_transmission_status>, user_data=0x7f76f8004a00) at notification-thread-events.c:4007
 #6  0x00000000004956bb in action_executor_notify_handler (executor=0x7f76f8004a00, work_item=0x7f76f80062d0, action=0x7f76e4004210) at action-executor.c:199
 #7  0x00000000004953fd in action_executor_generic_handler (executor=0x7f76f8004a00, work_item=0x7f76f80062d0, action=0x7f76e4004210) at action-executor.c:493
 #8  0x0000000000495101 in action_work_item_execute (executor=0x7f76f8004a00, work_item=0x7f76f80062d0) at action-executor.c:506
 #9  0x0000000000493ff5 in action_executor_thread (_data=0x7f76f8004a00) at action-executor.c:559
 #10 0x000000000047b303 in launch_thread (data=0x7f76f8004aa0) at thread.c:66
 #11 0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #12 0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7f77017d0700 (LWP 240463)):
 #0  __lll_lock_wait (futex=futex@entry=0x7f76f8004e30, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x7f76f8004e30) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000463080 in handle_notification_thread_command (handle=0x1ad12f0, state=0x7f77017cfb00) at notification-thread-events.c:2936
 #3  0x000000000045e881 in thread_notification (data=0x1ad12f0) at notification-thread.c:705
 #4  0x000000000047b303 in launch_thread (data=0x1ad1420) at thread.c:66
 #5  0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #6  0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Cause
=====

The action executor holds the client lock across the communication to
prevent simultaneous updates to the client state.

The notification thread holds the cmd_queue lock across the whole
command handling for no apparent reason (TODO: make sure nothing is
added to the queue internally; if so, we should reacquire the lock only
when necessary).

Solution
========

Reduce the window during which the cmd_queue lock is held by the
notification thread to only the "pop" action on the queue. As soon as we
have the lock, get the cmd, remove it from the list and release the
lock. This prevents an inverted lock acquisition order with respect to
the pattern of the action executor thread.
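
The narrowed locking window can be sketched as follows. This is only an
illustration under assumptions: `struct cmd_sketch`,
`struct cmd_queue_sketch`, and both functions are hypothetical and use a
plain singly linked list, not the notification thread's actual types. What
matters is the ordering: the queue lock is released before the client lock
is taken, so this thread never holds both.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical command node and queue; not the sessiond types. */
struct cmd_sketch {
	int id;
	struct cmd_sketch *next;
};

struct cmd_queue_sketch {
	pthread_mutex_t lock;
	struct cmd_sketch *head;
};

/* Hold the queue lock only long enough to detach the first command. */
static struct cmd_sketch *cmd_queue_sketch_pop(struct cmd_queue_sketch *queue)
{
	struct cmd_sketch *cmd;

	assert(queue);
	pthread_mutex_lock(&queue->lock);
	cmd = queue->head;
	if (cmd) {
		queue->head = cmd->next;
		cmd->next = NULL;
	}
	pthread_mutex_unlock(&queue->lock);
	return cmd;
}

/*
 * Handle one command. The client lock is acquired after the queue lock
 * has been released, so the two locks are never held together here.
 * Returns the command id, or -1 if the queue was empty.
 */
static int handle_one_command_sketch(struct cmd_queue_sketch *queue,
		pthread_mutex_t *client_lock)
{
	struct cmd_sketch *cmd = cmd_queue_sketch_pop(queue);
	int id;

	if (!cmd) {
		return -1;
	}

	pthread_mutex_lock(client_lock);
	id = cmd->id;	/* Stand-in for the real command processing. */
	pthread_mutex_unlock(client_lock);
	return id;
}
```

Another thread that holds the client lock and wants to enqueue a command
can then always obtain the queue lock, since the popping side gives it up
before doing any work.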

Signed-off-by: Jonathan Rajotte <[email protected]>
Change-Id: I91d30c134bc1a128c96058f0e0cdd325808c91bc
Depends-on: lttng-ust: I8423c510bf6af2f9bf85256e8d6f931d36f7054b
frdeso pushed a commit to frdeso/lttng-tools that referenced this pull request May 27, 2020
jgalar added a commit that referenced this pull request May 29, 2020
Observed issue
--------------

While running the out-of-tree java agent tests [1], the session daemon
and agent often end up in a deadlock.

Attaching gdb to the session daemon, we can see that two threads are
blocked in an intriguing state.

Thread 13 (Thread 0x7f89027fc700 (LWP 9636)):
 #0  0x00007f891e81a4cf in __lll_lock_wait () from /usr/lib/libpthread.so.0
 #1  0x00007f891e812e03 in pthread_mutex_lock () from /usr/lib/libpthread.so.0
 #2  0x000055637f1fbd92 in session_lock_list () at session.c:156
 #3  0x000055637f25dc47 in update_agent_app (app=0x7f88ec003480) at agent-thread.c:56
 #4  0x000055637f25ec0a in thread_agent_management (data=0x556380cd2400) at agent-thread.c:426
 #5  0x000055637f22fb3a in launch_thread (data=0x556380cd24a0) at thread.c:65
 #6  0x00007f891e81046f in start_thread () from /usr/lib/libpthread.so.0
 #7  0x00007f891e7203d3 in clone () from /usr/lib/libc.so.6

Thread 8 (Thread 0x7f8919309700 (LWP 9631)):
 #0  0x00007f891e81b44d in recvmsg () from /usr/lib/libpthread.so.0
 #1  0x000055637f267847 in lttcomm_recvmsg_inet_sock (sock=0x7f88ec0033c0, buf=0x7f89192f5d5c, len=4, flags=0) at inet.c:367
 #2  0x000055637f2146c6 in recv_reply (sock=0x7f88ec0033c0, buf=0x7f89192f5d5c, size=4) at agent.c:275
 #3  0x000055637f215202 in app_context_op (app=0x7f88ec003400, ctx=0x7f8908020900, cmd=AGENT_CMD_APP_CTX_DISABLE) at agent.c:552
 #4  0x000055637f215c2d in disable_context (ctx=0x7f8908020900, domain=LTTNG_DOMAIN_JUL) at agent.c:841
 #5  0x000055637f217480 in agent_destroy (agt=0x7f890801dc20) at agent.c:1326
 #6  0x000055637f243448 in trace_ust_destroy_session (session=0x7f8908004010) at trace-ust.c:1408
 #7  0x000055637f1fd775 in session_release (ref=0x7f8908001e70) at session.c:873
 #8  0x000055637f1fb9ac in urcu_ref_put (ref=0x7f8908001e70, release=0x55637f1fd62a <session_release>) at /usr/include/urcu/ref.h:68
 #9  0x000055637f1fdad2 in session_put (session=0x7f8908000d10) at session.c:942
 #10 0x000055637f2369e6 in process_client_msg (cmd_ctx=0x7f890800e6e0, sock=0x7f8919308560, sock_error=0x7f8919308564) at client.c:2102
 #11 0x000055637f2375ab in thread_manage_clients (data=0x556380cd1840) at client.c:2347
 #12 0x000055637f22fb3a in launch_thread (data=0x556380cd18b0) at thread.c:65
 #13 0x00007f891e81046f in start_thread () from /usr/lib/libpthread.so.0
 #14 0x00007f891e7203d3 in clone () from /usr/lib/libc.so.6

T8 is holding the session list lock while the cmd_destroy_session
command is being processed. More specifically, it is attempting
to destroy an "agent_context" by communicating with an "agent"
application.

Meanwhile, T13 is still registering that same "agent" application.

Cause
-----

The deadlock itself is pretty simple to understand.

The "agent thread" (T13) has the responsibility of accepting new agent
application connections. When such a connection occurs, the thread
creates a new `agent_app` instance and sends the current sessions'
configuration (i.e. their event rules and contexts) to the agent
application. When that "update" is complete, a "registration done"
message is sent to the new agent application.

From the stacktrace above, we can see that T13 is attempting to update
the agent application with its initial configuration, but it is
blocked on the acquisition of the session list lock. The application's
agent is also blocked since it is waiting for the "registration done"
message before allowing tracing to proceed (not shown here, but seen
in the test logs).

Meanwhile, T8 is holding the session list lock while destroying a
session. This is expected as all client commands are executed with
this lock held. It is, amongst other reasons, used to serialize
changes to the sessions' configuration and configuration updates sent
to the tracers (i.e. because new apps appear or to keep existing
tracers in sync with the users' session configuration).

The question becomes: why is T8 tearing down an application that is
not yet registered?

First, inspecting `agent_app` immediately shows that this structure
has no built-in synchronization mechanism. Therefore, the fact that
two threads are accessing it at the same time raises a big red flag.

Speculating on the intentions of the original design, my intuition is
that the "agent_management" thread's role is limited to instantiating
an `agent_app` and synchronizing it with the various sessions'
configuration. Once that synchronization is performed, the agent
application should be published and never accessed again by the "agent
thread".

Configuration updates (i.e. new event rules, contexts) are then sent
synchronously as they are requested by a client in the context of the
client thread. Those updates are performed while holding the session
list lock.

Hence, only one thread should manipulate the agent application at any
given time, making an explicit `agent_app` lock unnecessary.

Overall, this would echo what is done when a 'user space tracer'
application registers to the session daemon (see dispatch.c:368).

Evidently this isn't what is happening here.

The agent thread creates the `agent_app`, publishes it, and then
performs an "agent app update" (sending the configuration) while
holding the session list lock. This means that there is a window where
an agent application is visible to the other threads, yet has not been
properly registered.

Solution
--------

The acquisition of the session list lock is moved outside of
update_agent_app() to allow the "agent thread" to hold the session
list lock during the "configuration update" phase of the agent
application registration.

Essentially, the sequence of operations changes from:

- Agent TCP connection established
- call handle_registration()
  - agent version check
  - allocation of agent_app instance
  - new agent_app is published through the global agent_apps_ht_by_sock
    hashtable
    ***
    it is now reachable by all other threads without any form of
    exclusivity synchronization.
    ***
- update_agent_app
  - acquire session list lock
  - iterate over sessions
    - send configuration
  - release session list lock
- send registration done

to:

- Agent TCP connection established
- call accept_agent_registration()
  - agent version check
- allocation of agent_app instance
- acquire session list lock
- update_agent_app
  - iterate over sessions
    - send configuration
- send registration done
- new agent_app is published through the global agent_apps_ht_by_sock
  hashtable
- release session list lock
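
The reordered sequence can be sketched roughly as below. All names ending
in `_sketch` are hypothetical stand-ins (a single pointer plays the role of
the agent_apps_ht_by_sock hashtable, and a boolean replaces the actual
configuration exchange); this is not the session daemon's code. It only
shows the ordering: lock, update, "registration done", publish, unlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the session list lock. */
static pthread_mutex_t session_list_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

struct agent_app_sketch {
	int sock;
	bool configured;
	bool registration_done_sent;
};

/* One-slot "table" standing in for agent_apps_ht_by_sock. */
static struct agent_app_sketch *published_app_sketch;

/* Caller must hold the session list lock. */
static void update_agent_app_sketch(struct agent_app_sketch *app)
{
	/* Stand-in for sending every session's configuration to the app. */
	app->configured = true;
}

static void register_agent_app_sketch(struct agent_app_sketch *app)
{
	assert(app);

	pthread_mutex_lock(&session_list_lock_sketch);
	update_agent_app_sketch(app);
	app->registration_done_sent = true;
	/* Publish last: other threads only ever see a fully registered app. */
	published_app_sketch = app;
	pthread_mutex_unlock(&session_list_lock_sketch);
}
```

Since client commands run with the session list lock held, a session
teardown either completes before the app is published or starts after
registration has finished; it can no longer observe a half-registered app.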

Links
-----

[1] https://github.com/lttng/lttng-ust-java-tests

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ia34c5ad81ed3936acbca756b425423e0cb8dbddf
jgalar added a commit that referenced this pull request May 29, 2020
frdeso pushed a commit to frdeso/lttng-tools that referenced this pull request Jun 3, 2020
jgalar pushed a commit that referenced this pull request Aug 11, 2020
Observed issue
==============

Deadlock between the notification thread and the action executor thread.

Thread 5 holds cmd_queue.lock and requests the client lock.
Thread 6 holds the client lock and requests the cmd_queue lock.

Thread 5 gains little from holding the queue lock, considering that it effectively only does a "pop" of the cmd_queue.

Thread 9 is waiting on the cmd_queue lock but does not hold any other
locks and is thus not part of the deadlock. It is a casualty of this
deadlock and leaves a client "hanging".

Other threads are all in their respective waiting state.

Thread 9 (Thread 0x7f76f2ffd700 (LWP 240467)):
 #0  __lll_lock_wait (futex=futex@entry=0x1ad1308, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x1ad1308) at ../nptl/pthread_mutex_lock.c:80
 #2  0x00000000004611dd in run_command_wait (handle=0x1ad12f0, cmd=0x7f76f2fe31e0) at notification-thread-commands.c:31
 #3  0x000000000046143a in notification_thread_command_unregister_trigger (handle=0x1ad12f0, trigger=0x7f76e4000ef0) at notification-thread-commands.c:148
 #4  0x00000000004444af in cmd_unregister_trigger (cmd_ctx=0x7f76e4000d40, sock=68, notification_thread=0x1ad12f0) at cmd.c:4618
 #5  0x0000000000483d23 in process_client_msg (cmd_ctx=0x7f76e4000d40, sock=0x7f76f2ffcba4, sock_error=0x7f76f2ffcb90) at client.c:2001
 #6  0x000000000047f00b in thread_manage_clients (data=0x1ad1a80) at client.c:2402
 #7  0x000000000047b303 in launch_thread (data=0x1ad1af0) at thread.c:66
 #8  0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #9  0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7f7700fcf700 (LWP 240464)):
 #0  __lll_lock_wait (futex=futex@entry=0x1ad1308, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x1ad1308) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000461bf2 in run_command_no_wait (handle=0x1ad12f0, in_cmd=0x7f7700fce340) at notification-thread-commands.c:87
 #3  0x0000000000461b93 in notification_thread_client_communication_update (handle=0x1ad12f0, id=1, transmission_status=CLIENT_TRANSMISSION_STATUS_QUEUED) at notification-thread-commands.c:400
 #4  0x0000000000497658 in client_handle_transmission_status (client=0x7f76f8004e30, status=CLIENT_TRANSMISSION_STATUS_QUEUED, user_data=0x7f76f8004a00) at action-executor.c:154
 #5  0x0000000000467be7 in notification_client_list_send_evaluation (client_list=0x7f76f8004fe0, condition=0x7f76e40041a0, evaluation=0x7f76cc000cc0, trigger_creds=0x7f76e4004288, source_object_creds=0x0, client_report=0x4971a0 <client_handle_transmission_status>, user_data=0x7f76f8004a00) at notification-thread-events.c:4007
 #6  0x00000000004956bb in action_executor_notify_handler (executor=0x7f76f8004a00, work_item=0x7f76f80062d0, action=0x7f76e4004210) at action-executor.c:199
 #7  0x00000000004953fd in action_executor_generic_handler (executor=0x7f76f8004a00, work_item=0x7f76f80062d0, action=0x7f76e4004210) at action-executor.c:493
 #8  0x0000000000495101 in action_work_item_execute (executor=0x7f76f8004a00, work_item=0x7f76f80062d0) at action-executor.c:506
 #9  0x0000000000493ff5 in action_executor_thread (_data=0x7f76f8004a00) at action-executor.c:559
 #10 0x000000000047b303 in launch_thread (data=0x7f76f8004aa0) at thread.c:66
 #11 0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #12 0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7f77017d0700 (LWP 240463)):
 #0  __lll_lock_wait (futex=futex@entry=0x7f76f8004e30, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x7f76f8004e30) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000463080 in handle_notification_thread_command (handle=0x1ad12f0, state=0x7f77017cfb00) at notification-thread-events.c:2936
 #3  0x000000000045e881 in thread_notification (data=0x1ad12f0) at notification-thread.c:705
 #4  0x000000000047b303 in launch_thread (data=0x1ad1420) at thread.c:66
 #5  0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #6  0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Cause
=====

The action executor holds the client lock across the communication to
prevent simultaneous updates to the client state.

The notification thread holds the cmd_queue lock across the whole
operation for no apparent reason (TODO make sure there is no internal add
to the queue. If so, we should reacquire the lock only when necessary.)

Solution
========

Reduce the window during which the cmd_queue lock is held by the
notification thread to only the "pop" action on the queue. As soon as we
have the lock, get the cmd, remove it from the list and release the
lock. This prevents inverted lock acquisition based on the pattern of the
action executor thread.
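
The locking pattern described in this solution can be sketched as follows.
This is a minimal illustration, not the sessiond code: `struct cmd`,
`struct cmd_queue`, `cmd_queue_push()`, `cmd_queue_pop()` and
`handle_one_command()` are hypothetical names, and the queue is modelled as
a plain singly linked list. The point it shows is that the queue lock is
released before the client lock is taken.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical command node; the real sessiond type differs. */
struct cmd {
	int id;
	struct cmd *next;
};

/* Hypothetical queue: a singly linked list protected by a mutex. */
struct cmd_queue {
	pthread_mutex_t lock;
	struct cmd *head;
};

/* Append a command; the lock is held only while touching the list. */
static void cmd_queue_push(struct cmd_queue *queue, struct cmd *cmd)
{
	struct cmd **tail;

	cmd->next = NULL;
	pthread_mutex_lock(&queue->lock);
	for (tail = &queue->head; *tail; tail = &(*tail)->next) {
	}
	*tail = cmd;
	pthread_mutex_unlock(&queue->lock);
}

/* Detach the first command and release the lock before returning. */
static struct cmd *cmd_queue_pop(struct cmd_queue *queue)
{
	struct cmd *cmd;

	pthread_mutex_lock(&queue->lock);
	cmd = queue->head;
	if (cmd) {
		queue->head = cmd->next;
		cmd->next = NULL;
	}
	pthread_mutex_unlock(&queue->lock);
	return cmd;
}

/*
 * Handle one command. The queue lock is not held here, so taking the
 * client lock cannot invert against a thread that holds the client lock
 * and then enqueues a command.
 */
static int handle_one_command(struct cmd_queue *queue,
		pthread_mutex_t *client_lock)
{
	struct cmd *cmd = cmd_queue_pop(queue);
	int id;

	if (!cmd) {
		return -1;
	}
	pthread_mutex_lock(client_lock);
	id = cmd->id; /* Stand-in for the real command processing. */
	pthread_mutex_unlock(client_lock);
	return id;
}
```

With this shape, no thread ever holds both locks at once on the consumer
side, which is what removes the inverted acquisition order.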

Signed-off-by: Jonathan Rajotte <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I91d30c134bc1a128c96058f0e0cdd325808c91bc
frdeso pushed a commit to frdeso/lttng-tools that referenced this pull request Aug 17, 2020
Observed issue
==============

Deadlock between the notification thread and the action executor thread.

Thread 5 holds cmd_queue.lock and requests the client lock.
Thread 6 holds the client lock and requests the cmd_queue lock.

Thread 5 gains little from holding the queue lock, considering that it effectively only does a "pop" of the cmd_queue.

Thread 9 is waiting on the cmd_queue lock but does not hold any other
locks and is thus not part of the deadlock. It is a casualty of this
deadlock and leaves a client "hanging".

Other threads are all in their respective waiting state.

Thread 9 (Thread 0x7f76f2ffd700 (LWP 240467)):
 #0  __lll_lock_wait (futex=futex@entry=0x1ad1308, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x1ad1308) at ../nptl/pthread_mutex_lock.c:80
 #2  0x00000000004611dd in run_command_wait (handle=0x1ad12f0, cmd=0x7f76f2fe31e0) at notification-thread-commands.c:31
 #3  0x000000000046143a in notification_thread_command_unregister_trigger (handle=0x1ad12f0, trigger=0x7f76e4000ef0) at notification-thread-commands.c:148
 #4  0x00000000004444af in cmd_unregister_trigger (cmd_ctx=0x7f76e4000d40, sock=68, notification_thread=0x1ad12f0) at cmd.c:4618
 #5  0x0000000000483d23 in process_client_msg (cmd_ctx=0x7f76e4000d40, sock=0x7f76f2ffcba4, sock_error=0x7f76f2ffcb90) at client.c:2001
 #6  0x000000000047f00b in thread_manage_clients (data=0x1ad1a80) at client.c:2402
 #7  0x000000000047b303 in launch_thread (data=0x1ad1af0) at thread.c:66
 #8  0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #9  0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7f7700fcf700 (LWP 240464)):
 #0  __lll_lock_wait (futex=futex@entry=0x1ad1308, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x1ad1308) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000461bf2 in run_command_no_wait (handle=0x1ad12f0, in_cmd=0x7f7700fce340) at notification-thread-commands.c:87
 #3  0x0000000000461b93 in notification_thread_client_communication_update (handle=0x1ad12f0, id=1, transmission_status=CLIENT_TRANSMISSION_STATUS_QUEUED) at notification-thread-commands.c:400
 #4  0x0000000000497658 in client_handle_transmission_status (client=0x7f76f8004e30, status=CLIENT_TRANSMISSION_STATUS_QUEUED, user_data=0x7f76f8004a00) at action-executor.c:154
 #5  0x0000000000467be7 in notification_client_list_send_evaluation (client_list=0x7f76f8004fe0, condition=0x7f76e40041a0, evaluation=0x7f76cc000cc0, trigger_creds=0x7f76e4004288, source_object_creds=0x0, client_report=0x4971a0 <client_handle_transmission_status>, user_data=0x7f76f8004a00) at notification-thread-events.c:4007
 #6  0x00000000004956bb in action_executor_notify_handler (executor=0x7f76f8004a00, work_item=0x7f76f80062d0, action=0x7f76e4004210) at action-executor.c:199
 #7  0x00000000004953fd in action_executor_generic_handler (executor=0x7f76f8004a00, work_item=0x7f76f80062d0, action=0x7f76e4004210) at action-executor.c:493
 #8  0x0000000000495101 in action_work_item_execute (executor=0x7f76f8004a00, work_item=0x7f76f80062d0) at action-executor.c:506
 #9  0x0000000000493ff5 in action_executor_thread (_data=0x7f76f8004a00) at action-executor.c:559
 #10 0x000000000047b303 in launch_thread (data=0x7f76f8004aa0) at thread.c:66
 #11 0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #12 0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7f77017d0700 (LWP 240463)):
 #0  __lll_lock_wait (futex=futex@entry=0x7f76f8004e30, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x7f76f8004e30) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000463080 in handle_notification_thread_command (handle=0x1ad12f0, state=0x7f77017cfb00) at notification-thread-events.c:2936
 #3  0x000000000045e881 in thread_notification (data=0x1ad12f0) at notification-thread.c:705
 #4  0x000000000047b303 in launch_thread (data=0x1ad1420) at thread.c:66
 #5  0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #6  0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Cause
=====

The action executor holds the client lock across the communication to
prevent simultaneous updates to the client state.

The notification thread holds the cmd_queue lock across the whole
operation for no apparent reason (TODO make sure there is no internal add
to the queue. If so, we should reacquire the lock only when necessary.)

Solution
========

Reduce the window during which the cmd_queue lock is held by the
notification thread to only the "pop" action on the queue. As soon as we
have the lock, get the cmd, remove it from the list and release the
lock. This prevents inverted lock acquisition based on the pattern of the
action executor thread.

Signed-off-by: Jonathan Rajotte <[email protected]>
Change-Id: I91d30c134bc1a128c96058f0e0cdd325808c91bc
jgalar added a commit that referenced this pull request Oct 16, 2020
Observed issue
==============

The clear tests occasionally fail with the following babeltrace error
when a live session is stopped following a "clear". Unfortunately, this
problem only seems to occur on certain machines. In my case, I only
managed to reproduce this on the CI's workers.

  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER [email protected]:1610 [lttng-live] Received get_data_packet response: error
  10-07 12:39:48.333  7679  7679 E PLUGIN/CTF/MSG-ITER [email protected]:563 [lttng-live] User function failed: status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/CTF/MSG-ITER [email protected]:2899 [lttng-live] Cannot handle state: msg-it-addr=0x5603c28e2830, state=DSCOPE_TRACE_PACKET_HEADER_BEGIN
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE lttng_live_iterator_next_handle_one_active_data_stream@lttng-live.c:845 [lttng-live] CTF message iterator failed to get next message: msg-iter=0x5603c28e2830, msg-iter-status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE [email protected]:1665 [lttng-live] Error preparing the next batch of messages: live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
  10-07 12:39:48.333  7679  7679 W LIB/MSG-ITER [email protected]:864 Component input port message iterator's "next" method failed: iter-addr=0x5603c28cb0f0, iter-upstream-comp-name="lttng-live", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="lttng-live", iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER [email protected]:454 [muxer] Upstream iterator's next method returned an error: status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER [email protected]:991 [muxer] Cannot validate muxer's upstream message iterator wrapper: muxer-msg-iter-addr=0x5603c28dbe70, muxer-upstream-msg-iter-wrap-addr=0x5603c28cd0f0
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER [email protected]:1415 [muxer] Cannot get next message: comp-addr=0x5603c28dc960, muxer-comp-addr=0x5603c28db0a0, muxer-msg-iter-addr=0x5603c28dbe70, msg-iter-addr=0x5603c28caf80, status=ERROR
  10-07 12:39:48.333  7679  7679 W LIB/MSG-ITER [email protected]:864 Component input port message iterator's "next" method failed: iter-addr=0x5603c28caf80, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  10-07 12:39:48.333  7679  7679 W LIB/GRAPH [email protected]:473 Component's "consume" method failed: status=ERROR, comp-addr=0x5603c28dcb60, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x5603c28c8140, comp-class-so-handle-path="/home/jenkins/jgalar-debug/build/usr/lib/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
  10-07 12:39:48.333  7679  7679 E CLI [email protected]:2548 Graph failed to complete successfully
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER [email protected]:1227 [lttng-live] Unknown detach return code 0

  ERROR:    [Babeltrace CLI] (babeltrace2.c:2548)
    Graph failed to complete successfully
  CAUSED BY [libbabeltrace2] (graph.c:473)
    Component's "consume" method failed: status=ERROR, comp-addr=0x5603c28dcb60,
    comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK,
    comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages
    (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x5603c28c8140,
    comp-class-so-handle-path="/home/jenkins/jgalar-debug/build/usr/lib/babeltrace2/plugins/babeltrace-plugin-text.so",
    comp-input-port-count=1, comp-output-port-count=0
  CAUSED BY [libbabeltrace2] (iterator.c:864)
    Component input port message iterator's "next" method failed:
    iter-addr=0x5603c28caf80, iter-upstream-comp-name="muxer",
    iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER,
    iter-upstream-comp-class-name="muxer",
    iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu",
    iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:991)
    Cannot validate muxer's upstream message iterator wrapper:
    muxer-msg-iter-addr=0x5603c28dbe70,
    muxer-upstream-msg-iter-wrap-addr=0x5603c28cd0f0
  CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:454)
    Upstream iterator's next method returned an error: status=ERROR
  CAUSED BY [libbabeltrace2] (iterator.c:864)
    Component input port message iterator's "next" method failed:
    iter-addr=0x5603c28cb0f0, iter-upstream-comp-name="lttng-live",
    iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE,
    iter-upstream-comp-class-name="lttng-live",
    iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon",
    iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:1665)
    Error preparing the next batch of messages:
    live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:845)
    CTF message iterator failed to get next message: msg-iter=0x5603c28e2830,
    msg-iter-status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (msg-iter.c:2899)
    Cannot handle state: msg-it-addr=0x5603c28e2830,
    state=DSCOPE_TRACE_PACKET_HEADER_BEGIN
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (msg-iter.c:563)
    User function failed: status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (viewer-connection.c:1610)
    Received get_data_packet response: error

This occurs immediately following a 'stop' on the session. As the error
indicates, a request to obtain a data packet fails with a generic
error reply.

Moreover, the following LTTNG_VIEWER_DETACH_SESSION appears to fail
with an invalid status code. This is addressed in a different commit.

Reproducing the test's failure without redirecting the relay daemon's
output allows us to see the following errors after the first stop:
  PERROR - 14:33:44.929675253 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.030037417 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.130429370 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.230829447 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.331223320 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)

This is produced with the following back-trace:
  (gdb) bt
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x00007ffff69648b1 in __GI_abort () at abort.c:79
  #2  0x00005555555b4f1f in fd_tracker_open_fs_handle (tracker=0x55555582c620, directory=0x7fffe8006680,
      path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx", flags=0, mode=0x7ffff0a24508) at fd-tracker.c:550
  #3  0x0000555555595c34 in _lttng_trace_chunk_open_fs_handle_locked (chunk=0x7fffe0002130, file_path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx",
      flags=0, mode=432, out_handle=0x7ffff0a24710, expect_no_file=true) at trace-chunk.c:1388
  #4  0x0000555555595eef in lttng_trace_chunk_open_fs_handle (chunk=0x7fffe0002130, file_path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx", flags=0,
      mode=432, out_handle=0x7ffff0a24710, expect_no_file=true) at trace-chunk.c:1433
  #5  0x00005555555da6c2 in _lttng_index_file_create_from_trace_chunk (chunk=0x7fffe0002130, channel_path=0x7fffe8018c30 "ust/uid/1001/64-bit",
      stream_name=0x7fffe8018c10 "chan_1", stream_file_size=0, stream_file_index=0, index_major=1, index_minor=1, unlink_existing_file=false, flags=0,
      expect_no_file=true, file=0x7fffe0002270) at index.c:97
  #6  0x00005555555dad8a in lttng_index_file_create_from_trace_chunk_read_only (chunk=0x7fffe0002130, channel_path=0x7fffe8018c30 "ust/uid/1001/64-bit",
      stream_name=0x7fffe8018c10 "chan_1", stream_file_size=0, stream_file_index=0, index_major=1, index_minor=1, expect_no_file=true, file=0x7fffe0002270)
      at index.c:186
  #7  0x000055555557640f in try_open_index (vstream=0x7fffe0002250, rstream=0x7fffe8018c50) at live.c:1378
  #8  0x0000555555577155 in viewer_get_next_index (conn=0x7fffd4001440) at live.c:1643
  #9  0x0000555555579a01 in process_control (recv_hdr=0x7ffff0a27c30, conn=0x7fffd4001440) at live.c:2311
  #10 0x000055555557a1db in thread_worker (data=0x0) at live.c:2482
  #11 0x00007ffff6d1c6db in start_thread (arg=0x7ffff0a28700) at pthread_create.c:463
  #12 0x00007ffff6a45a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

That problem is mostly cosmetic in nature (the open can fail
"legitimately") as the PERROR should simply not be printed and is
addressed in a different commit.

This error is also produced after a 'clear' is issued:
  PERROR - 14:33:45.532782268 [25108/25115]: Failed to read from file system handle of viewer stream id 1, offset: 4096: No such file or directory (in viewer_get_packet() at live.c:1849)

Which is produced with the following back-trace:
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x00007f53e297c8b1 in __GI_abort () at abort.c:79
  #2  0x000055dd77ccef2c in viewer_get_packet (conn=0x7f53c4001100) at live.c:1850
  #3  0x000055dd77cd0a15 in process_control (recv_hdr=0x7f53dca3fc30, conn=0x7f53c4001100) at live.c:2315
  #4  0x000055dd77cd11db in thread_worker (data=0x0) at live.c:2483
  #5  0x00007f53e2d346db in start_thread (arg=0x7f53dca40700) at pthread_create.c:463
  #6  0x00007f53e2a5da3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

A similar problem occurs, although more rarely, when reading an
index entry in viewer_get_next_index().

Cause
=====

The following situation leads to both failures to get a
packet and failures to get the next index:
  - Viewer connects to an existing session,
  - Viewer consumes a number of packets, alternating the
    GET_NEXT_INDEX and GET_PACKET commands,
  - The session's streams are rotated to a new trace chunk
    (as part of a clear),
  - The session is started and stopped, causing new packets
    to be produced and received,
  - The session is stopped and destroyed, causing the session's
    streams to rotate into a "null" trace chunk (no active
    trace files),
  - Viewer issues GET_NEXT_INDEX or GET_PACKET, but the fact
    that a rotation occurred on the receiving end is not detected
    as the relay streams' trace chunks are "null".

The crux of the problem is that lttng_trace_chunk_ids_equal() is
bypassed when the current trace chunk of a relay stream is "null".

The rationale for skipping this check is that it is assumed that the
files currently opened by the live server can still be used even
if the consumer has rotated the corresponding streams into a 'null'
trace chunk, meaning no trace chunk is 'set' for those streams.

This makes sense in one scenario: the session was destroyed and we wish
to allow a connected live client to finish consuming the trace packets
up to the end of the session's lifetime.

Here, the situation is different. The viewer is reading chunk 'A'.
Meanwhile, a rotation occurs into chunk 'B' and packets are received for
chunk 'B'. Then, a rotation to a 'null' chunk (no active chunk) occurs.

In essence, the live server never sees the rotation between chunk 'A'
and 'B', and simply assumes that a rotation from 'A' to 'null' occurred,
as would happen at the end of a session.

In terms of the code, in viewer_get_next_index(), a call to
check_index_status() is performed to determine if an index is available.
The function checks that `index_received_seqcount` is greater than
`index_sent_seqcount`. In that case, it determines that an index must be
available.

Unfortunately, there is no way for the live server to determine that the
remaining indexes are in a chunk that doesn't exist anymore (chunk 'B').
Thus, viewer_get_next_index() attempts to read an index entry from the
current index file and fails.

Solution
========

1) lttng_trace_chunk_ids_equal() is modified to properly handle
'null' trace chunks:
  - A null and a non-null trace chunk are not equal,
  - Two null trace chunks are equal.

2) Rotation count
  A rotation counter is introduced to track the number of rotations
  that occurred during a relay stream's lifetime. This counter is
  sampled by the matching viewer streams on creation and on rotation
  and is used to determine if all rotations were "seen" by the viewer
  stream.

  Hence, this allows us to handle the special case where a viewer
  is consuming the contents of a relay stream that just transitioned
  into a 'null' trace chunk (see comments in patch).
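
  The two points above can be sketched as follows. This is a simplified
  model and not the code of the patch: `struct chunk`,
  `struct relay_stream_model`, `struct viewer_stream_model` and every
  function name are hypothetical, a NULL pointer stands for a 'null'
  trace chunk, and the staleness rule in `viewer_stream_is_stale()` is an
  assumption about how the counter is meant to be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a trace chunk; NULL models a 'null' chunk. */
struct chunk {
	uint64_t id;
};

/* Two null chunks are equal; a null and a non-null chunk are not. */
static bool chunk_ids_equal(const struct chunk *a, const struct chunk *b)
{
	if (!a || !b) {
		return a == b;
	}
	return a->id == b->id;
}

/* Hypothetical relay stream: current chunk plus a rotation counter. */
struct relay_stream_model {
	const struct chunk *current_chunk;
	uint64_t rotation_count;
};

/* Hypothetical viewer stream: remembers the last counter it sampled. */
struct viewer_stream_model {
	const struct chunk *current_chunk;
	uint64_t last_seen_rotation_count;
};

/* Every rotation, including one into a null chunk, bumps the counter. */
static void relay_stream_rotate(struct relay_stream_model *rstream,
		const struct chunk *new_chunk)
{
	rstream->current_chunk = new_chunk;
	rstream->rotation_count++;
}

/* Sample the relay stream state, as done on creation and on rotation. */
static void viewer_stream_sync(struct viewer_stream_model *vstream,
		const struct relay_stream_model *rstream)
{
	vstream->current_chunk = rstream->current_chunk;
	vstream->last_seen_rotation_count = rstream->rotation_count;
}

/*
 * Assumed rule: the viewer may keep reading its files only if it saw
 * every rotation, or if the single unseen rotation is the final one
 * into a null chunk (end of session).
 */
static bool viewer_stream_is_stale(const struct viewer_stream_model *vstream,
		const struct relay_stream_model *rstream)
{
	uint64_t unseen = rstream->rotation_count -
			vstream->last_seen_rotation_count;

	if (unseen == 0) {
		return false;
	}
	if (unseen == 1 && rstream->current_chunk == NULL) {
		return false;
	}
	return true;
}
```

  Under this model, the 'A' to 'B' to 'null' sequence described in the
  cause leaves two unseen rotations, so the viewer stream is flagged as
  stale instead of being treated like an ordinary end of session.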

The rest of the modifications simply allow the live server to handle
null trace chunks in viewer streams. This fixes another unrelated bug
that I observed while investigating this: sessions that don't have an
active trace chunk are not shown when listing sessions with babeltrace.

To reproduce, simply stop, clear a session, and attempt to list the
sessions of the associated relay daemon.

Known drawbacks
===============

None.

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ibb3116990e34b7ec3b477f3482d0c0ff1e848d09
frdeso added a commit to frdeso/lttng-tools that referenced this pull request Dec 15, 2020
…tion

Issue
=====
The code of this function triggers the following heap-buffer-overflow
warning when compiled with `-fsanitize=address` in specific situations:

  ==247225==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000001310 at pc 0x5559db6c575a bp 0x7f193e6faeb0 sp 0x7f193e6faea0
  READ of size 4 at 0x602000001310 thread T4 (Notification)
      #0 0x5559db6c5759 in hashlittle /home/frdeso/projets/lttng/tools/src/common/hashtable/utils.c:315
      #1 0x5559db6c6df4 in hash_key_str /home/frdeso/projets/lttng/tools/src/common/hashtable/utils.c:490
      #2 0x5559db5e3282 in hash_trigger_by_name_uid /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:378
      #3 0x5559db5ecbe3 in trigger_name_taken /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2333
      #4 0x5559db5ecd7c in generate_trigger_name /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2362
      #5 0x5559db5ed6e0 in handle_notification_thread_command_register_trigger /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2491
      #6 0x5559db5ef967 in handle_notification_thread_command /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2927
      #7 0x5559db5ddbb7 in thread_notification /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread.c:693
      #8 0x5559db60e56d in launch_thread /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/thread.c:66
      #9 0x7f19456ec608 in start_thread /build/glibc-ZN95T4/glibc-2.31/nptl/pthread_create.c:477
      #10 0x7f1945602292 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)

Given that the `k` pointer used in this loop is a `uint32_t *` we might
read bytes outside of the allocated key if the key is less than 4 bytes
long. As the comment about Valgrind explains, this is not a real problem
because memory protections are typically word bounded.

I tried to use the `__SANITIZE_ADDRESS__` define to select the
Valgrind implementation of this code when building with AddressSanitizer
but that still triggers the same heap-buffer-overflow warning.

Why wasn't that a problem before?
=======================================
The trigger feature will use small default names like "T0".

Workaround
==========
Exclude this function from the sanitizing using the compiler attribute
"no_sanitize_address".

Drawback
========
This removes our sanitizing coverage for this function.

Signed-off-by: Francis Deslauriers <[email protected]>
Change-Id: I82d0d3539916ed889faa93871f9b700064f2c52a
frdeso added a commit to frdeso/lttng-tools that referenced this pull request Dec 16, 2020
…tion

Issue
=====
The code of this function triggers the following heap-buffer-overflow
warning when compiled with `-fsanitize=address` in specific situations:

  ==247225==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000001310 at pc 0x5559db6c575a bp 0x7f193e6faeb0 sp 0x7f193e6faea0
  READ of size 4 at 0x602000001310 thread T4 (Notification)
      #0 0x5559db6c5759 in hashlittle /home/frdeso/projets/lttng/tools/src/common/hashtable/utils.c:315
      #1 0x5559db6c6df4 in hash_key_str /home/frdeso/projets/lttng/tools/src/common/hashtable/utils.c:490
      #2 0x5559db5e3282 in hash_trigger_by_name_uid /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:378
      #3 0x5559db5ecbe3 in trigger_name_taken /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2333
      #4 0x5559db5ecd7c in generate_trigger_name /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2362
      #5 0x5559db5ed6e0 in handle_notification_thread_command_register_trigger /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2491
      #6 0x5559db5ef967 in handle_notification_thread_command /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2927
      #7 0x5559db5ddbb7 in thread_notification /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread.c:693
      #8 0x5559db60e56d in launch_thread /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/thread.c:66
      #9 0x7f19456ec608 in start_thread /build/glibc-ZN95T4/glibc-2.31/nptl/pthread_create.c:477
      #10 0x7f1945602292 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)

Given that the `k` pointer used in this loop is a `uint32_t *` we might
read bytes outside of the allocated key if the key is less than 4 bytes
long. As the comment about Valgrind explains, this is not a real problem
because memory protections are typically word bounded.

I tried to use the `__SANITIZE_ADDRESS__` define to select the
Valgrind implementation of this code when building with AddressSanitizer
but that still triggers the same heap-buffer-overflow warning.

Why wasn't that a problem before?
=======================================
The trigger feature will use small default names like "T0".

Workaround
==========
Exclude this function from the sanitizing using the compiler attribute
"no_sanitize_address".

Drawback
========
This removes our sanitizing coverage for this function.

Signed-off-by: Francis Deslauriers <[email protected]>
Change-Id: I82d0d3539916ed889faa93871f9b700064f2c52a
jgalar pushed a commit that referenced this pull request Jan 5, 2021
…tion

Issue
=====
The code of this function triggers the following heap-buffer-overflow
warning when compiled with `-fsanitize=address` in specific situations:

  ==247225==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000001310 at pc 0x5559db6c575a bp 0x7f193e6faeb0 sp 0x7f193e6faea0
  READ of size 4 at 0x602000001310 thread T4 (Notification)
      #0 0x5559db6c5759 in hashlittle /home/frdeso/projets/lttng/tools/src/common/hashtable/utils.c:315
      #1 0x5559db6c6df4 in hash_key_str /home/frdeso/projets/lttng/tools/src/common/hashtable/utils.c:490
      #2 0x5559db5e3282 in hash_trigger_by_name_uid /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:378
      #3 0x5559db5ecbe3 in trigger_name_taken /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2333
      #4 0x5559db5ecd7c in generate_trigger_name /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2362
      #5 0x5559db5ed6e0 in handle_notification_thread_command_register_trigger /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2491
      #6 0x5559db5ef967 in handle_notification_thread_command /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2927
      #7 0x5559db5ddbb7 in thread_notification /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread.c:693
      #8 0x5559db60e56d in launch_thread /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/thread.c:66
      #9 0x7f19456ec608 in start_thread /build/glibc-ZN95T4/glibc-2.31/nptl/pthread_create.c:477
      #10 0x7f1945602292 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)

Given that the `k` pointer used in this loop is a `uint32_t *`, we might
read bytes outside of the allocated key if the key is less than 4 bytes
long. As the comment about Valgrind explains, this is not a real problem
because memory protections are typically word-bounded.

I tried to use the `__SANITIZE_ADDRESS__` define to select the
Valgrind implementation of this code when building with AddressSanitizer
but that still triggers the same heap-buffer-overflow warning.

Why wasn't that a problem before?
=================================
The trigger feature will use short default names like "T0", which are
less than 4 bytes long.

Workaround
==========
Exclude this function from sanitizing using the `no_sanitize_address`
compiler attribute.

Drawback
========
This removes our sanitizing coverage for this function.
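
The workaround above can be sketched as follows, assuming GCC or Clang.
`NO_SANITIZE_ADDRESS` and `masked_first_word` are hypothetical names used
for illustration only; this is not the actual `hashlittle` code, and the
masks assume a little-endian layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro: expands to nothing on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#define NO_SANITIZE_ADDRESS __attribute__((no_sanitize_address))
#else
#define NO_SANITIZE_ADDRESS
#endif

/*
 * Illustrative only: load one aligned 32-bit word and keep the first
 * `length` bytes of it (little-endian layout assumed). When length < 4,
 * the load covers bytes past the end of the key, which is what
 * AddressSanitizer reports if the function is instrumented.
 */
NO_SANITIZE_ADDRESS
static uint32_t masked_first_word(const uint32_t *k, size_t length)
{
	static const uint32_t masks[] = { 0x0, 0xff, 0xffff, 0xffffff };

	assert(k != NULL);
	return length >= 4 ? k[0] : (k[0] & masks[length]);
}
```

The attribute applies to the annotated function only, so callers and the
rest of the file remain instrumented.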

Signed-off-by: Francis Deslauriers <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I82d0d3539916ed889faa93871f9b700064f2c52a
jgalar pushed a commit that referenced this pull request Apr 13, 2021
When running

  $ lttng add-trigger --condition on-event -u ust_tests_demo2:loop --capture intfield --action notify

I get the leaks pasted below. It seems like filter_parser_ctx_free
doesn't free everything in filter_parser_ctx. Add what's missing.
Re-order the frees so that they are in the same order as the members of
the struct, just because it's easier to follow and make sure we didn't
forget anything.

=================================================================
==1073803==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff767783a in __interceptor_realloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x5555556833be in bytecode_reserve /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:59
    #2 0x55555568360f in bytecode_push /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:79
    #3 0x5555556a3d61 in filter_visitor_bytecode_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:667
    #4 0x55555569c9b1 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:394
    #5 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #6 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #7 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #8 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #9 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #10 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #11 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #12 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff767783a in __interceptor_realloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x5555556833be in bytecode_reserve /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:59
    #2 0x55555568360f in bytecode_push /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:79
    #3 0x5555556a1b94 in visit_node_load_expression_legacy /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:198
    #4 0x5555556a1d18 in visit_node_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:231
    #5 0x5555556a2540 in visit_node_load /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:399
    #6 0x5555556a3a8b in recursive_visit_gen_bytecode /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:622
    #7 0x5555556a12fa in visit_node_root /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:53
    #8 0x5555556a3a76 in recursive_visit_gen_bytecode /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:620
    #9 0x5555556a3c55 in filter_visitor_bytecode_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:661
    #10 0x55555569c9b1 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:394
    #11 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #12 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #13 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #14 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #15 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #16 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #17 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #18 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a3dd2 in make_op_root /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:35
    #2 0x5555556a73a5 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:874
    #3 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #4 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #5 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #6 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #7 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #8 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #9 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #10 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #11 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #12 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4f1d in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:280
    #2 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #3 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #4 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #5 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #6 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #7 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #8 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #9 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #10 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #11 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #12 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #13 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #14 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a484d in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:201
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4e64 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:262
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4bbc in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:233
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 9 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff761fa69 in __interceptor_strdup /build/gcc/src/gcc/libsanitizer/asan/asan_interceptors.cpp:452
    #1 0x5555556a4c41 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:238
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4829 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:196
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

SUMMARY: AddressSanitizer: 409 byte(s) leaked in 9 allocation(s).
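
The "free in struct member order" approach described at the top of this
message could look like the sketch below. `example_ctx` and its members
are hypothetical; the real `filter_parser_ctx` has different members.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context owning three heap allocations. */
struct example_ctx {
	char *expression;		/* owned copy of the input string */
	int *ir;			/* owned intermediate representation */
	unsigned char *bytecode;	/* owned generated bytecode */
};

/* Release every owned member, in declaration order, then the context. */
static void example_ctx_free(struct example_ctx *ctx)
{
	if (!ctx) {
		return;
	}

	free(ctx->expression);
	free(ctx->ir);
	free(ctx->bytecode);
	free(ctx);
}

static struct example_ctx *example_ctx_create(const char *expression)
{
	struct example_ctx *ctx;
	size_t len;

	assert(expression != NULL);
	ctx = calloc(1, sizeof(*ctx));
	if (!ctx) {
		return NULL;
	}

	len = strlen(expression) + 1;
	ctx->expression = malloc(len);
	ctx->ir = calloc(4, sizeof(*ctx->ir));
	ctx->bytecode = calloc(16, 1);
	if (!ctx->expression || !ctx->ir || !ctx->bytecode) {
		/* free(NULL) is a no-op, so a partial context is fine here. */
		example_ctx_free(ctx);
		return NULL;
	}

	memcpy(ctx->expression, expression, len);
	return ctx;
}
```

Keeping the `free()` calls in the same order as the struct members makes
a missing one easy to spot when comparing the two side by side.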

Change-Id: I04f9eb5ab7b18ae4ffdf7a49842768a6fdae5dbc
Signed-off-by: Simon Marchi <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
jgalar added a commit that referenced this pull request Apr 22, 2021
Issue observed
==============

lt-trigger_name: trigger.c:302: int lttng_trigger_serialize(const struct lttng_trigger *, struct lttng_payload *): Assertion `(creds->uid).is_set' failed.

Program terminated with signal SIGABRT, Aborted.
 #0  0x00007fb74129eef5 in raise () from /usr/lib/libc.so.6
 #1  0x00007fb741288862 in abort () from /usr/lib/libc.so.6
 #2  0x00007fb741288747 in __assert_fail_base.cold () from /usr/lib/libc.so.6
 #3  0x00007fb741297646 in __assert_fail () from /usr/lib/libc.so.6
 #4  0x00007fb74169bab7 in lttng_trigger_serialize (trigger=0x5616f6f70060, payload=0x7ffe5819d140) at trigger.c:302
 #5  0x00007fb74169cef0 in lttng_trigger_copy (trigger=0x5616f6f70060) at trigger.c:859
 #6  0x00007fb74164302e in lttng_unregister_trigger (trigger=0x5616f6f70060) at lttng-ctl.c:3350
 #7  0x00005616f50c675f in register_named_trigger () at trigger_name.c:295
 #8  0x00005616f50c6879 in main (argc=1, argv=0x7ffe581a07d8) at trigger_name.c:343

Cause
=====

When creating a trigger instance and using it to unregister an existing
trigger, its credentials are unset (meaning 'default'). Expecting this,
lttng_unregister_trigger() copies the source trigger to change its
credentials to those of the caller.

Unfortunately, the trigger copy operation expects credentials to be set.

We don't run into this situation typically since the trigger instance
used to perform the unregistration is sourced from a listing or is the
same instance that was used to perform the registration (which sets the
credentials before serializing).

Solution
========

A proper implementation of "copy" is provided for the trigger object
itself. For its condition and action, we still use the same "trick"
of leveraging the serdes code to perform a deep-copy, keeping the change
small.
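
As a rough illustration, a field-by-field copy that tolerates unset
credentials might look like this. `example_trigger` and
`example_optional_uid` are hypothetical stand-ins, not the actual
`lttng_trigger` definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical optional value: `value` is meaningful only if `is_set`. */
struct example_optional_uid {
	bool is_set;
	uid_t value;
};

/* Hypothetical trigger reduced to a name and optional credentials. */
struct example_trigger {
	char *name;
	struct example_optional_uid uid;
};

static void example_trigger_destroy(struct example_trigger *trigger)
{
	if (!trigger) {
		return;
	}

	free(trigger->name);
	free(trigger);
}

/* Copy the object directly; unset credentials simply stay unset. */
static struct example_trigger *example_trigger_copy(
		const struct example_trigger *src)
{
	struct example_trigger *copy;

	assert(src != NULL);
	copy = calloc(1, sizeof(*copy));
	if (!copy) {
		return NULL;
	}

	if (src->name) {
		const size_t len = strlen(src->name) + 1;

		copy->name = malloc(len);
		if (!copy->name) {
			example_trigger_destroy(copy);
			return NULL;
		}
		memcpy(copy->name, src->name, len);
	}

	copy->uid = src->uid;
	return copy;
}
```

Unlike a serialize-then-deserialize round trip, nothing here asserts
that the credentials are set, so the caller can fill them in afterwards.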

Drawbacks
=========

None really, except that we lose some of the code sharing between
copy and serdes.

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I71b7b075c959bc4935621543c4d379f62b7dabdf
jgalar pushed a commit that referenced this pull request Apr 23, 2021
Observed issue
==============

A deadlock is observed during the start-stop test suite for triggers.

Cause
=====

A start session action is executed by the action executor: the
`cmd_start_trace` function is called and effectively holds the
`session_list_lock`. During `cmd_start_trace`, a call to
`notification_thread_command_add_channel` is performed to inform the
notification thread of the new channel's presence.

At the same time, a tracer event notification is received by the
notification thread. The actions are queued up, the session id is
sampled, and a call to `session_lock_list` is performed, which blocks on
the lock operation.

The notification thread waits on the `session_list_lock`, and the
`session_list_lock` holder, the action executor, waits on the completion
of a command to be run by the notification thread: deadlock.

The backtrace:

 Thread 6 (Thread 0x7f831c8a6700 (LWP 3046458)):
 #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
 #1  0x000000000053b852 in futex (uaddr=0x7f831c8a45e0, op=0, val=0, timeout=0x0, uaddr2=0x0, val3=0) at /home/joraj/lttng/master/install/include/urcu/futex.h:72
 #2  0x000000000053b4f9 in futex_noasync (uaddr=0x7f831c8a45e0, op=0, val=0, timeout=0x0, uaddr2=0x0, val3=0) at /home/joraj/lttng/master/install/include/urcu/futex.h:81
 #3  0x000000000053af10 in lttng_waiter_wait (waiter=0x7f831c8a45d8) at waiter.c:55
 #4  0x000000000046b0f2 in run_command_wait (handle=0xe60520, cmd=0x7f831c8a4588) at notification-thread-commands.c:49
 #5  0x000000000046b270 in notification_thread_command_add_channel (handle=0xe60520, session_name=0x7f8300006c30 "my_triggered_session", uid=1000, gid=1000, channel_name=0x7f82dc00be04 "channel0", key=1, domain=LTTNG_DOMAIN_UST, capacity=2097152) at notification-thread-commands.c:184
 #6  0x00000000004c7f65 in create_channel_per_uid (app=0x7f82d8000bf0, usess=0x7f8300000bb0, ua_sess=0x7f82dc002600, ua_chan=0x7f82dc00bde0) at ust-app.c:3360
 #7  0x00000000004c6f98 in ust_app_channel_send (app=0x7f82d8000bf0, usess=0x7f8300000bb0, ua_sess=0x7f82dc002600, ua_chan=0x7f82dc00bde0) at ust-app.c:3514
 #8  0x00000000004c6bde in ust_app_channel_create (usess=0x7f8300000bb0, ua_sess=0x7f82dc002600, uchan=0x7f8300005a90, app=0x7f82d8000bf0, _ua_chan=0x7f831c8a48b0) at ust-app.c:4771
 #9  0x00000000004c6968 in find_or_create_ust_app_channel (usess=0x7f8300000bb0, ua_sess=0x7f82dc002600, app=0x7f82d8000bf0, uchan=0x7f8300005a90, ua_chan=0x7f831c8a48b0) at ust-app.c:5610
 #10 0x00000000004c4f09 in ust_app_synchronize_all_channels (usess=0x7f8300000bb0, ua_sess=0x7f82dc002600, app=0x7f82d8000bf0) at ust-app.c:5820
 #11 0x00000000004b958c in ust_app_synchronize (usess=0x7f8300000bb0, app=0x7f82d8000bf0) at ust-app.c:5886
 #12 0x00000000004b8500 in ust_app_global_update (usess=0x7f8300000bb0, app=0x7f82d8000bf0) at ust-app.c:5960
 #13 0x00000000004b7ec2 in ust_app_start_trace_all (usess=0x7f8300000bb0) at ust-app.c:5520
 #14 0x0000000000444e86 in cmd_start_trace (session=0x7f8300006c30) at cmd.c:2707
 #15 0x00000000004a5af9 in action_executor_start_session_handler (executor=0x7f8314004410, work_item=0x7f8314005100, item=0x7f83140050b0) at action-executor.c:342
 #16 0x00000000004a537f in action_executor_generic_handler (executor=0x7f8314004410, work_item=0x7f8314005100, item=0x7f83140050b0) at action-executor.c:696
 #17 0x00000000004a4dbc in action_work_item_execute (executor=0x7f8314004410, work_item=0x7f8314005100) at action-executor.c:715
 #18 0x00000000004a37e6 in action_executor_thread (_data=0x7f8314004410) at action-executor.c:797
 #19 0x0000000000486193 in launch_thread (data=0x7f83140044b0) at thread.c:66
 #20 0x00007f8320b60609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #21 0x00007f8320a87293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 5 (Thread 0x7f831d0a7700 (LWP 3046457)):
 #0  __lll_lock_wait (futex=futex@entry=0x5e1c10 <ltt_session_list>, private=0) at lowlevellock.c:52
 #1  0x00007f8320b630a3 in __GI___pthread_mutex_lock (mutex=0x5e1c10 <ltt_session_list>) at ../nptl/pthread_mutex_lock.c:80
 #2  0x00000000004378c3 in session_lock_list () at session.c:156
 #3  0x00000000004a871c in add_action_to_subitem_array (action=0x7f830001a730, subitems=0x7f83140051d0) at action-executor.c:1081
 #4  0x00000000004a8578 in add_action_to_subitem_array (action=0x7f830001a620, subitems=0x7f83140051d0) at action-executor.c:1025
 #5  0x00000000004a4922 in populate_subitem_array_from_trigger (trigger=0x7f830001a950, subitems=0x7f83140051d0) at action-executor.c:1116
 #6  0x00000000004a416e in action_executor_enqueue_trigger (executor=0x7f8314004410, trigger=0x7f830001a950, evaluation=0x7f8314005190, object_creds=0x0, client_list=0x7f8314004980) at action-executor.c:924
 #7  0x0000000000479481 in dispatch_one_event_notifier_notification (state=0x7f831d0a63e8, notification=0x7f8314005160) at notification-thread-events.c:4613
 #8  0x0000000000472324 in handle_one_event_notifier_notification (state=0x7f831d0a63e8, pipe=65, domain=LTTNG_DOMAIN_UST) at notification-thread-events.c:4702
 #9  0x0000000000472271 in handle_notification_thread_event_notification (state=0x7f831d0a63e8, pipe=65, domain=LTTNG_DOMAIN_UST) at notification-thread-events.c:4717
 #10 0x00000000004695a3 in handle_event_notification_pipe (event_source_fd=65, domain=LTTNG_DOMAIN_UST, revents=1, state=0x7f831d0a63e8) at notification-thread.c:591
 #11 0x000000000046849b in thread_notification (data=0xe60520) at notification-thread.c:727
 #12 0x0000000000486193 in launch_thread (data=0xe60610) at thread.c:66
 #13 0x00007f8320b60609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #14 0x00007f8320a87293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Solution
========

Instead of using `session_find_by_name()`, which requires the
`session_list_lock`, we introduce `sample_session_id_by_name`, which uses
a urcu-backed data structure. This allows the sampling of the session
id without holding the session list lock. We accept the small window
where a session object is still accessible but concretely not valid,
since the actual execution context will be validated at the moment of
execution. The execution side already handles the possibility that the
session is removed at that point or is not the same session. The
execution side acquires the `session_list_lock` for validation.
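
The sample-then-validate pattern can be sketched as follows. This is a
simplified illustration, not the actual implementation: a C11 atomic
stands in for the urcu-backed structure, the registry holds a single
session, and every `example_*` name is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_NO_SESSION 0

/* Hypothetical one-slot registry; the real code uses a urcu hash table. */
static const char *const example_session_name = "my_triggered_session";
static _Atomic uint64_t example_session_id = 42;
static pthread_mutex_t example_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Enqueue side: sample the id without taking example_list_lock. */
static bool example_sample_session_id(const char *name, uint64_t *id)
{
	uint64_t sampled;

	assert(name != NULL && id != NULL);
	if (strcmp(name, example_session_name) != 0) {
		return false;
	}

	sampled = atomic_load(&example_session_id);
	if (sampled == EXAMPLE_NO_SESSION) {
		return false;
	}

	*id = sampled;
	return true;
}

/* Execution side: validate the sampled id under the list lock. */
static bool example_validate_session_id(uint64_t sampled_id)
{
	bool valid;

	pthread_mutex_lock(&example_list_lock);
	valid = atomic_load(&example_session_id) == sampled_id;
	pthread_mutex_unlock(&example_list_lock);
	return valid;
}

/* Session destruction, performed under the list lock. */
static void example_destroy_session(void)
{
	pthread_mutex_lock(&example_list_lock);
	atomic_store(&example_session_id, EXAMPLE_NO_SESSION);
	pthread_mutex_unlock(&example_list_lock);
}
```

Because the sampling path never touches the list lock, it cannot block
behind a thread that holds that lock while waiting on the sampler.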

Known drawbacks
===============

None.

Signed-off-by: Jonathan Rajotte <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I5ad2c57acc0d03d2814dda59f8ecf2d831fd961e
jgalar pushed a commit that referenced this pull request Apr 28, 2021
Issue observed
==============

When running the test_notification_ust_buffer_usage test on x86
(32 bit), the session daemon and test client both crash. The session
daemon dies while attempting to lock a NULL client list during the
execution of an enqueued action in the action executor.

See the following backtrace:

 #0  0xf7c6c756 in __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:67
 #1  0x565afe96 in notification_client_list_send_evaluation (client_list=0x0, trigger=0xf0f225e0, evaluation=0xf330c830, source_object_creds=0xf330e5cc, client_report=0x565cf81b <client_handle_transmission_status>, user_data=0xf330c320) at notification-thread-events.c:4372
 #2  0x565cfb41 in action_executor_notify_handler (executor=0xf330c320, work_item=0xf330e5b0, item=0xf330c7b0) at action-executor.c:269
 #3  0x565d1a58 in action_executor_generic_handler (executor=0xf330c320, work_item=0xf330e5b0, item=0xf330c7b0) at action-executor.c:696
 #4  0x565d1b7f in action_work_item_execute (executor=0xf330c320, work_item=0xf330e5b0) at action-executor.c:715
 #5  0x565d212f in action_executor_thread (_data=0xf330c320) at action-executor.c:797
 #6  0x565b9d0e in launch_thread (data=0xf330c390) at thread.c:66
 #7  0xf7c69fd2 in start_thread (arg=<optimized out>) at pthread_create.c:486
 #8  0xf7b7f6d6 in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:108

This crash causes an assertion to fail in the test client; checking for
data pending was not expected to return a negative value. In this case,
the negative return value is justified as it is -LTTNG_ERR_NO_SESSIOND.

Cause
=====

Equipped with coffee, a debugger, and a healthy dose of print
statements, it appeared that the following was taking place:

- Register a trigger (T1): high buffer usage (0.99) -> notify (succeeds)
- Subscribe to high buffer usage (0.99) notifications (succeeds)
- Subscribe to high buffer usage (0.99) notifications
  (fails as a duplicate, as expected)
- Unregister trigger (fails unexpectedly)
- Notification client destroys its channel, causing the condition to be
  unsubscribed from

- Another test registers a trigger (T2): high buffer usage (0.90) ->
  notify (succeeds)
- Session daemon evaluates a channel sample against T1's condition,
  which evaluates to true and produces an "evaluation" to send to
  clients
- The client list associated to T1's condition is not found (but this
  isn't checked)
- An action executor work item is queued to run T1's actions (notify),
  but without a client list, resulting in the crash when it is executed.

We could confirm that the client list associated to T1's condition was
created and never destroyed, making the failure to find it rather
puzzling.

It turns out that the hash of T1's condition did not match the hash of
the client list's condition. This is unexpected as both conditions are
copies of one another.

On x86, the scheme being used to transmit the condition's buffer usage
threshold floating point value is not compiled to numerically stable
code. Serializing such a buffer condition and creating it from the
resulting payload in a loop showed that the threshold value gradually
drifted. This isn't the case on the other architectures we support.

On x86-64, gcc makes use of SSE instructions to perform the conversion
to an integral value (with double precision). However, on x86, it makes
use of the x87 FPU stack instructions, which carry 80 bits of precision
internally, resulting in a loss of precision as the value is
transformed, back and forth, between the 80-bit and double precision
representations.

Solution
========

Since conditions are not carried between hosts (only between clients
and the session daemon), a fixed-point conversion scheme is unnecessary.
The 'double' value provided by the client is carried directly, which
bypasses the problem completely.
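
A minimal sketch of carrying the `double` directly, assuming both ends
run on the same host and therefore share the same representation.
`example_payload` and the two helpers are hypothetical names, not the
actual serialization code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size payload holding the raw bytes of a double. */
struct example_payload {
	unsigned char bytes[sizeof(double)];
};

/* Copy the object representation as-is: no arithmetic, no rounding. */
static void example_serialize_threshold(double threshold,
		struct example_payload *payload)
{
	assert(payload != NULL);
	memcpy(payload->bytes, &threshold, sizeof(threshold));
}

static double example_deserialize_threshold(
		const struct example_payload *payload)
{
	double threshold;

	assert(payload != NULL);
	memcpy(&threshold, payload->bytes, sizeof(threshold));
	return threshold;
}
```

Since only bytes are copied, repeating the round trip any number of
times yields the same bit pattern, so hashes computed from the
threshold stay stable.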

Drawbacks
=========

None.

Signed-off-by: Francis Deslauriers <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ie524e7362626406327f4f56e1dba5c8cf469df31
jgalar pushed a commit that referenced this pull request May 11, 2021
When running

  $ lttng add-trigger --condition on-event -u ust_tests_demo2:loop --capture intfield --action notify

I get the leaks pasted below. It seems like filter_parser_ctx_free
doesn't free everything in filter_parser_ctx. Add what's missing.
Re-order the frees so that they are in the same order as the members of
the struct, which makes the function easier to follow and makes it
easier to check that nothing was forgotten.

=================================================================
==1073803==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff767783a in __interceptor_realloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x5555556833be in bytecode_reserve /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:59
    #2 0x55555568360f in bytecode_push /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:79
    #3 0x5555556a3d61 in filter_visitor_bytecode_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:667
    #4 0x55555569c9b1 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:394
    #5 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #6 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #7 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #8 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #9 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #10 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #11 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #12 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff767783a in __interceptor_realloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x5555556833be in bytecode_reserve /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:59
    #2 0x55555568360f in bytecode_push /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:79
    #3 0x5555556a1b94 in visit_node_load_expression_legacy /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:198
    #4 0x5555556a1d18 in visit_node_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:231
    #5 0x5555556a2540 in visit_node_load /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:399
    #6 0x5555556a3a8b in recursive_visit_gen_bytecode /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:622
    #7 0x5555556a12fa in visit_node_root /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:53
    #8 0x5555556a3a76 in recursive_visit_gen_bytecode /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:620
    #9 0x5555556a3c55 in filter_visitor_bytecode_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:661
    #10 0x55555569c9b1 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:394
    #11 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #12 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #13 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #14 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #15 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #16 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #17 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #18 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a3dd2 in make_op_root /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:35
    #2 0x5555556a73a5 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:874
    #3 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #4 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #5 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #6 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #7 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #8 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #9 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #10 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #11 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #12 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4f1d in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:280
    #2 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #3 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #4 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #5 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #6 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #7 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #8 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #9 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #10 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #11 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #12 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #13 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #14 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a484d in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:201
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4e64 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:262
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4bbc in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:233
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 9 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff761fa69 in __interceptor_strdup /build/gcc/src/gcc/libsanitizer/asan/asan_interceptors.cpp:452
    #1 0x5555556a4c41 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:238
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4829 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:196
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

SUMMARY: AddressSanitizer: 409 byte(s) leaked in 9 allocation(s).
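
The approach described at the top of this message, releasing every
member of the context in the order in which the struct declares them,
can be sketched as follows. The struct and its members are
hypothetical stand-ins chosen for the example; they are not the
members of filter_parser_ctx.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context; the members are illustrative only. */
struct parser_ctx {
	char *expression;        /* copy of the input string */
	unsigned char *ir;       /* intermediate representation */
	unsigned char *bytecode; /* generated bytecode */
};

/*
 * Free the members in declaration order so that the function can be
 * checked against the struct definition line by line. free(NULL) is
 * a no-op, so a partially built context is handled as well.
 */
static void parser_ctx_free(struct parser_ctx *ctx)
{
	if (!ctx) {
		return;
	}

	free(ctx->expression);
	free(ctx->ir);
	free(ctx->bytecode);
	free(ctx);
}

/* Build a context; on any allocation failure, free what exists. */
static struct parser_ctx *parser_ctx_create(const char *expression)
{
	struct parser_ctx *ctx;
	size_t len;

	assert(expression);
	len = strlen(expression) + 1;

	/* calloc() leaves unset members NULL for parser_ctx_free(). */
	ctx = calloc(1, sizeof(*ctx));
	if (!ctx) {
		return NULL;
	}

	ctx->expression = malloc(len);
	ctx->ir = calloc(1, 40);
	ctx->bytecode = calloc(1, 128);
	if (!ctx->expression || !ctx->ir || !ctx->bytecode) {
		parser_ctx_free(ctx);
		return NULL;
	}

	memcpy(ctx->expression, expression, len);
	return ctx;
}
```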

Signed-off-by: Simon Marchi <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ic338ea1689d3f002bf9cade6d4f23e62d935968b
jgalar pushed a commit that referenced this pull request May 11, 2021
When running

  $ lttng add-trigger --condition on-event -u ust_tests_demo2:loop --capture intfield --action notify

I get the leaks pasted below. It seems like filter_parser_ctx_free
doesn't free everything in filter_parser_ctx. Add what's missing.
Re-order the frees so that they are in the same order as the members of
the struct, which makes the function easier to follow and makes it
easier to check that nothing was forgotten.

=================================================================
==1073803==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff767783a in __interceptor_realloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x5555556833be in bytecode_reserve /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:59
    #2 0x55555568360f in bytecode_push /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:79
    #3 0x5555556a3d61 in filter_visitor_bytecode_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:667
    #4 0x55555569c9b1 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:394
    #5 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #6 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #7 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #8 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #9 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #10 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #11 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #12 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff767783a in __interceptor_realloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x5555556833be in bytecode_reserve /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:59
    #2 0x55555568360f in bytecode_push /home/simark/src/lttng-tools/src/common/bytecode/bytecode.c:79
    #3 0x5555556a1b94 in visit_node_load_expression_legacy /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:198
    #4 0x5555556a1d18 in visit_node_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:231
    #5 0x5555556a2540 in visit_node_load /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:399
    #6 0x5555556a3a8b in recursive_visit_gen_bytecode /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:622
    #7 0x5555556a12fa in visit_node_root /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:53
    #8 0x5555556a3a76 in recursive_visit_gen_bytecode /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:620
    #9 0x5555556a3c55 in filter_visitor_bytecode_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-bytecode.c:661
    #10 0x55555569c9b1 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:394
    #11 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #12 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #13 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #14 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #15 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #16 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #17 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #18 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a3dd2 in make_op_root /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:35
    #2 0x5555556a73a5 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:874
    #3 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #4 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #5 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #6 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #7 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #8 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #9 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #10 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #11 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #12 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4f1d in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:280
    #2 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #3 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #4 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #5 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #6 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #7 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #8 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #9 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #10 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #11 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #12 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #13 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #14 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a484d in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:201
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4e64 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:262
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4bbc in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:233
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 9 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff761fa69 in __interceptor_strdup /build/gcc/src/gcc/libsanitizer/asan/asan_interceptors.cpp:452
    #1 0x5555556a4c41 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:238
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Indirect leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7ffff7677639 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x5555556a4829 in create_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:196
    #2 0x5555556a5040 in make_op_load_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:287
    #3 0x5555556a696f in make_expression /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:637
    #4 0x5555556a73df in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:882
    #5 0x5555556a7382 in generate_ir_recursive /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:870
    #6 0x5555556a74d6 in filter_visitor_ir_generate /home/simark/src/lttng-tools/src/common/filter/filter-visitor-generate-ir.c:903
    #7 0x55555569c859 in filter_parser_ctx_create_from_filter_expression /home/simark/src/lttng-tools/src/common/filter/filter-parser.y:353
    #8 0x55555560542e in parse_event_rule /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:704
    #9 0x555555607429 in handle_condition_event /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1088
    #10 0x555555608760 in parse_condition /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1326
    #11 0x55555560bca0 in cmd_add_trigger /home/simark/src/lttng-tools/src/bin/lttng/commands/add_trigger.c:1925
    #12 0x555555616b55 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
    #13 0x555555617516 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:421
    #14 0x555555617812 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:470
    #15 0x7ffff700bb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

SUMMARY: AddressSanitizer: 409 byte(s) leaked in 9 allocation(s).

Signed-off-by: Simon Marchi <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ic338ea1689d3f002bf9cade6d4f23e62d935968b
jgalar added a commit that referenced this pull request Apr 12, 2022
Observed issue
==============

When servicing a large number of tracer notifications and sending
notifications to clients, the session daemon occasionally hits
an assertion:

  #4  0x00007fb224d7d116 in __assert_fail () from /usr/lib/libc.so.6
  #5  0x000056038b2fe4d7 in client_flush_outgoing_queue (client=0x7fb21400c3b0) at notification-thread-events.cpp:3586
  #6  0x000056038b2ff819 in handle_notification_thread_client_out (state=0x7fb221974090, socket=77) at notification-thread-events.cpp:4104
  #7  0x000056038b2f3d77 in thread_notification (data=0x56038cc7fe90) at notification-thread.cpp:763
  #8  0x000056038b30ca7d in launch_thread (data=0x56038cc7e220) at thread.cpp:66
  #9  0x00007fb224dcf5c2 in start_thread () from /usr/lib/libc.so.6
  #10 0x00007fb224e54584 in clone () from /usr/lib/libc.so.6

Cause
=====

Under some circumstances, a client "out" event can be received
when no payload is left to send.

Many threads can flush a client's outgoing queue and, if they
had to queue their message (socket was full), will use the
"communication update" command to signal the (e)poll thread
to monitor for space being made available in the socket.

Commands are sent over an internal pipe serviced by the same
thread as the client sockets.

When space is made available in the socket, there is a race
between the (e)poll thread and the other threads that may
wish to use the client's socket to flush its outgoing queue.

A non-(e)poll thread may attempt to flush the queue, and
succeed, before the (e)poll thread gets a chance to service
the client's "out" event.

In this situation, the (e)poll thread processing the client
out event will see an empty payload: there is nothing to do.

Solution
========

The (e)poll thread can simply ignore the "client out" event
when an empty payload is seen.

There is also no need to update the transmission status as
the other thread has already enqueued a "communication
update" command to do so.
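
As a rough illustration of the approach described above, here is a
minimal sketch of an "out" event handler that tolerates an empty queue.
The `client` structure, `flush_result` and `handle_client_out` are
hypothetical stand-ins invented for this example. They are not the
lttng-tools types or functions, and the real handler writes to a socket
rather than counting bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

// Hypothetical stand-in for a notification client (not the lttng-tools type).
struct client {
	std::mutex lock;
	std::deque<std::string> outgoing_queue;
	std::size_t bytes_sent = 0;
};

enum class flush_result { NOTHING_TO_SEND, FLUSHED };

// Called by the (e)poll thread when the client's socket becomes writable.
flush_result handle_client_out(client &c)
{
	std::lock_guard<std::mutex> guard(c.lock);

	if (c.outgoing_queue.empty()) {
		// Another thread won the race and already flushed the queue:
		// ignore the event instead of treating it as an error.
		return flush_result::NOTHING_TO_SEND;
	}

	// Pretend to send every queued message (assumption: sending never fails).
	for (const auto &message : c.outgoing_queue) {
		c.bytes_sent += message.size();
	}
	c.outgoing_queue.clear();
	assert(c.outgoing_queue.empty());
	return flush_result::FLUSHED;
}
```

Calling `handle_client_out` on a client whose queue was already drained
returns `NOTHING_TO_SEND` and leaves the client untouched, which is the
"simply ignore" behaviour the commit message describes.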

Known drawbacks
===============

None.

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I8a181bea1e37e8e14cc67b624b76d139b488eded
jgalar added a commit that referenced this pull request Apr 25, 2022
Issue observed
==============

Address sanitizer reports the following invalid accesses while running
the test_mi test.

❯ ASAN_OPTIONS=detect_odr_violation=0 lttng-sessiond
=================================================================
==289173==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60400000e280 at pc 0x55cbbe35e2e0 bp 0x7f01672f1550 sp 0x7f01672f1540
WRITE of size 4 at 0x60400000e280 thread T13
    #0 0x55cbbe35e2df in mark_thread_as_ready /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:32
    #1 0x55cbbe360160 in thread_consumer_management /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:267
    #2 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66
    #3 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)
    #4 0x7f0172a46583 in __clone (/usr/lib/libc.so.6+0x112583)

0x60400000e280 is located 8 bytes to the right of 40-byte region [0x60400000e250,0x60400000e278)
allocated by thread T7 here:
    #0 0x7f01733b1fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55cbbe33adf3 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55cbbe33ae03 in thread_notifiers* zmalloc<thread_notifiers>() ../../../src/common/macros.hpp:89
    #3 0x55cbbe3617f9 in launch_consumer_management_thread(consumer_data*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:440
    #4 0x55cbbe33cf49 in spawn_consumer_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:188
    #5 0x55cbbe33f7cf in start_consumerd /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:394
    #6 0x55cbbe345713 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:1277
    #7 0x55cbbe34d74b in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2622
    #8 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66
    #9 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Thread T13 created by T7 here:
    #0 0x7f0173353eb7 in __interceptor_pthread_create /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x55cbbe336f9e in lttng_thread_create(char const*, void* (*)(void*), bool (*)(void*), void (*)(void*), void*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:106
    #2 0x55cbbe3618cc in launch_consumer_management_thread(consumer_data*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:453
    #3 0x55cbbe33cf49 in spawn_consumer_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:188
    #4 0x55cbbe33f7cf in start_consumerd /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:394
    #5 0x55cbbe345713 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:1277
    #6 0x55cbbe34d74b in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2622
    #7 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66
    #8 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Thread T7 created by T0 here:
    #0 0x7f0173353eb7 in __interceptor_pthread_create /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x55cbbe336f9e in lttng_thread_create(char const*, void* (*)(void*), bool (*)(void*), void (*)(void*), void*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:106
    #2 0x55cbbe34eebf in launch_client_thread() /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2756
    #3 0x55cbbe27f31a in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/main.cpp:1838
    #4 0x7f017296130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:32 in mark_thread_as_ready
Shadow bytes around the buggy address:
  0x0c087fff9c00: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c087fff9c10: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c087fff9c20: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c087fff9c30: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c087fff9c40: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 fa
=>0x0c087fff9c50:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9c60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9c70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9c80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9c90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9ca0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==289173==ABORTING

Cause
=====

The start functions of the various worker threads of the session daemon
are implemented in separate translation units (TU). To make use of the
lttng_thread API, they all define different control structures to
control their shutdown.

Those structures are all named 'thread_notifiers' and are all allocated
using zmalloc<>. The various instances of zmalloc<thread_notifiers> all
end up having the same mangled name (e.g.
_Z7zmallocI16thread_notifiersEPT_v).

At link time, only one instance of zmalloc<thread_notifiers> is kept.
Since those structures all have different layouts and sizes, this is
problematic. However, the linker is allowed to do this under the ODR
[1].

I first considered making the various memory allocation functions in
macros.hpp 'static' which results in each TU holding the appropriate
specialization of the various functions. While this works, it doesn't
make us ODR-compliant. To make a long story short, a program defining
multiple types sharing the same name, in the same namespace, is
ill-formed.

Another concern is that marking all templated free-functions as static
will eventually result in code bloat.

Solution
========

All structures defined in TUs (but not in a header) are placed in
unnamed namespaces (also called anonymous namespaces) [2].

This results in separate copies of the templated functions being
generated when specialized using a structure in an anonymous
namespace (e.g. _Z7zmallocIN12_GLOBAL__N_116thread_notifiersEEPT_v).
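
A minimal sketch of the pattern, as it might look inside a single TU.
`zmalloc_sketch` is a simplified stand-in for the zmalloc<> helper, and
the members of `thread_notifiers` are made up for the example. Neither
is copied from lttng-tools.

```cpp
#include <cassert>
#include <cstdlib>

// Simplified stand-in for zmalloc<>: a zeroed allocation sized for T.
template <typename T>
T *zmalloc_sketch()
{
	return static_cast<T *>(std::calloc(1, sizeof(T)));
}

namespace {
// Declared in an unnamed namespace, this type is local to the TU. A
// same-named structure in another TU is a distinct type, so the two
// resulting zmalloc_sketch<thread_notifiers> instances no longer clash.
struct thread_notifiers {
	int ready_fd;        // hypothetical member
	int quit_fd;         // hypothetical member
	void *consumer_data; // hypothetical member
};

thread_notifiers *create_notifiers()
{
	return zmalloc_sketch<thread_notifiers>();
}
} // namespace
```

Another TU can define its own `thread_notifiers` with a different layout
in its own unnamed namespace. Each TU then gets a specialization sized
for its own structure.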

We could have renamed the various `thread_notifiers` structures to give
them different names. However, I found those are not the only structures
sharing a name in different TUs. For instance, the same problem applies
to `struct lttng_index` (index in a stream, index in a map).

I propose we systematically namespace structures defined in TUs in the
future.

This will also save us trouble if those POD structures eventually become
non-POD: we would experience the same "clashes" if those structures had
constructors, for example.

References
==========

[1] https://en.cppreference.com/w/cpp/language/definition
[2] https://en.cppreference.com/w/cpp/language/namespace

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I867e5a287ad8cf3ada617335bc1a80b800bf0833
jgalar added a commit that referenced this pull request Apr 25, 2022
LeakSanitizer reports the following leak:

==974957==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7fdb86fcd1b2 in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x7fdb86d7c296 in lttng_dynamic_buffer_set_capacity(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:159
    #2 0x7fdb86d7c060 in lttng_dynamic_buffer_set_size(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:112
    #3 0x7fdb86d2589a in recv_payload_sessiond /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:230
    #4 0x7fdb86d26fa5 in lttng_ctl_ask_sessiond_payload(lttng_payload_view*, lttng_payload*) /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:662
    #5 0x7fdb86d2cd8d in lttng_list_tracepoint_fields /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:1767
    #6 0x56481623cb4c in list_ust_event_fields commands/list.cpp:850
    #7 0x5648162448d9 in cmd_list(int, char const**) commands/list.cpp:2394
    #8 0x56481628fb3e in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:238
    #9 0x564816290601 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
    #10 0x564816290908 in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:476
    #11 0x7fdb8661730f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

SUMMARY: AddressSanitizer: 32 byte(s) leaked in 1 allocation(s).

The session daemon's reply is indeed never released in
lttng_list_tracepoint_fields.
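
One way to avoid this class of leak is to tie the reply's lifetime to
a scope so that every exit path releases it. The sketch below assumes a
hypothetical `reply` buffer with `reply_create` and `reply_destroy`
helpers. It does not reproduce the lttng-ctl code or the actual fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical reply buffer (not the lttng_payload type).
struct reply {
	char *data;
	std::size_t size;
};

// Counts outstanding replies so the example can check for leaks.
int live_replies = 0;

reply *reply_create(std::size_t size)
{
	reply *r = static_cast<reply *>(std::calloc(1, sizeof(reply)));
	if (!r) {
		return nullptr;
	}
	r->data = static_cast<char *>(std::calloc(size ? size : 1, 1));
	if (!r->data) {
		std::free(r);
		return nullptr;
	}
	r->size = size;
	live_replies++;
	return r;
}

void reply_destroy(reply *r)
{
	if (!r) {
		return;
	}
	assert(live_replies > 0);
	std::free(r->data);
	std::free(r);
	live_replies--;
}

struct reply_deleter {
	void operator()(reply *r) const { reply_destroy(r); }
};
using reply_ptr = std::unique_ptr<reply, reply_deleter>;

// Returns the number of fixed-size fields in the reply, or -1 on error.
int count_fields(std::size_t reply_size, std::size_t field_size)
{
	reply_ptr r(reply_create(reply_size));
	if (!r) {
		return -1;
	}
	if (field_size == 0 || r->size % field_size != 0) {
		return -1; // early return: the deleter still releases the reply
	}
	return static_cast<int>(r->size / field_size);
}
```

Both the success path and the early error return leave `live_replies`
at zero, since the `unique_ptr` deleter runs when `r` goes out of scope.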

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Idd244b52a69f3b74e5c131c1c36c6ee6d76f4285
jgalar added a commit that referenced this pull request Apr 25, 2022
==1175545==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8696 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707ddc6004 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55707ddceb17 in ltt_ust_session* zmalloc<ltt_ust_session>() ../../../src/common/macros.hpp:89
    #3 0x55707ddc81e7 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:274
    #4 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #5 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #6 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 24672 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707dee4ec1 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55707def774e in consumer_output* zmalloc<consumer_output>() ../../../src/common/macros.hpp:89
    #3 0x55707dee90df in consumer_create_output(consumer_dst_type) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/consumer.cpp:523
    #4 0x55707ddc8821 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:321
    #5 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #6 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #7 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bf985f in alloc_split_items_count /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash.c:688
    #2 0x7efed0bf985f in _cds_lfht_new /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash.c:1642

Indirect leak of 656 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfac68 in __default_alloc_cds_lfht ../src/rculfhash-internal.h:172
    #2 0x7efed0bfac68 in alloc_cds_lfht /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:81

Indirect leak of 48 byte(s) in 2 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfabd4 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:35
    #2 0x7efed0bfabd4 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:28

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707de3a9af in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55707de3a9bf in lttng_ht* zmalloc<lttng_ht>() ../../src/common/macros.hpp:89
    #3 0x55707de38461 in lttng_ht_new(unsigned long, lttng_ht_type) hashtable/hashtable.cpp:113
    #4 0x55707dee9340 in consumer_create_output(consumer_dst_type) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/consumer.cpp:535
    #5 0x55707ddc8821 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:321
    #6 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #7 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #8 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfac15 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:31

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ib2ad82a197f2a4ccb86ae5799c1d93ff059888e3
jgalar added a commit that referenced this pull request Apr 25, 2022
==1198508==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f8b62634fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557871869adb in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55787186c8a0 in zmalloc<(anonymous namespace)::lttng_rate_policy_once_after_n> ../../src/common/macros.hpp:89
    #3 0x55787186c173 in lttng_rate_policy_once_after_n_create actions/rate-policy.cpp:707
    #4 0x55787186a368 in lttng_rate_policy_once_after_n_create_from_payload actions/rate-policy.cpp:183
    #5 0x55787186ad02 in lttng_rate_policy_create_from_payload(lttng_payload_view*, lttng_rate_policy**) actions/rate-policy.cpp:287
    #6 0x557871865b5b in test_rate_policy_once_after_n /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:231
    #7 0x557871865dc9 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:250
    #8 0x7f8b61c7130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f8b62634fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557871869adb in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55787186c890 in zmalloc<(anonymous namespace)::lttng_rate_policy_every_n> ../../src/common/macros.hpp:89
    #3 0x55787186b6cd in lttng_rate_policy_every_n_create actions/rate-policy.cpp:492
    #4 0x55787186a699 in lttng_rate_policy_every_n_create_from_payload actions/rate-policy.cpp:220
    #5 0x55787186ad02 in lttng_rate_policy_create_from_payload(lttng_payload_view*, lttng_rate_policy**) actions/rate-policy.cpp:287
    #6 0x557871864cae in test_rate_policy_every_n /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:122
    #7 0x557871865dc4 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:249
    #8 0x7f8b61c7130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

SUMMARY: AddressSanitizer: 112 byte(s) leaked in 2 allocation(s).

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I3a9b4d99e93f355ddb8623a289f8397907486ab0
jgalar added a commit that referenced this pull request Apr 25, 2022
==1429021==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fe305f031b2 in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x559f1b022238 in lttng_dynamic_buffer_set_capacity(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:159
    #2 0x559f1b021d9f in lttng_dynamic_buffer_append(lttng_dynamic_buffer*, void const*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:52
    #3 0x559f1b02144a in lttng_dynamic_array_add_element(lttng_dynamic_array*, void const*) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-array.cpp:58
    #4 0x559f1b07d07b in lttng_action_path_copy(lttng_action_path const*, lttng_action_path*) actions/path.cpp:116
    #5 0x559f1b02383f in lttng_error_query_action_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/error-query.cpp:232
    #6 0x559f1b02760e in lttng_error_query_create_from_payload(lttng_payload_view*, lttng_error_query**) /home/jgalar/EfficiOS/src/lttng-tools/src/common/error-query.cpp:911
    #7 0x559f1af5c361 in receive_lttng_error_query /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:740
    #8 0x559f1af64eba in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2336
    #9 0x559f1af67378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #10 0x559f1af50642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #11 0x7fe3055225c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I7a6f7d2a9746124581eebf30877466f16db67a6b
jgalar added a commit that referenced this pull request Apr 25, 2022
==1480456==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7fdb9260cfb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7fdb9242348d in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x7fdb924295a9 in lttng_trigger* zmalloc<lttng_trigger>() ../../src/common/macros.hpp:89
    #3 0x7fdb92423dbe in lttng_trigger_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/trigger.cpp:58
    #4 0x56304832331f in register_trigger /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:24
    #5 0x5630483233f1 in register_trigger_action_list_notify /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:46
    #6 0x5630483239a0 in test_session_rotation_conditions /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:246
    #7 0x563048323d4d in main /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:309
    #8 0x7fdb91c6630f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ie163989a70f65f9c2c4e93c36cc9fc6ba6bdeeb5
jgalar added a commit that referenced this pull request Apr 25, 2022
==1501334==ERROR: LeakSanitizer: detected memory leaks

Indirect leak of 16386 byte(s) in 1 object(s) allocated from:
    #0 0x7f95efc3cdd9 in __interceptor_malloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x55acb0681ed3 in lttng_filter_yyalloc(unsigned long, void*) filter/filter-lexer.cpp:2511
    #2 0x55acb067f2f2 in lttng_filter_yy_create_buffer(_IO_FILE*, int, void*) filter/filter-lexer.cpp:1895
    #3 0x55acb067ea44 in yyrestart(_IO_FILE*, void*) filter/filter-lexer.cpp:1824
    #4 0x55acb0649a43 in filter_parser_ctx_alloc(_IO_FILE*) filter/filter-parser.ypp:271
    #5 0x55acb0649e7f in filter_parser_ctx_create_from_filter_expression(char const*, filter_parser_ctx**) filter/filter-parser.ypp:332
    #6 0x55acb058ee89 in parse_event_rule commands/add_trigger.cpp:783
    #7 0x55acb05920c0 in handle_condition_event commands/add_trigger.cpp:1361
    #8 0x55acb0592739 in parse_condition commands/add_trigger.cpp:1457
    #9 0x55acb0596b56 in cmd_add_trigger(int, char const**) commands/add_trigger.cpp:2304
    #10 0x55acb05a5b80 in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:238
    #11 0x55acb05a6643 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
    #12 0x55acb05a694a in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:476
    #13 0x7f95ef28730f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I6fa21e7d066e0cf48afc3f91ceefbfd19c6b86fd
jgalar added a commit that referenced this pull request Apr 25, 2022
==1769573==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7fef37a29fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7fef37792f2f in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x7fef3779573a in lttng_rotation_schedules* zmalloc<lttng_rotation_schedules>() ../../../src/common/macros.hpp:89
    #3 0x7fef377947cc in lttng_rotation_schedules_create /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:353
    #4 0x7fef37794aa0 in get_schedules /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:392
    #5 0x7fef377956dc in lttng_session_list_rotation_schedules /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:665
    #6 0x5646131713f2 in test_add_list_remove_schedule /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:252
    #7 0x56461317157b in test_add_list_remove_size_schedule /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:270
    #8 0x564613171680 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:307
    #9 0x7fef373ae30f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I9b7eb537d158791db76f9a7676ffeb5d4a1f2203
jgalar added a commit that referenced this pull request Apr 25, 2022
==1801304==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 224 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb64175 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb6a291 in lttng_trigger* zmalloc<lttng_trigger>() ../../src/common/macros.hpp:89
    #3 0x559fbeb64aa6 in lttng_trigger_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/trigger.cpp:58
    #4 0x559fbe9dc417 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:87
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 208 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb16e21 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb16e31 in lttng_action_notify* zmalloc<lttng_action_notify>() ../../src/common/macros.hpp:89
    #3 0x559fbeb168a0 in lttng_action_notify_create actions/notify.cpp:135
    #4 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 160 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb3d7a1 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb3fa35 in lttng_condition_session_consumed_size* zmalloc<lttng_condition_session_consumed_size>() ../../src/common/macros.hpp:89
    #3 0x559fbeb3e6fd in lttng_condition_session_consumed_size_create conditions/session-consumed-size.cpp:206
    #4 0x559fbe9dc0f1 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:54
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 112 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb242ad in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb27062 in zmalloc<(anonymous namespace)::lttng_rate_policy_every_n> ../../src/common/macros.hpp:89
    #3 0x559fbeb25e9f in lttng_rate_policy_every_n_create actions/rate-policy.cpp:492
    #4 0x559fbeb168b9 in lttng_action_notify_create actions/notify.cpp:141
    #5 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80
    #6 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #7 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #8 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #9 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #10 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 34 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e19319 in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:454
    #1 0x559fbeb3f603 in lttng_condition_session_consumed_size_set_session_name conditions/session-consumed-size.cpp:442
    #2 0x559fbe9dc2c4 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:71
    #3 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #4 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #5 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #6 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #7 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

The rotation trigger of a session (used for size-based rotations) is
never cleaned up. It is now cleaned up every time its condition is
hit and whenever the session is destroyed.
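
The ownership rule described above can be sketched as follows. This is only an illustration, not the lttng-tools code: `rotation_trigger`, `session`, `live_triggers` and both method names are hypothetical, and `std::unique_ptr` stands in for whatever release call the actual fix uses. It also assumes that the trigger is re-armed with a higher threshold once its condition is hit.

```cpp
#include <cassert>
#include <memory>

// Counts live triggers so the sketch can check that none is leaked.
static int live_triggers = 0;

// Hypothetical stand-in for a size-based rotation trigger.
struct rotation_trigger {
	unsigned long threshold_bytes;

	explicit rotation_trigger(unsigned long threshold) : threshold_bytes(threshold)
	{
		live_triggers++;
	}

	~rotation_trigger()
	{
		live_triggers--;
	}
};

// Hypothetical session that owns at most one rotation trigger.
struct session {
	std::unique_ptr<rotation_trigger> rotate_trigger;
	unsigned long rotate_size = 0;

	void subscribe_consumed_size_rotation(unsigned long threshold)
	{
		// Assigning a new trigger releases the previous one, if any.
		rotate_trigger = std::make_unique<rotation_trigger>(threshold);
	}

	void on_condition_hit(unsigned long consumed_bytes)
	{
		// Clean up the trigger that just fired, then re-arm (assumed behaviour).
		rotate_trigger.reset();
		subscribe_consumed_size_rotation(consumed_bytes + rotate_size);
	}

	// The implicit destructor releases the trigger when the session is destroyed.
};
```

With this layout, the trigger is released at both points named in the commit message without an explicit call on each path.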

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I5a89341535f87b7851b548ded9838c18bd1ccb95
jgalar added a commit that referenced this pull request Jun 15, 2022
LeakSanitizer reports the following leak:

==974957==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7fdb86fcd1b2 in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x7fdb86d7c296 in lttng_dynamic_buffer_set_capacity(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:159
    #2 0x7fdb86d7c060 in lttng_dynamic_buffer_set_size(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:112
    #3 0x7fdb86d2589a in recv_payload_sessiond /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:230
    #4 0x7fdb86d26fa5 in lttng_ctl_ask_sessiond_payload(lttng_payload_view*, lttng_payload*) /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:662
    #5 0x7fdb86d2cd8d in lttng_list_tracepoint_fields /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:1767
    #6 0x56481623cb4c in list_ust_event_fields commands/list.cpp:850
    #7 0x5648162448d9 in cmd_list(int, char const**) commands/list.cpp:2394
    #8 0x56481628fb3e in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:238
    #9 0x564816290601 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
    #10 0x564816290908 in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:476
    #11 0x7fdb8661730f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

SUMMARY: AddressSanitizer: 32 byte(s) leaked in 1 allocation(s).

The session daemon's reply is indeed never released in
lttng_list_tracepoint_fields.
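
A minimal sketch of the pattern behind this kind of fix, under stated assumptions: `reply_buffer`, `reply_buffer_set_size`, `reply_buffer_reset`, `list_fields` and `live_allocations` are hypothetical names, not the lttng-ctl API. The point is only that the reply is released at a single exit label reached by both the success and the error path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Counts outstanding buffers so the sketch can check for leaks.
static int live_allocations = 0;

// Hypothetical stand-in for a dynamically sized reply buffer.
struct reply_buffer {
	char *data = nullptr;
	std::size_t size = 0;
};

// Grows the buffer to new_size bytes (new_size must be non-zero).
static int reply_buffer_set_size(reply_buffer &buf, std::size_t new_size)
{
	char *new_data = static_cast<char *>(std::realloc(buf.data, new_size));

	if (!new_data) {
		return -1;
	}

	if (!buf.data) {
		live_allocations++;
	}

	buf.data = new_data;
	buf.size = new_size;
	return 0;
}

// Releases the buffer; safe to call on an empty buffer.
static void reply_buffer_reset(reply_buffer &buf)
{
	if (buf.data) {
		std::free(buf.data);
		live_allocations--;
	}

	buf.data = nullptr;
	buf.size = 0;
}

// Hypothetical listing call: derives a count from the reply (assuming
// 8-byte entries), then releases the reply on every exit path.
static int list_fields(std::size_t reply_size, bool fail_midway, std::size_t *out_count)
{
	int ret;
	reply_buffer reply;

	ret = reply_buffer_set_size(reply, reply_size);
	if (ret) {
		goto end;
	}

	if (fail_midway) {
		ret = -1;
		goto end;
	}

	*out_count = reply.size / 8;
	ret = 0;
end:
	reply_buffer_reset(reply);
	return ret;
}
```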

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Idd244b52a69f3b74e5c131c1c36c6ee6d76f4285
jgalar added a commit that referenced this pull request Jun 15, 2022
==1175545==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8696 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707ddc6004 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55707ddceb17 in ltt_ust_session* zmalloc<ltt_ust_session>() ../../../src/common/macros.hpp:89
    #3 0x55707ddc81e7 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:274
    #4 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #5 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #6 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 24672 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707dee4ec1 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55707def774e in consumer_output* zmalloc<consumer_output>() ../../../src/common/macros.hpp:89
    #3 0x55707dee90df in consumer_create_output(consumer_dst_type) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/consumer.cpp:523
    #4 0x55707ddc8821 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:321
    #5 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #6 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #7 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bf985f in alloc_split_items_count /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash.c:688
    #2 0x7efed0bf985f in _cds_lfht_new /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash.c:1642

Indirect leak of 656 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfac68 in __default_alloc_cds_lfht ../src/rculfhash-internal.h:172
    #2 0x7efed0bfac68 in alloc_cds_lfht /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:81

Indirect leak of 48 byte(s) in 2 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfabd4 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:35
    #2 0x7efed0bfabd4 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:28

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707de3a9af in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55707de3a9bf in lttng_ht* zmalloc<lttng_ht>() ../../src/common/macros.hpp:89
    #3 0x55707de38461 in lttng_ht_new(unsigned long, lttng_ht_type) hashtable/hashtable.cpp:113
    #4 0x55707dee9340 in consumer_create_output(consumer_dst_type) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/consumer.cpp:535
    #5 0x55707ddc8821 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:321
    #6 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #7 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #8 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfac15 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:31

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ib2ad82a197f2a4ccb86ae5799c1d93ff059888e3
jgalar added a commit that referenced this pull request Jun 15, 2022
==1198508==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f8b62634fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557871869adb in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55787186c8a0 in zmalloc<(anonymous namespace)::lttng_rate_policy_once_after_n> ../../src/common/macros.hpp:89
    #3 0x55787186c173 in lttng_rate_policy_once_after_n_create actions/rate-policy.cpp:707
    #4 0x55787186a368 in lttng_rate_policy_once_after_n_create_from_payload actions/rate-policy.cpp:183
    #5 0x55787186ad02 in lttng_rate_policy_create_from_payload(lttng_payload_view*, lttng_rate_policy**) actions/rate-policy.cpp:287
    #6 0x557871865b5b in test_rate_policy_once_after_n /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:231
    #7 0x557871865dc9 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:250
    #8 0x7f8b61c7130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f8b62634fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557871869adb in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55787186c890 in zmalloc<(anonymous namespace)::lttng_rate_policy_every_n> ../../src/common/macros.hpp:89
    #3 0x55787186b6cd in lttng_rate_policy_every_n_create actions/rate-policy.cpp:492
    #4 0x55787186a699 in lttng_rate_policy_every_n_create_from_payload actions/rate-policy.cpp:220
    #5 0x55787186ad02 in lttng_rate_policy_create_from_payload(lttng_payload_view*, lttng_rate_policy**) actions/rate-policy.cpp:287
    #6 0x557871864cae in test_rate_policy_every_n /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:122
    #7 0x557871865dc4 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:249
    #8 0x7f8b61c7130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

SUMMARY: AddressSanitizer: 112 byte(s) leaked in 2 allocation(s).

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I3a9b4d99e93f355ddb8623a289f8397907486ab0
jgalar added a commit that referenced this pull request Jun 15, 2022
==1429021==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fe305f031b2 in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x559f1b022238 in lttng_dynamic_buffer_set_capacity(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:159
    #2 0x559f1b021d9f in lttng_dynamic_buffer_append(lttng_dynamic_buffer*, void const*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:52
    #3 0x559f1b02144a in lttng_dynamic_array_add_element(lttng_dynamic_array*, void const*) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-array.cpp:58
    #4 0x559f1b07d07b in lttng_action_path_copy(lttng_action_path const*, lttng_action_path*) actions/path.cpp:116
    #5 0x559f1b02383f in lttng_error_query_action_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/error-query.cpp:232
    #6 0x559f1b02760e in lttng_error_query_create_from_payload(lttng_payload_view*, lttng_error_query**) /home/jgalar/EfficiOS/src/lttng-tools/src/common/error-query.cpp:911
    #7 0x559f1af5c361 in receive_lttng_error_query /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:740
    #8 0x559f1af64eba in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2336
    #9 0x559f1af67378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #10 0x559f1af50642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #11 0x7fe3055225c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I7a6f7d2a9746124581eebf30877466f16db67a6b
jgalar added a commit that referenced this pull request Jun 15, 2022
==1480456==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7fdb9260cfb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7fdb9242348d in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x7fdb924295a9 in lttng_trigger* zmalloc<lttng_trigger>() ../../src/common/macros.hpp:89
    #3 0x7fdb92423dbe in lttng_trigger_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/trigger.cpp:58
    #4 0x56304832331f in register_trigger /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:24
    #5 0x5630483233f1 in register_trigger_action_list_notify /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:46
    #6 0x5630483239a0 in test_session_rotation_conditions /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:246
    #7 0x563048323d4d in main /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:309
    #8 0x7fdb91c6630f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ie163989a70f65f9c2c4e93c36cc9fc6ba6bdeeb5
jgalar added a commit that referenced this pull request Jun 15, 2022
==1501334==ERROR: LeakSanitizer: detected memory leaks

Indirect leak of 16386 byte(s) in 1 object(s) allocated from:
    #0 0x7f95efc3cdd9 in __interceptor_malloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x55acb0681ed3 in lttng_filter_yyalloc(unsigned long, void*) filter/filter-lexer.cpp:2511
    #2 0x55acb067f2f2 in lttng_filter_yy_create_buffer(_IO_FILE*, int, void*) filter/filter-lexer.cpp:1895
    #3 0x55acb067ea44 in yyrestart(_IO_FILE*, void*) filter/filter-lexer.cpp:1824
    #4 0x55acb0649a43 in filter_parser_ctx_alloc(_IO_FILE*) filter/filter-parser.ypp:271
    #5 0x55acb0649e7f in filter_parser_ctx_create_from_filter_expression(char const*, filter_parser_ctx**) filter/filter-parser.ypp:332
    #6 0x55acb058ee89 in parse_event_rule commands/add_trigger.cpp:783
    #7 0x55acb05920c0 in handle_condition_event commands/add_trigger.cpp:1361
    #8 0x55acb0592739 in parse_condition commands/add_trigger.cpp:1457
    #9 0x55acb0596b56 in cmd_add_trigger(int, char const**) commands/add_trigger.cpp:2304
    #10 0x55acb05a5b80 in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:238
    #11 0x55acb05a6643 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
    #12 0x55acb05a694a in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:476
    #13 0x7f95ef28730f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I6fa21e7d066e0cf48afc3f91ceefbfd19c6b86fd
jgalar added a commit that referenced this pull request Jun 15, 2022
==1769573==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7fef37a29fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7fef37792f2f in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x7fef3779573a in lttng_rotation_schedules* zmalloc<lttng_rotation_schedules>() ../../../src/common/macros.hpp:89
    #3 0x7fef377947cc in lttng_rotation_schedules_create /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:353
    #4 0x7fef37794aa0 in get_schedules /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:392
    #5 0x7fef377956dc in lttng_session_list_rotation_schedules /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:665
    #6 0x5646131713f2 in test_add_list_remove_schedule /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:252
    #7 0x56461317157b in test_add_list_remove_size_schedule /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:270
    #8 0x564613171680 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:307
    #9 0x7fef373ae30f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I9b7eb537d158791db76f9a7676ffeb5d4a1f2203
jgalar added a commit that referenced this pull request Jun 15, 2022
==1801304==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 224 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb64175 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb6a291 in lttng_trigger* zmalloc<lttng_trigger>() ../../src/common/macros.hpp:89
    #3 0x559fbeb64aa6 in lttng_trigger_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/trigger.cpp:58
    #4 0x559fbe9dc417 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:87
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 208 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb16e21 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb16e31 in lttng_action_notify* zmalloc<lttng_action_notify>() ../../src/common/macros.hpp:89
    #3 0x559fbeb168a0 in lttng_action_notify_create actions/notify.cpp:135
    #4 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 160 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb3d7a1 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb3fa35 in lttng_condition_session_consumed_size* zmalloc<lttng_condition_session_consumed_size>() ../../src/common/macros.hpp:89
    #3 0x559fbeb3e6fd in lttng_condition_session_consumed_size_create conditions/session-consumed-size.cpp:206
    #4 0x559fbe9dc0f1 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:54
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 112 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb242ad in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb27062 in zmalloc<(anonymous namespace)::lttng_rate_policy_every_n> ../../src/common/macros.hpp:89
    #3 0x559fbeb25e9f in lttng_rate_policy_every_n_create actions/rate-policy.cpp:492
    #4 0x559fbeb168b9 in lttng_action_notify_create actions/notify.cpp:141
    #5 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80
    #6 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #7 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #8 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #9 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #10 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 34 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e19319 in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:454
    #1 0x559fbeb3f603 in lttng_condition_session_consumed_size_set_session_name conditions/session-consumed-size.cpp:442
    #2 0x559fbe9dc2c4 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:71
    #3 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #4 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #5 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #6 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #7 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

The rotation trigger of a session (used for size-based rotations) is
never cleaned up. It is now cleaned up every time its condition is
hit and whenever the session is destroyed.

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I5a89341535f87b7851b548ded9838c18bd1ccb95
jgalar added a commit that referenced this pull request Jun 22, 2022
==1175545==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8696 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707ddc6004 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55707ddceb17 in ltt_ust_session* zmalloc<ltt_ust_session>() ../../../src/common/macros.hpp:89
    #3 0x55707ddc81e7 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:274
    #4 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #5 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #6 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 24672 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707dee4ec1 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55707def774e in consumer_output* zmalloc<consumer_output>() ../../../src/common/macros.hpp:89
    #3 0x55707dee90df in consumer_create_output(consumer_dst_type) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/consumer.cpp:523
    #4 0x55707ddc8821 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:321
    #5 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #6 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #7 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bf985f in alloc_split_items_count /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash.c:688
    #2 0x7efed0bf985f in _cds_lfht_new /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash.c:1642

Indirect leak of 656 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfac68 in __default_alloc_cds_lfht ../src/rculfhash-internal.h:172
    #2 0x7efed0bfac68 in alloc_cds_lfht /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:81

Indirect leak of 48 byte(s) in 2 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfabd4 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:35
    #2 0x7efed0bfabd4 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:28

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707de3a9af in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55707de3a9bf in lttng_ht* zmalloc<lttng_ht>() ../../src/common/macros.hpp:89
    #3 0x55707de38461 in lttng_ht_new(unsigned long, lttng_ht_type) hashtable/hashtable.cpp:113
    #4 0x55707dee9340 in consumer_create_output(consumer_dst_type) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/consumer.cpp:535
    #5 0x55707ddc8821 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:321
    #6 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #7 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #8 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfac15 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:31

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: Ib2ad82a197f2a4ccb86ae5799c1d93ff059888e3
jgalar added a commit that referenced this pull request Jun 22, 2022
==1769573==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7fef37a29fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7fef37792f2f in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x7fef3779573a in lttng_rotation_schedules* zmalloc<lttng_rotation_schedules>() ../../../src/common/macros.hpp:89
    #3 0x7fef377947cc in lttng_rotation_schedules_create /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:353
    #4 0x7fef37794aa0 in get_schedules /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:392
    #5 0x7fef377956dc in lttng_session_list_rotation_schedules /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:665
    #6 0x5646131713f2 in test_add_list_remove_schedule /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:252
    #7 0x56461317157b in test_add_list_remove_size_schedule /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:270
    #8 0x564613171680 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:307
    #9 0x7fef373ae30f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I9b7eb537d158791db76f9a7676ffeb5d4a1f2203
jgalar added a commit that referenced this pull request Jun 22, 2022
==1801304==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 224 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb64175 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb6a291 in lttng_trigger* zmalloc<lttng_trigger>() ../../src/common/macros.hpp:89
    #3 0x559fbeb64aa6 in lttng_trigger_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/trigger.cpp:58
    #4 0x559fbe9dc417 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:87
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 208 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb16e21 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb16e31 in lttng_action_notify* zmalloc<lttng_action_notify>() ../../src/common/macros.hpp:89
    #3 0x559fbeb168a0 in lttng_action_notify_create actions/notify.cpp:135
    #4 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 160 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb3d7a1 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb3fa35 in lttng_condition_session_consumed_size* zmalloc<lttng_condition_session_consumed_size>() ../../src/common/macros.hpp:89
    #3 0x559fbeb3e6fd in lttng_condition_session_consumed_size_create conditions/session-consumed-size.cpp:206
    #4 0x559fbe9dc0f1 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:54
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 112 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb242ad in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb27062 in zmalloc<(anonymous namespace)::lttng_rate_policy_every_n> ../../src/common/macros.hpp:89
    #3 0x559fbeb25e9f in lttng_rate_policy_every_n_create actions/rate-policy.cpp:492
    #4 0x559fbeb168b9 in lttng_action_notify_create actions/notify.cpp:141
    #5 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80
    #6 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #7 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #8 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #9 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #10 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 34 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e19319 in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:454
    #1 0x559fbeb3f603 in lttng_condition_session_consumed_size_set_session_name conditions/session-consumed-size.cpp:442
    #2 0x559fbe9dc2c4 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:71
    #3 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #4 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #5 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #6 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #7 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

The rotation trigger of a session (used for size-based rotations) is
never cleaned up. It is now cleaned up every time its condition is
hit and whenever the session is destroyed.

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I5a89341535f87b7851b548ded9838c18bd1ccb95
jgalar pushed a commit that referenced this pull request Oct 5, 2022
Observed issue
==============

While a snapshot is being taken, the containing folder can disappear
unexpectedly. This can lead to the following errors, which are expected
and mostly handled fine:

PERROR - 14:47:32.002564464 [2922498/2922507]: Failed to open file relative to trace chunk file_path = "channel0_0", flags = 577, mode = 432: No such file or directory (in _lttng_trace_chunk_open_fs_handle_locked() at trace-chunk.cpp:1411)
Error: Failed to open stream file "channel0_0"
Error: Snapshot channel failed

The problem happens on the subsequent snapshot for the session:

 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x00007fbbdadb3859 in __GI_abort () at abort.c:79
 #2  0x00007fbbdadb3729 in __assert_fail_base (fmt=0x7fbbdaf49588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55c4212cfbb5 "!stream->trace_chunk", file=0x55c4212cf820 "kernel-co
 #3  0x00007fbbdadc5006 in __GI___assert_fail (assertion=0x55c4212cfbb5 "!stream->trace_chunk", file=0x55c4212cf820 "kernel-consumer/kernel-consumer.cpp", line=188, function=0x55c4212cfb00 "
 #4  0x000055c421268cc6 in lttng_kconsumer_snapshot_channel (channel=0x7fbbc4000b60, key=1, path=0x7fbbd37f8fd4 "", relayd_id=18446744073709551615, nb_packets_per_stream=0) at kernel-consume
 #5  0x000055c42126b39d in lttng_kconsumer_recv_cmd (ctx=0x55c421b80a90, sock=31, consumer_sockpoll=0x7fbbd37fd280) at kernel-consumer/kernel-consumer.cpp:986
 #6  0x000055c4212546d1 in lttng_consumer_recv_cmd (ctx=0x55c421b80a90, sock=31, consumer_sockpoll=0x7fbbd37fd280) at consumer/consumer.cpp:2090
 #7  0x000055c421259963 in consumer_thread_sessiond_poll (data=0x55c421b80a90) at consumer/consumer.cpp:3281
 #8  0x00007fbbdaf8b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #9  0x00007fbbdaeb0163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

How to reproduce:

 1. Set a breakpoint on snapshot_channel() inside
    src/common/ust-consumer/ust-consumer.cpp
 2. When the breakpoint hits, remove the complete lttng directory
    containing the session data.
 3. Continue the lttng_consumerd process from gdb.
 4. In that case you see a negative return value -1 from
    consumer_stream_create_output_files() inside snapshot_channel().
 5. Take another snapshot and lttng_consumerd crashes because
    of the `assert(!stream->trace_chunk)` in snapshot_channel().

    This last action does not require any breakpoint intervention.

Cause
=====

During the snapshot, the stream is assigned the channel current chunk.
It is expected that the stream does not have a chunk at this point.

The error handling is faulty here: the stream chunk must be
invalidated/reset on error to allow its reuse later on.

The problem exists for both consumer domains (user/kernel).

Solution
========

For the ust consumer, we can directly use the `error_close_stream`
label.

For the kernel consumer, the code path is slightly different since it
does not use `consumer_stream_close`. Note that `consumer_stream_close`
cannot be used as is for the kernel consumer. The current implementation
partially resembles `consumer_stream_close` at the end of the iteration.
It is extracted to its own function for easier reuse from the new
`error_finalize_stream` label.
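
The idea behind the fix can be sketched as follows. This is a minimal
illustration, not the lttng-tools code: `Chunk`, `Stream`,
`snapshot_stream` and the `open_output_ok` flag are hypothetical
stand-ins, and only the label name is borrowed from the text above.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the consumer's trace chunk and stream; these
// are not the lttng-tools types.
struct Chunk {};

struct Stream {
    std::shared_ptr<Chunk> trace_chunk; // expected to be empty between snapshots
};

// Simulated snapshot of one stream. `open_output_ok` models whether the
// output files could be created (it fails when the directory was removed).
// Returns 0 on success and -1 on error.
int snapshot_stream(Stream &stream, const std::shared_ptr<Chunk> &channel_chunk,
        bool open_output_ok)
{
    int ret = 0;

    // Precondition described in the commit message: no chunk is assigned yet.
    assert(!stream.trace_chunk);
    stream.trace_chunk = channel_chunk;

    if (!open_output_ok) {
        ret = -1;
        goto error_finalize_stream;
    }

    // ... the snapshot data would be written here ...

error_finalize_stream:
    // Success and error paths both release the chunk, so the precondition
    // holds again when the next snapshot starts.
    stream.trace_chunk.reset();
    return ret;
}
```

In this sketch, a failed call followed by a second call no longer trips
the assertion, because the error path leaves the stream without a chunk.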

Known drawbacks
===============

None.

Fixes: #1352

Signed-off-by: Marcel Hamer <[email protected]>
Signed-off-by: Jonathan Rajotte <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I9fc81917b19aa436ed8e8679672648f2d5baf41a
jgalar pushed a commit that referenced this pull request Nov 14, 2022
When converting msgpack objects to their event_field_value equivalent,
the following assertion fails: LTTNG_ASSERT(val);

 #4  0x00007f1f65349486 in __assert_fail () from /usr/lib/libc.so.6
 #5  0x00007f1f65584da7 in lttng_event_field_value_string_create_with_size (val=0x0, size=0) at event-field-value.cpp:186
 #6  0x00007f1f65576a1a in event_field_value_from_obj (obj=0x557f597ccdb8, field_val=0x7ffcc9675dd0)
     at conditions/event-rule-matches.cpp:1120
 #7  0x00007f1f65577176 in event_field_value_from_capture_payload (condition=0x557f597c8520,
     capture_payload=0x557f597c825b "\221\240", capture_payload_size=2) at conditions/event-rule-matches.cpp:1340
 #8  0x00007f1f655772ea in lttng_evaluation_event_rule_matches_create (condition=0x557f597c8520,
     capture_payload=0x557f597c825b "\221\240", capture_payload_size=2, decode_capture_payload=true)
     at conditions/event-rule-matches.cpp:1398
 #9  0x00007f1f655765fc in lttng_evaluation_event_rule_matches_create_from_payload (condition=0x557f597c8520,
     view=0x7ffcc9675ff0, _evaluation=0x7ffcc9676080) at conditions/event-rule-matches.cpp:990
 #10 0x00007f1f6557f273 in lttng_evaluation_create_from_payload (condition=0x557f597c8520, src_view=0x7ffcc9676100,
     evaluation=0x7ffcc9676080) at evaluation.cpp:120
 #11 0x00007f1f6559ba36 in lttng_notification_create_from_payload (src_view=0x7ffcc9676190, notification=0x7ffcc9676180)
     at notification.cpp:123
 #12 0x00007f1f65552577 in create_notification_from_current_message (channel=0x557f597c8ee0) at channel.cpp:124
 #13 0x00007f1f6555298c in lttng_notification_channel_get_next_notification (channel=0x557f597c8ee0, _notification=0x7ffcc9676280)
     at channel.cpp:292

The msgpack API represents strings p-style (a pointer and a size), while
the implementation of event_field_value relies on null-terminated
strings. When an empty string is captured by a tracer, it is decoded as
a msgpack_object with `str = {size = 0, ptr = 0x0}`.

lttng_event_field_value_string_create_with_size does not require a
null-terminated string since it also receives the length. Hence, this
fix causes lttng_event_field_value_string_create_with_size to accept
null strings when their length is zero. A copy of an empty string is
made to accommodate the null-termination convention used by the rest of
that API.
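
A minimal sketch of the behaviour described above, assuming a
hypothetical helper named `copy_sized_string` (it is not the lttng-tools
function): a null pointer is accepted only when the size is zero, and the
result is always a null-terminated copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copies a (pointer, size) string into a newly
// allocated, null-terminated buffer. A null `val` is valid only when
// `size` is 0, in which case an empty string is returned. The caller
// releases the result with std::free(). Returns nullptr if allocation fails.
char *copy_sized_string(const char *val, std::size_t size)
{
    // Relaxed precondition: storage is required only for non-empty strings.
    assert(val || size == 0);

    char *copy = static_cast<char *>(std::malloc(size + 1));
    if (!copy) {
        return nullptr;
    }

    if (size != 0) {
        std::memcpy(copy, val, size);
    }

    copy[size] = '\0';
    return copy;
}
```

Calling `copy_sized_string(nullptr, 0)` yields an empty string rather than
failing the assertion.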

Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I7c3a839dbbeeb95a1b3bf6ddc3205a2f6b4538e3
jgalar pushed a commit that referenced this pull request Jan 6, 2023
Observed issue
==============

While a snapshot is being taken, the containing folder can disappear
unexpectedly. This can lead to the following errors, which are expected
and mostly handled fine:

PERROR - 14:47:32.002564464 [2922498/2922507]: Failed to open file relative to trace chunk file_path = "channel0_0", flags = 577, mode = 432: No such file or directory (in _lttng_trace_chunk_open_fs_handle_locked() at trace-chunk.cpp:1411)
Error: Failed to open stream file "channel0_0"
Error: Snapshot channel failed

The problem happens on the subsequent snapshot for the session:

 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x00007fbbdadb3859 in __GI_abort () at abort.c:79
 #2  0x00007fbbdadb3729 in __assert_fail_base (fmt=0x7fbbdaf49588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55c4212cfbb5 "!stream->trace_chunk", file=0x55c4212cf820 "kernel-co
 #3  0x00007fbbdadc5006 in __GI___assert_fail (assertion=0x55c4212cfbb5 "!stream->trace_chunk", file=0x55c4212cf820 "kernel-consumer/kernel-consumer.cpp", line=188, function=0x55c4212cfb00 "
 #4  0x000055c421268cc6 in lttng_kconsumer_snapshot_channel (channel=0x7fbbc4000b60, key=1, path=0x7fbbd37f8fd4 "", relayd_id=18446744073709551615, nb_packets_per_stream=0) at kernel-consume
 #5  0x000055c42126b39d in lttng_kconsumer_recv_cmd (ctx=0x55c421b80a90, sock=31, consumer_sockpoll=0x7fbbd37fd280) at kernel-consumer/kernel-consumer.cpp:986
 #6  0x000055c4212546d1 in lttng_consumer_recv_cmd (ctx=0x55c421b80a90, sock=31, consumer_sockpoll=0x7fbbd37fd280) at consumer/consumer.cpp:2090
 #7  0x000055c421259963 in consumer_thread_sessiond_poll (data=0x55c421b80a90) at consumer/consumer.cpp:3281
 #8  0x00007fbbdaf8b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #9  0x00007fbbdaeb0163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

How to reproduce:

 1. Set a breakpoint on snapshot_channel() inside
    src/common/ust-consumer/ust-consumer.cpp
 2. When the breakpoint hits, remove the complete lttng directory
    containing the session data.
 3. Continue the lttng_consumerd process from gdb.
 4. In that case you see a negative return value -1 from
    consumer_stream_create_output_files() inside snapshot_channel().
 5. Take another snapshot and lttng_consumerd crashes because
    of the `assert(!stream->trace_chunk)` in snapshot_channel().

    This last action does not require any breakpoint intervention.

Cause
=====

During the snapshot, the stream is assigned the channel current chunk.
It is expected that the stream does not have a chunk at this point.

The error handling is faulty here: the stream chunk must be
invalidated/reset on error to allow its reuse later on.

The problem exists for both consumer domains (user/kernel).

Solution
========

For the ust consumer, we can directly use the `error_close_stream`
label.

For the kernel consumer, the code path is slightly different since it
does not use `consumer_stream_close`. Note that `consumer_stream_close`
cannot be used as is for the kernel consumer. The current implementation
partially resembles `consumer_stream_close` at the end of the iteration.
It is extracted to its own function for easier reuse from the new
`error_finalize_stream` label.

Known drawbacks
===============

None.

Fixes: #1352

Signed-off-by: Marcel Hamer <[email protected]>
Signed-off-by: Jonathan Rajotte <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I9fc81917b19aa436ed8e8679672648f2d5baf41a
jgalar pushed a commit that referenced this pull request Jan 6, 2023
When converting msgpack objects to their event_field_value equivalent,
the following assertion fails: LTTNG_ASSERT(val);

 #4  0x00007f1f65349486 in __assert_fail () from /usr/lib/libc.so.6
 #5  0x00007f1f65584da7 in lttng_event_field_value_string_create_with_size (val=0x0, size=0) at event-field-value.cpp:186
 #6  0x00007f1f65576a1a in event_field_value_from_obj (obj=0x557f597ccdb8, field_val=0x7ffcc9675dd0)
     at conditions/event-rule-matches.cpp:1120
 #7  0x00007f1f65577176 in event_field_value_from_capture_payload (condition=0x557f597c8520,
     capture_payload=0x557f597c825b "\221\240", capture_payload_size=2) at conditions/event-rule-matches.cpp:1340
 #8  0x00007f1f655772ea in lttng_evaluation_event_rule_matches_create (condition=0x557f597c8520,
     capture_payload=0x557f597c825b "\221\240", capture_payload_size=2, decode_capture_payload=true)
     at conditions/event-rule-matches.cpp:1398
 #9  0x00007f1f655765fc in lttng_evaluation_event_rule_matches_create_from_payload (condition=0x557f597c8520,
     view=0x7ffcc9675ff0, _evaluation=0x7ffcc9676080) at conditions/event-rule-matches.cpp:990
 #10 0x00007f1f6557f273 in lttng_evaluation_create_from_payload (condition=0x557f597c8520, src_view=0x7ffcc9676100,
     evaluation=0x7ffcc9676080) at evaluation.cpp:120
 #11 0x00007f1f6559ba36 in lttng_notification_create_from_payload (src_view=0x7ffcc9676190, notification=0x7ffcc9676180)
     at notification.cpp:123
 #12 0x00007f1f65552577 in create_notification_from_current_message (channel=0x557f597c8ee0) at channel.cpp:124
 #13 0x00007f1f6555298c in lttng_notification_channel_get_next_notification (channel=0x557f597c8ee0, _notification=0x7ffcc9676280)
     at channel.cpp:292

The msgpack API represents strings p-style (a pointer and a size), while
the implementation of event_field_value relies on null-terminated
strings. When an empty string is captured by a tracer, it is decoded as
a msgpack_object with `str = {size = 0, ptr = 0x0}`.

lttng_event_field_value_string_create_with_size does not require a
null-terminated string since it also receives the length. Hence, this
fix causes lttng_event_field_value_string_create_with_size to accept
null strings when their length is zero. A copy of an empty string is
made to accommodate the null-termination convention used by the rest of
that API.

Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I7c3a839dbbeeb95a1b3bf6ddc3205a2f6b4538e3
jgalar added a commit that referenced this pull request Jun 6, 2023
Issue observed
--------------

When using the CLI to list the configuration of a session that has an
event rule which makes use of multiple exclusions, the session daemon
crashes with the following stack trace:

  (gdb) bt
  #0  0x00007fa9ed401445 in ?? () from /usr/lib/libc.so.6
  #1  0x0000560cd5fc5199 in lttng_strnlen (str=0x615f6f6c6c6568 <error: Cannot access memory at address 0x615f6f6c6c6568>, max=256) at ../../src/common/compat/string.h:19
  #2  0x0000560cd5fc6b39 in lttng_event_serialize (event=0x7fa9cc01d8b0, exclusion_count=2, exclusion_list=0x7fa9cc011794, filter_expression=0x0, bytecode_len=0, bytecode=0x0, payload=0x7fa9d3ffda88) at event.c:767
  #3  0x0000560cd5f380b5 in list_lttng_ust_global_events (nb_events=<synthetic pointer>, reply_payload=0x7fa9d3ffda88, ust_global=<optimized out>, channel_name=<optimized out>) at cmd.c:472
  #4  cmd_list_events (domain=<optimized out>, session=<optimized out>, channel_name=<optimized out>, reply_payload=0x7fa9d3ffda88) at cmd.c:3860
  #5  0x0000560cd5f6d76a in process_client_msg (cmd_ctx=0x7fa9d3ffa710, sock=0x7fa9d3ffa5b0, sock_error=0x7fa9d3ffa5b4) at client.c:1890
  #6  0x0000560cd5f6f876 in thread_manage_clients (data=0x560cd7879490) at client.c:2629
  #7  0x0000560cd5f65a54 in launch_thread (data=0x560cd7879500) at thread.c:66
  #8  0x00007fa9ed32d44b in ?? () from /usr/lib/libc.so.6
  #9  0x00007fa9ed3b0e40 in ?? () from /usr/lib/libc.so.6

Cause
-----

lttng_event_serialize expects a `char **` list of exclusion names, as
provided by the other callsite in liblttng-ctl. However, the callsite in
list_lttng_ust_global_events passes a pointer to the exclusions as stored
in lttng_event_exclusion.

lttng_event_exclusion contains an array of fixed-length strings (with a
stride of 256 bytes) which isn't an expected layout for
lttng_event_serialize.

Solution
--------

A temporary array of pointers is constructed before invoking
lttng_event_serialize to construct a list of exclusions with the layout
that lttng_event_serialize expects.

The array itself is reused for all events, limiting the number of
allocations.
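
The layout mismatch and the fix can be illustrated with a short sketch.
All names here (`NAME_STRIDE`, `total_name_length`,
`build_pointer_list`) are hypothetical; only the 256-byte stride and the
`char **` expectation come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumed stride, taken from the "stride of 256 bytes" mentioned above.
constexpr std::size_t NAME_STRIDE = 256;

// Hypothetical consumer that expects an array of pointers to strings, as
// lttng_event_serialize is described to. Returns the summed name lengths.
std::size_t total_name_length(std::size_t count, const char *const *names)
{
    std::size_t total = 0;

    for (std::size_t i = 0; i < count; i++) {
        total += std::strlen(names[i]);
    }

    return total;
}

// Fills `pointers` with the address of each fixed-length name stored back
// to back starting at `names`. The vector is cleared, not reallocated, so
// a caller can reuse it for every event.
void build_pointer_list(const char *names, std::size_t count,
        std::vector<const char *> &pointers)
{
    pointers.clear();

    for (std::size_t i = 0; i < count; i++) {
        pointers.push_back(names + i * NAME_STRIDE);
    }
}
```

Passing the fixed-stride storage directly where `total_name_length`
expects pointers would make it interpret name bytes as addresses, which
matches the invalid `str` argument seen in the stack trace.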

Note
----

None.

Change-Id: I266a1cc9e9f18e0476177a0047b1d8f468110575
Signed-off-by: Jérémie Galarneau <[email protected]>
jgalar added a commit that referenced this pull request Jul 25, 2023
Issue observed
--------------

When running the session daemon under ASAN, the following report is
produced:

  Direct leak of 104 byte(s) in 1 object(s) allocated from:
      #0 0x7f93866e0cd1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
      #1 0x55c55a7c4963 in zmalloc_internal /home/simark/src/lttng-tools/src/common/macros.hpp:60
      #2 0x55c55a7c4973 in lttng_pipe* zmalloc<lttng_pipe>() /home/simark/src/lttng-tools/src/common/macros.hpp:88
      #3 0x55c55a7c26eb in _pipe_create /home/simark/src/lttng-tools/src/common/pipe.cpp:111
      #4 0x55c55a7c351d in lttng_pipe_open(int) /home/simark/src/lttng-tools/src/common/pipe.cpp:185
      #5 0x55c55a586dd6 in operator() /home/simark/src/lttng-tools/src/bin/lttng-sessiond/rotation-thread.cpp:403
      #6 0x55c55a58744a in lttng::sessiond::rotation_thread::rotation_thread(lttng::sessiond::rotation_thread_timer_queue&, notification_thread_handle&) /home/simark/src/lttng-tools/src/bin/lttng-sessiond/rotation-thread.cpp:402
      #7 0x55c55a46377f in std::unique_ptr<lttng::sessiond::rotation_thread, std::default_delete<lttng::sessiond::rotation_thread> > lttng::make_unique<lttng::sessiond::rotation_thread, lttng::sessiond::rotation_thread_timer_queue&, notification_thread_handle&>(lttng::sessiond::rotation_thread_timer_queue&, notification_thread_handle&) /home/simark/src/lttng-tools/src/common/make-unique.hpp:18
      #8 0x55c55a455024 in _main /home/simark/src/lttng-tools/src/bin/lttng-sessiond/main.cpp:1773
      #9 0x55c55a455c2e in main /home/simark/src/lttng-tools/src/bin/lttng-sessiond/main.cpp:1982
      #10 0x7f9385c1484f  (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)

Cause
-----

On destruction, the std::unique_ptr wrapper of
lttng_pipe (lttng_pipe::uptr) invokes `lttng_pipe_close` (which only
closes the file descriptors of the underlying pipe) rather than
`lttng_pipe_destroy` which closes the file descriptors _and_ frees the
memory allocated by lttng_pipe_open.

Currently, the rotation thread is the only user of this wrapper (through
its quit_pipe).

Solution
--------

The deleter of lttng_pipe::uptr is replaced to invoke lttng_pipe_destroy.
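
The difference between the two deleters can be sketched as follows.
`Pipe`, `pipe_open`, `pipe_close`, `pipe_destroy` and the `live_pipes`
counter are hypothetical stand-ins used for illustration, not the
lttng_pipe API.

```cpp
#include <cassert>
#include <memory>

// Counts hypothetical pipe objects that are still allocated, so the effect
// of the deleter can be observed.
static int live_pipes = 0;

struct Pipe {
    bool open = true;
};

Pipe *pipe_open()
{
    live_pipes++;
    return new Pipe;
}

// Closes the pipe but leaves the object itself allocated.
void pipe_close(Pipe *pipe)
{
    pipe->open = false;
}

// Closes the pipe and frees the object.
void pipe_destroy(Pipe *pipe)
{
    if (!pipe) {
        return;
    }

    pipe_close(pipe);
    delete pipe;
    live_pipes--;
}

// Deleter that releases the whole object. A deleter calling only
// pipe_close() would leave one allocation behind per unique_ptr.
struct PipeDestroyer {
    void operator()(Pipe *pipe) const
    {
        pipe_destroy(pipe);
    }
};

using PipeUptr = std::unique_ptr<Pipe, PipeDestroyer>;
```

With `PipeUptr`, the object is freed when the pointer goes out of scope,
so `live_pipes` returns to zero.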

Fixes #1380
Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I5715ac6131c5aa134cfd18d8b677f31aabed36f0
jgalar added a commit that referenced this pull request Jul 25, 2023
Issue observed
--------------

ASAN reports the following leak when running the
tests/regression/tools/context/test_ust.py test suite:

  Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f32e5ae0cd1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x5653e1092088 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x5653e10922b3 in char* calloc<char>(unsigned long) string-utils/../macros.hpp:113
    #3 0x5653e119d68f in get_context_type commands/add_context.cpp:1012
    #4 0x5653e119ddf5 in cmd_add_context(int, char const**) commands/add_context.cpp:1059
    #5 0x5653e11e12e7 in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:237
    #6 0x5653e11e2027 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
    #7 0x5653e11e24e1 in _main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:474
    #8 0x5653e11e25bd in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:485
    #9 0x7f32e3e3984f  (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)

  Direct leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x7f32e5ae0cd1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x5653e1092088 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x5653e10922b3 in char* calloc<char>(unsigned long) string-utils/../macros.hpp:113
    #3 0x5653e119d2ae in get_context_type commands/add_context.cpp:1003
    #4 0x5653e119ddf5 in cmd_add_context(int, char const**) commands/add_context.cpp:1059
    #5 0x5653e11e12e7 in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:237
    #6 0x5653e11e2027 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
    #7 0x5653e11e24e1 in _main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:474
    #8 0x5653e11e25bd in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:485
    #9 0x7f32e3e3984f  (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)

Cause
-----

The context and provider names are dynamically allocated by
get_context_type() and stored in ctx_type. However, destroy_ctx_type()
never frees those members when the structure is of type
CONTEXT_APP_CONTEXT.

Solution
--------

Free both names when an application context type is destroyed.
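
A simplified sketch of that cleanup, assuming a hypothetical
`ContextType` structure and helper names (these are not the definitions
from add_context.cpp):

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical, simplified context descriptor. Only the application
// context kind owns the two heap-allocated names.
enum class ContextKind { Builtin, App };

struct ContextType {
    ContextKind kind = ContextKind::Builtin;
    char *provider_name = nullptr;
    char *ctx_name = nullptr;
};

// Duplicates `str` into a malloc'ed buffer; returns nullptr on failure.
static char *duplicate_string(const char *str)
{
    const std::size_t size = std::strlen(str) + 1;
    char *copy = static_cast<char *>(std::malloc(size));

    if (copy) {
        std::memcpy(copy, str, size);
    }

    return copy;
}

ContextType *create_app_context(const char *provider, const char *name)
{
    auto *type = new ContextType;

    type->kind = ContextKind::App;
    type->provider_name = duplicate_string(provider);
    type->ctx_name = duplicate_string(name);
    return type;
}

void destroy_context(ContextType *type)
{
    if (!type) {
        return;
    }

    if (type->kind == ContextKind::App) {
        // Both names are owned by the structure and must be released here.
        std::free(type->provider_name);
        std::free(type->ctx_name);
    }

    delete type;
}
```

Without the two `std::free` calls, each destroyed application context
would leave two small allocations behind, which is the shape of the two
leaks reported above.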

Signed-off-by: Jérémie Galarneau <[email protected]>
Change-Id: I86dde1eed9f0cc63499c936cf373b094168035e2