@santigimeno

Continuous Profiling Implementation

This PR implements continuous profiling capabilities in N|Solid, allowing automatic collection of CPU profiles at configurable intervals.

Key Changes

  • Core Implementation:

    • Created a new ContinuousProfiler class to manage automatic profile collection
    • Hardened NSolidCPUProfiler with improved stability checks and shutdown handling
    • Integrated continuous profiling with the environment list management system
  • Configuration Options:

    • Added environment variables:
      • NSOLID_CONT_CPU_PROFILE - Enable/disable continuous CPU profiling
      • NSOLID_CONT_CPU_PROFILE_INTERVAL - Set interval between profiles (default: 30000 ms, i.e. 30 seconds)
    • Added corresponding configuration properties:
      • contCpuProfile
      • contCpuProfileInterval
  • gRPC Agent Support:

    • Added ExportContinuousProfile RPC to the service protocol
    • Enhanced AssetStream to support continuous profile data transmission
    • Implemented conflict resolution between manual and continuous profiling
    • Added proper cleanup in shutdown sequence
  • Testing:

    • Added configuration tests for both API and environment variables
    • Created gRPC continuous profiling tests to verify data transmission and format
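
As a rough illustration of the configuration options listed above, here is a minimal sketch of how the two environment variables could map onto the contCpuProfile and contCpuProfileInterval properties. The helper name resolveContCpuProfileConfig, the accepted truthy values, and the fallback rules are assumptions made for this example; they are not taken from the N|Solid source.

```typescript
// Hypothetical helper, not part of N|Solid. It only illustrates how the two
// settings described in this PR could be resolved from environment variables.
interface ContCpuProfileConfig {
  contCpuProfile: boolean;
  contCpuProfileInterval: number; // milliseconds
}

// 30 seconds, the default stated in this PR.
const DEFAULT_INTERVAL_MS = 30000;

function resolveContCpuProfileConfig(
  env: Record<string, string | undefined>,
): ContCpuProfileConfig {
  // Assumption: "1" or "true" (case-insensitive) enables the feature.
  const rawEnabled = (env["NSOLID_CONT_CPU_PROFILE"] ?? "").trim().toLowerCase();
  const contCpuProfile = rawEnabled === "1" || rawEnabled === "true";

  // Assumption: the interval is a positive integer number of milliseconds;
  // a missing or invalid value falls back to the default.
  const parsed = Number(env["NSOLID_CONT_CPU_PROFILE_INTERVAL"]);
  const contCpuProfileInterval =
    Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_INTERVAL_MS;

  return { contCpuProfile, contCpuProfileInterval };
}
```

In a Node.js process this could be called as resolveContCpuProfileConfig(process.env). The real parsing and validation rules live in the initializeConfig and updateConfig changes of this PR and may differ.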

Technical Details

  • Profile data is passed through an AsyncTSQueue, which replaces the previous manual message handling
  • Each profile carries start and end timestamps (start_ts, end_ts) for accurate timeline representation
  • The default profiling interval is 30000 ms (30 seconds)
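
To make the timestamp point concrete, the sketch below shows how consecutive profiles could be laid out on a timeline when each one records its start and end time. The ProfileWindow type and the profileWindows function are hypothetical, and the assumption that windows run back to back with no gap is made for illustration only; the PR does not state how the profiler schedules them.

```typescript
// Hypothetical shape: only the idea of a start and an end timestamp mirrors
// the start_ts and end_ts fields added in this PR.
interface ProfileWindow {
  startTs: number; // milliseconds since the epoch
  endTs: number;   // milliseconds since the epoch
}

// Builds `count` consecutive windows of `intervalMs` each.
// Assumption: windows are contiguous, so each one starts where the last ended.
function profileWindows(
  firstStartTs: number,
  intervalMs: number,
  count: number,
): ProfileWindow[] {
  if (!(intervalMs > 0)) {
    throw new RangeError("intervalMs must be positive");
  }
  const windows: ProfileWindow[] = [];
  for (let i = 0; i < count; i++) {
    const startTs = firstStartTs + i * intervalMs;
    windows.push({ startTs, endTs: startTs + intervalMs });
  }
  return windows;
}
```

With the default interval, profileWindows(0, 30000, 3) yields windows covering 0 to 30000, 30000 to 60000, and 60000 to 90000, which a consumer can place on a timeline without guessing when each profile was taken.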

@santigimeno santigimeno self-assigned this Mar 25, 2025
@santigimeno santigimeno changed the title from "Santi/continuous profiling" to "Continuous Profiling Implementation" Mar 25, 2025
@santigimeno santigimeno force-pushed the santi/continuous_profiling branch 3 times, most recently from e995322 to 4870740 March 27, 2025 11:27
@juanarbol juanarbol left a comment

LGTM

@santigimeno santigimeno force-pushed the santi/fix_flaky_otlp_grpc_metrics_test branch 2 times, most recently from c19d441 to f18e72e April 10, 2025 13:17
@santigimeno santigimeno changed the base branch from santi/fix_flaky_otlp_grpc_metrics_test to node-v22.x-nsolid-v5.x April 10, 2025 13:19
Replace manual message handling with AsyncTSQueue for profile data.
Remove initialize() method as initialization is now handled in constructor.
Simplify profile callback implementation.
Remove process_profiles() as AsyncTSQueue handles this automatically.
Update ZMQ and gRPC agents to remove initialize() calls.

Add is_running checks to all public methods.
Prevent operations on deleted profiler instances.
Add early return guards to TakeCpuProfile, StopProfiling, and StopProfilingSync.
Improve stability during shutdown sequence.

Add NSOLID_CONT_CPU_PROFILE environment variable support.
Add NSOLID_CONT_CPU_PROFILE_INTERVAL environment variable support.
Implement contCpuProfile and contCpuProfileInterval config options.
Set default interval to 30000ms (30 seconds).
Update initializeConfig and updateConfig functions.

Create and initialize ContinuousProfiler in EnvList.
Add configuration handling for continuous profiling options.
Update EnvList to enable/disable profiling based on config.
Add proper cleanup in shutdown sequence.
Expose ContinuousProfiler through GetContinuousProfiler() method.

Add test-nsolid-config-continuous-profiling.js for API configuration.
Add test-nsolid-config-continuous-profiling-env.js for environment variables.

Add ExportContinuousProfile RPC to nsolid_service.proto.
Add start_ts and end_ts fields to Asset message.

Add AssetStreamRpcType enum to differentiate between RPC types.
Modify AssetStream constructor to accept RPC type parameter.
Update Write method to support both ExportAsset and ExportContinuousProfile.
Add safety assertion for WritesDone method.

Add continuous profiler callback and queue.
Register hook with ContinuousProfiler during agent startup.
Implement got_continuous_profile method for handling profile data.
Add conflict resolution between manual and continuous profiling.
Update profile timestamp handling for better accuracy.
Add proper cleanup in shutdown sequence.

Add test-grpc-continuous-profile.mjs for testing continuous profiling.
Update gRPC agent server test fixture.
Test profile data transmission and format.
Verify continuous profiling configuration options.
@santigimeno santigimeno force-pushed the santi/continuous_profiling branch from 4870740 to e638bb7 April 10, 2025 13:20
santigimeno added a commit that referenced this pull request Apr 14, 2025
Replace manual message handling with AsyncTSQueue for profile data.
Remove initialize() method as initialization is now handled in
constructor.
Simplify profile callback implementation.
Remove process_profiles() as AsyncTSQueue handles this automatically.
Update ZMQ and gRPC agents to remove initialize() calls.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Apr 14, 2025
PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Apr 14, 2025
Add is_running checks to all public methods.
Prevent operations on deleted profiler instances.
Add early return guards to TakeCpuProfile, StopProfiling, and
StopProfilingSync.
Improve stability during shutdown sequence.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Apr 14, 2025
Add NSOLID_CONT_CPU_PROFILE environment variable support.
Add NSOLID_CONT_CPU_PROFILE_INTERVAL environment variable support.
Implement contCpuProfile and contCpuProfileInterval config options.
Set default interval to 30000ms (30 seconds).
Update initializeConfig and updateConfig functions.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Apr 14, 2025
Create and initialize ContinuousProfiler in EnvList.
Add configuration handling for continuous profiling options.
Update EnvList to enable/disable profiling based on config.
Add proper cleanup in shutdown sequence.
Expose ContinuousProfiler through GetContinuousProfiler() method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Apr 14, 2025
Add test-nsolid-config-continuous-profiling.js for API configuration.
Add test-nsolid-config-continuous-profiling-env.js for environment
variables.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Apr 14, 2025
Add ExportContinuousProfile RPC to nsolid_service.proto.
Add start_ts and end_ts fields to Asset message.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Apr 14, 2025
Add AssetStreamRpcType enum to differentiate between RPC types.
Modify AssetStream constructor to accept RPC type parameter.
Update Write method to support both ExportAsset and
ExportContinuousProfile.
Add safety assertion for WritesDone method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 12, 2025
Create and initialize ContinuousProfiler in EnvList.
Add configuration handling for continuous profiling options.
Update EnvList to enable/disable profiling based on config.
Add proper cleanup in shutdown sequence.
Expose ContinuousProfiler through GetContinuousProfiler() method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 12, 2025
Add test-nsolid-config-continuous-profiling.js for API configuration.
Add test-nsolid-config-continuous-profiling-env.js for environment
variables.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 12, 2025
Add ExportContinuousProfile RPC to nsolid_service.proto.
Add start_ts and end_ts fields to Asset message.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 12, 2025
Add AssetStreamRpcType enum to differentiate between RPC types.
Modify AssetStream constructor to accept RPC type parameter.
Update Write method to support both ExportAsset and
ExportContinuousProfile.
Add safety assertion for WritesDone method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 12, 2025
Add continuous profiler callback and queue.
Register hook with ContinuousProfiler during agent startup.
Implement got_continuous_profile method for handling profile data.
Add conflict resolution between manual and continuous profiling.
Update profile timestamp handling for better accuracy.
Add proper cleanup in shutdown sequence.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 12, 2025
Add test-grpc-continuous-profile.mjs for testing continuous profiling.
Update gRPC agent server test fixture.
Test profile data transmission and format.
Verify continuous profiling configuration options.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 15, 2025
PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 15, 2025
Add NSOLID_CONT_CPU_PROFILE environment variable support.
Add NSOLID_CONT_CPU_PROFILE_INTERVAL environment variable support.
Implement contCpuProfile and contCpuProfileInterval config options.
Set default interval to 30000ms (30 seconds).
Update initializeConfig and updateConfig functions.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 15, 2025
Create and initialize ContinuousProfiler in EnvList.
Add configuration handling for continuous profiling options.
Update EnvList to enable/disable profiling based on config.
Add proper cleanup in shutdown sequence.
Expose ContinuousProfiler through GetContinuousProfiler() method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 15, 2025
Add test-nsolid-config-continuous-profiling.js for API configuration.
Add test-nsolid-config-continuous-profiling-env.js for environment
variables.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 15, 2025
Add ExportContinuousProfile RPC to nsolid_service.proto.
Add start_ts and end_ts fields to Asset message.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 15, 2025
Add AssetStreamRpcType enum to differentiate between RPC types.
Modify AssetStream constructor to accept RPC type parameter.
Update Write method to support both ExportAsset and
ExportContinuousProfile.
Add safety assertion for WritesDone method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 15, 2025
Add continuous profiler callback and queue.
Register hook with ContinuousProfiler during agent startup.
Implement got_continuous_profile method for handling profile data.
Add conflict resolution between manual and continuous profiling.
Update profile timestamp handling for better accuracy.
Add proper cleanup in shutdown sequence.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request May 15, 2025
Add test-grpc-continuous-profile.mjs for testing continuous profiling.
Update gRPC agent server test fixture.
Test profile data transmission and format.
Verify continuous profiling configuration options.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
Add NSOLID_CONT_CPU_PROFILE environment variable support.
Add NSOLID_CONT_CPU_PROFILE_INTERVAL environment variable support.
Implement contCpuProfile and contCpuProfileInterval config options.
Set default interval to 30000ms (30 seconds).
Update initializeConfig and updateConfig functions.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
Create and initialize ContinuousProfiler in EnvList.
Add configuration handling for continuous profiling options.
Update EnvList to enable/disable profiling based on config.
Add proper cleanup in shutdown sequence.
Expose ContinuousProfiler through GetContinuousProfiler() method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
Add test-nsolid-config-continuous-profiling.js for API configuration.
Add test-nsolid-config-continuous-profiling-env.js for environment
variables.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
Add ExportContinuousProfile RPC to nsolid_service.proto.
Add start_ts and end_ts fields to Asset message.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
Add AssetStreamRpcType enum to differentiate between RPC types.
Modify AssetStream constructor to accept RPC type parameter.
Update Write method to support both ExportAsset and
ExportContinuousProfile.
Add safety assertion for WritesDone method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
Add continuous profiler callback and queue.
Register hook with ContinuousProfiler during agent startup.
Implement got_continuous_profile method for handling profile data.
Add conflict resolution between manual and continuous profiling.
Update profile timestamp handling for better accuracy.
Add proper cleanup in shutdown sequence.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
Add test-grpc-continuous-profile.mjs for testing continuous profiling.
Update gRPC agent server test fixture.
Test profile data transmission and format.
Verify continuous profiling configuration options.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
Add NSOLID_CONT_CPU_PROFILE environment variable support.
Add NSOLID_CONT_CPU_PROFILE_INTERVAL environment variable support.
Implement contCpuProfile and contCpuProfileInterval config options.
Set default interval to 30000ms (30 seconds).
Update initializeConfig and updateConfig functions.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
Create and initialize ContinuousProfiler in EnvList.
Add configuration handling for continuous profiling options.
Update EnvList to enable/disable profiling based on config.
Add proper cleanup in shutdown sequence.
Expose ContinuousProfiler through GetContinuousProfiler() method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
Add test-nsolid-config-continuous-profiling.js for API configuration.
Add test-nsolid-config-continuous-profiling-env.js for environment
variables.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
Add ExportContinuousProfile RPC to nsolid_service.proto.
Add start_ts and end_ts fields to Asset message.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
Add AssetStreamRpcType enum to differentiate between RPC types.
Modify AssetStream constructor to accept RPC type parameter.
Update Write method to support both ExportAsset and
ExportContinuousProfile.
Add safety assertion for WritesDone method.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
Add continuous profiler callback and queue.
Register hook with ContinuousProfiler during agent startup.
Implement got_continuous_profile method for handling profile data.
Add conflict resolution between manual and continuous profiling.
Update profile timestamp handling for better accuracy.
Add proper cleanup in shutdown sequence.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
Add test-grpc-continuous-profile.mjs for testing continuous profiling.
Update gRPC agent server test fixture.
Test profile data transmission and format.
Verify continuous profiling configuration options.

PR-URL: #282
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>