@anencore94 anencore94 commented Sep 27, 2025

What this PR does / why we need it:

This PR implements comprehensive structured and configurable logging support for the Kubeflow SDK as requested in Issue #85. The implementation provides:

  • NullHandler Pattern: Prevents logging noise by default while allowing users to override with their own configuration
  • Configurable Logging: Support for console, detailed, and JSON output formats
  • Structured Logging: JSON formatter for log aggregation systems (ELK stack, Fluentd, etc.)
  • Environment-based Configuration: Configure logging via environment variables
  • SDK Integration: Added debug logging to TrainerClient for better observability

The logging system ensures that the SDK is quiet by default (using NullHandler) but provides rich logging capabilities when users explicitly configure logging.
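The quiet-by-default behaviour described above can be sketched with the standard library alone. This is an illustration of the general NullHandler pattern, not the PR's actual code: the logger name `kubeflow_sketch`, the `do_sdk_work` function, and the `ListHandler` class are placeholders invented for the example.

```python
import logging

# Placeholder logger name for illustration; the SDK's real logger name may differ.
LIBRARY_LOGGER_NAME = "kubeflow_sketch"

# Library side: attach a NullHandler once, at package import time.
# Records are then discarded unless the application configures logging itself.
library_logger = logging.getLogger(LIBRARY_LOGGER_NAME)
library_logger.addHandler(logging.NullHandler())


def do_sdk_work() -> None:
    """Stand-in for an SDK call that emits log records."""
    library_logger.debug("selecting backend")
    library_logger.warning("falling back to default backend")


class ListHandler(logging.Handler):
    """Application-side handler that collects formatted messages in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(self.format(record))


# Before the application opts in, this call produces no output.
do_sdk_work()

# Application side: opt in by attaching a handler and choosing a level.
app_handler = ListHandler()
library_logger.addHandler(app_handler)
library_logger.setLevel(logging.DEBUG)
do_sdk_work()
```

Because the library only ever adds a NullHandler, the application stays in full control of where records go and at which level they are emitted.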

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):

Fixes #85

Features Implemented

Core Logging Infrastructure

  • NullHandler Pattern: Prevents logging noise by default; users can override it with their own configuration
  • Configurable Logging: Support for console, detailed, and JSON output formats
  • Structured Logging: JSON formatter for log aggregation systems (ELK stack, Fluentd, etc.)
  • Environment-based Configuration: Configure logging via environment variables

Key Components

  • kubeflow/trainer/logging/config.py: Centralized logging configuration
  • kubeflow/trainer/logging/formatters.py: Custom formatters including JSON structured logging
  • kubeflow/trainer/logging/logging_test.py: Comprehensive test suite (16 tests)
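To illustrate what a JSON formatter of the kind listed above might look like, here is a minimal sketch built on `logging.Formatter`. The class name `SketchJsonFormatter` and the chosen field names (`timestamp`, `level`, `logger`, `message`) are assumptions for illustration; the formatter in `formatters.py` may use different names and fields.

```python
import json
import logging
from datetime import datetime, timezone


class SketchJsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include the traceback text only when the record carries exception info.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


# Build a record by hand so the example needs no handler setup.
record = logging.LogRecord(
    name="kubeflow_sketch",  # placeholder logger name
    level=logging.INFO,
    pathname="sketch.py",
    lineno=1,
    msg="created job %s",
    args=("job-123",),
    exc_info=None,
)
line = SketchJsonFormatter().format(record)
```

Emitting one JSON object per line keeps the output easy to ingest for aggregators that parse newline-delimited JSON.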

Integration Points

  • SDK Integration: Added debug logging to TrainerClient for better observability
  • Package Integration: Exposed logging utilities through kubeflow.trainer.logging
  • NullHandler Setup: Configured at package level to prevent logging noise

Issue #85 Requirements Fulfilled

Consistent use of Python's logging library instead of print statements

  • All SDK operations now use proper Python logging
  • NullHandler pattern prevents unwanted output by default

Support for different logging levels (DEBUG, INFO, WARNING, ERROR)

  • Full support for all standard Python logging levels
  • Configurable via setup_logging() function or environment variables

Ability for users to configure log formatting and destinations

  • Console, detailed, and JSON output formats
  • File output support
  • Custom formatter support
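As a sketch of how an application could route SDK logs to more than one destination without any SDK-specific helper, the standard `logging.config.dictConfig` call is enough. The logger name `kubeflow_sketch.destinations` and the format string are placeholders chosen for this example, not values taken from the PR.

```python
import logging
import logging.config
import os
import tempfile

# Write the log file into a fresh temporary directory for the example.
log_path = os.path.join(tempfile.mkdtemp(), "sdk.log")

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "detailed": {
            "format": "%(asctime)s %(name)s %(levelname)s "
                      "%(filename)s:%(lineno)d %(message)s",
        },
    },
    "handlers": {
        # Console shows INFO and above; the file keeps everything from DEBUG up.
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "detailed",
            "level": "INFO",
        },
        "file": {
            "class": "logging.FileHandler",
            "filename": log_path,
            "formatter": "detailed",
            "level": "DEBUG",
        },
    },
    "loggers": {
        # Placeholder logger name; substitute the SDK's real logger name.
        "kubeflow_sketch.destinations": {
            "handlers": ["console", "file"],
            "level": "DEBUG",
            "propagate": False,
        },
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
logger = logging.getLogger("kubeflow_sketch.destinations")
logger.debug("written to the file only")
logger.info("written to both console and file")
```

Setting per-handler levels lets the same logger feed a terse console and a verbose file at once.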

Clear and actionable log messages for key SDK operations

  • Debug messages in TrainerClient initialization
  • Backend selection logging
  • Job creation and ID logging

Testing

  • 16 comprehensive unit tests covering all logging functionality
  • NullHandler pattern verification with proper isolation
  • SDK integration testing with real TrainerClient usage
  • Application integration examples with file and console logging
  • All tests pass with proper linting compliance
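The kind of check listed above can be sketched with the standard library's `unittest.TestCase.assertLogs`. This is not the PR's test suite: the logger name, the `create_job` function, and the job ID format are hypothetical stand-ins, and the real tests may use a different test runner.

```python
import logging
import unittest

LOGGER_NAME = "kubeflow_sketch.tests"  # placeholder logger name

# Library-style setup: quiet by default.
logging.getLogger(LOGGER_NAME).addHandler(logging.NullHandler())


def create_job(name: str) -> str:
    """Hypothetical SDK operation that logs the ID of the job it creates."""
    job_id = f"{name}-0001"
    logging.getLogger(LOGGER_NAME).debug("created job %s", job_id)
    return job_id


class LoggingSketchTest(unittest.TestCase):
    def test_null_handler_is_attached(self):
        logger = logging.getLogger(LOGGER_NAME)
        self.assertTrue(
            any(isinstance(h, logging.NullHandler) for h in logger.handlers)
        )

    def test_debug_message_is_emitted(self):
        # assertLogs swaps in its own handler for the duration of the block,
        # so the test does not leak configuration into other tests.
        with self.assertLogs(LOGGER_NAME, level="DEBUG") as captured:
            create_job("train")
        self.assertEqual(
            captured.output,
            [f"DEBUG:{LOGGER_NAME}:created job train-0001"],
        )


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoggingSketchTest)
result = unittest.TextTestRunner().run(suite)
```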

Usage Examples

Basic Usage

from kubeflow.trainer import TrainerClient, setup_logging

# Setup logging (optional - NullHandler prevents noise by default)
setup_logging(level="DEBUG", format_type="console")

# Use SDK - debug messages will appear if logging is configured
client = TrainerClient()

JSON Logging for Production

setup_logging(level="INFO", format_type="json")
# Logs will be in JSON format suitable for log aggregation

Environment Configuration

export KUBEFLOW_LOG_LEVEL=DEBUG
export KUBEFLOW_LOG_FORMAT=json
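One way such variables could be resolved into logging settings is sketched below. Only the variable names `KUBEFLOW_LOG_LEVEL` and `KUBEFLOW_LOG_FORMAT` come from this PR; the helper `read_logging_env`, its defaults (`WARNING`, `console`), and its validation behaviour are assumptions made for illustration.

```python
import logging
import os
from typing import Mapping, Tuple

# Format names taken from the PR description; level names are the standard ones.
VALID_FORMATS = ("console", "detailed", "json")
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def read_logging_env(environ: Mapping[str, str] = os.environ) -> Tuple[int, str]:
    """Hypothetical helper: resolve (level, format_type) from the environment."""
    # Defaults are assumptions for this sketch, not documented SDK behaviour.
    level_name = environ.get("KUBEFLOW_LOG_LEVEL", "WARNING").upper()
    format_type = environ.get("KUBEFLOW_LOG_FORMAT", "console").lower()

    if level_name not in LEVELS:
        raise ValueError(f"unknown log level: {level_name!r}")
    if format_type not in VALID_FORMATS:
        raise ValueError(f"unknown log format: {format_type!r}")
    return LEVELS[level_name], format_type


# Passing a plain dict keeps the example independent of the real environment.
level, format_type = read_logging_env(
    {"KUBEFLOW_LOG_LEVEL": "DEBUG", "KUBEFLOW_LOG_FORMAT": "json"}
)
```

Rejecting unknown values early makes a misspelled variable visible at startup instead of silently falling back to a default.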

Breaking Changes

None - this is a pure addition with backward compatibility.

Migration Guide

No migration required. Existing code will continue to work without changes.
Users can optionally configure logging for better observability.

Checklist:

  • Docs included if any changes are user facing

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign electronic-waste for approval. For more information see the Kubernetes Code Review Process.

@anencore94 anencore94 force-pushed the feat/logging-system-v2 branch from 149c8bc to 56896c3 Compare September 27, 2025 15:23
@anencore94 anencore94 changed the title from "feat(logging): Implement structured and configurable logging support" to "feat: add structured and configurable logging support" Sep 27, 2025
@anencore94 anencore94 marked this pull request as draft September 27, 2025 15:24
@anencore94 anencore94 changed the title from "feat: add structured and configurable logging support" to "feat(trainer): add structured and configurable logging support" Sep 27, 2025
@anencore94 anencore94 marked this pull request as ready for review September 27, 2025 15:55
@anencore94 anencore94 force-pushed the feat/logging-system-v2 branch 2 times, most recently from 575c624 to 6f99044 Compare September 27, 2025 16:35
This PR implements comprehensive logging support for the Kubeflow SDK as requested in Issue kubeflow#85.

## Features Implemented

### Core Logging Infrastructure
- **NullHandler Pattern**: Prevents logging noise by default; users can override it with their own configuration
- **Configurable Logging**: Support for console, detailed, and JSON output formats
- **Structured Logging**: JSON formatter for log aggregation systems (ELK stack, Fluentd, etc.)
- **Environment-based Configuration**: Configure logging via environment variables

### Key Components
- `kubeflow/trainer/logging/config.py`: Centralized logging configuration
- `kubeflow/trainer/logging/formatters.py`: Custom formatters including JSON structured logging
- `kubeflow/trainer/logging/logging_test.py`: Comprehensive test suite (14 tests)

### Integration Points
- **SDK Integration**: Added debug logging to TrainerClient for better observability
- **Package Integration**: Exposed logging utilities through kubeflow.trainer.logging
- **NullHandler Setup**: Configured at package level to prevent logging noise

## Issue kubeflow#85 Requirements Fulfilled

✅ **Consistent use of Python's logging library instead of print statements**
- All SDK operations now use proper Python logging
- NullHandler pattern prevents unwanted output by default

✅ **Support for different logging levels (DEBUG, INFO, WARNING, ERROR)**
- Full support for all standard Python logging levels
- Configurable via setup_logging() function or environment variables

✅ **Ability for users to configure log formatting and destinations**
- Console, detailed, and JSON output formats
- File output support
- Custom formatter support

✅ **Clear and actionable log messages for key SDK operations**
- Debug messages in TrainerClient initialization
- Backend selection logging
- Job creation and ID logging

## Testing

- **14 comprehensive unit tests** covering all logging functionality
- **NullHandler pattern verification** with proper isolation
- **SDK integration testing** with real TrainerClient usage
- **Application integration examples** with file and console logging
- **All tests pass** with proper linting compliance

## Usage Examples

### Basic Usage
```python
from kubeflow.trainer import TrainerClient, setup_logging

# Setup logging (optional - NullHandler prevents noise by default)
setup_logging(level="DEBUG", format_type="console")

# Use SDK - debug messages will appear if logging is configured
client = TrainerClient()
```

### JSON Logging for Production
```python
setup_logging(level="INFO", format_type="json")
# Logs will be in JSON format suitable for log aggregation
```

### Environment Configuration
```bash
export KUBEFLOW_LOG_LEVEL=DEBUG
export KUBEFLOW_LOG_FORMAT=json
```

## Breaking Changes
None - this is a pure addition with backward compatibility.

## Migration Guide
No migration required. Existing code will continue to work without changes.
Users can optionally configure logging for better observability.

Resolves kubeflow#85

Signed-off-by: Jaeyeon Kim <[email protected]>
@anencore94 anencore94 force-pushed the feat/logging-system-v2 branch from 6f99044 to 9642b41 Compare September 27, 2025 16:41
@coveralls

Pull Request Test Coverage Report for Build 18062345345

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 71.739%

Totals
Change from base Build 17979146135: 0.0%
Covered Lines: 297
Relevant Lines: 414

💛 - Coveralls

Successfully merging this pull request may close these issues.

Add structured and configurable logging support to Kubeflow SDK
