feat: add global task framework #36368

Merged
villebro merged 11 commits into apache:master from villebro:villebro/gatf on Feb 9, 2026

Conversation


@villebro villebro commented Dec 2, 2025

SUMMARY

This PR introduces the Global Task Framework (GTF), a unified system for managing long-running tasks in Superset (the scope changed from GATF to GTF because the framework now also supports sync tasks). GTF provides task execution, progress tracking, cancellation with graceful abort handling, and deduplication with scope-aware visibility.

(Video attachment: Screen.Recording.2026-01-22.at.7.04.18.PM.mov)

Key Features

  • Full Observability: Unified Task List UI for monitoring status, progress, payloads, errors, and warnings at a glance
  • Async + Sync Execution: Tasks can be scheduled on Celery or executed inline with full deduplication and tracking
  • Sync Join-and-Wait: Sync callers joining an existing task block until completion, enabling efficient resource sharing
  • Graceful Cancellation: Abort handlers enable safe task interruption at developer-defined checkpoints
  • Scope-based Visibility: Private, shared, and system task scopes with appropriate access control
  • Smart Concurrency: Locks for user-facing commands, lock-free atomic updates during execution, graceful race handling
  • Low Overhead: Throttled DB writes for task updates; optional Redis pub/sub for real-time abort and completion notifications

Infrastructure Improvements

This PR also introduces foundational infrastructure improvements that benefit GTF but are designed for broader reuse:

  • Signal Cache: New SIGNAL_CACHE_CONFIG configuration enables Redis-based pub/sub for real-time notifications and low-overhead distributed locking. This reduces metastore load and provides near-instant event delivery for abort signals and task completion notifications. In the future, GLOBAL_ASYNC_QUERIES_CACHE_BACKEND will be consolidated into this unified signal cache, providing a single Redis configuration for all signaling and coordination features.
  • Improved Distributed Locking: Optimized the KeyValue-based distributed lock (reduced from 4 to 3 metastore queries per lock cycle) and added Redis-based locking support. When Redis is configured, locks use atomic SET NX EX operations, moving lock overhead entirely to cache with only two round trips per lock cycle.
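As a sketch of the Redis locking pattern described above (not the PR's actual code): with a cache backend, acquiring a lock is one atomic SET NX EX and releasing it is one DELETE. The `FakeRedis` class below is a minimal in-memory stand-in for a real Redis client so the example is self-contained:

```python
import time
import uuid


class FakeRedis:
    """Minimal in-memory stand-in for a Redis client (illustration only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        current = self._data.get(key)
        if nx and current is not None and current[1] > now:
            return None  # key exists and has not expired: SET NX fails
        expires_at = now + ex if ex is not None else float("inf")
        self._data[key] = (value, expires_at)
        return True

    def delete(self, key):
        self._data.pop(key, None)


def acquire_lock(client, key, ttl=10):
    """Round trip 1: atomic SET NX EX; returns a token on success, None on contention."""
    token = str(uuid.uuid4())
    return token if client.set(key, token, nx=True, ex=ttl) else None


def release_lock(client, key):
    """Round trip 2: release the lock so other workers can acquire it."""
    client.delete(key)
```

A production implementation would typically release only if the stored token still matches (e.g. via a Lua script), so one worker cannot delete a lock that expired and was re-acquired by another.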

TECHNICAL DETAILS

Architecture Overview

                          create
                            │
                            ▼
                      ┌──────────┐
         ┌────────────│ PENDING  │
         │            └────┬─────┘
         │                 │
         │                 │ execute_task picks up
         │                 │
abort    │                 ▼
before   │              ┌──────────────┐
execution│              │ IN_PROGRESS  │
         │              │ is_abortable │
         │              │   = false    │
         │              └──────┬───────┘
         │                     │
         │      ┌──────────────┼────────────┐
         │      │              │            │
         │      ▼              │            │
         │  on_abort()         │            │
         │  registered         │            │
         │      │              │            │
         │      ▼              │            │
         │ ┌──────────────┐    │            │
         │ │ IN_PROGRESS  │    │            │
         │ │ is_abortable │    │            │
         │ │   = true     │    │            │
         │ └──────┬───────┘    │            │
         │        │            │            │
         │   abort called      │            │
         │   due to user       │            │
         │   cancellation      │            │
         │   or timeout        │            │
         │        │            │            │
         │        │            │            │
         │        ▼            │            │
         │  ┌──────────┐       │            │
         │  │ ABORTING │-------┤            │
         │  └┬─────────┘       │            │
         │   │                 │            │
         │   │ abort handlers  │ exception  │ task
         │   │ complete        │ raised     │ completes
         │   │                 │            │
         │   ▼                 ▼            ▼
         │  optional cleanup handlers complete
         │   │                 │            │
         ▼   ▼                 ▼            ▼
    ┌────────────┐      ┌─────────┐  ┌─────────┐
    │ ABORTED or │      │ FAILURE │  │ SUCCESS │
    │ TIMED_OUT  │      │         │  │         │
    └────────────┘      └─────────┘  └─────────┘
         │                   │            │
         └───────────────────┴────────────┘
                             │
                      Terminal States

Cancellation Logic

GTF uses a unified Cancel action that determines whether to abort a task or unsubscribe the user, based on task scope and subscriber count.

The Core Principle

The cancellation system is designed around a simple idea: a task should only be fully aborted when no one needs it anymore. For private tasks, this is straightforward—there's only one user. For shared tasks, the system distinguishes between "I don't need this anymore" (unsubscribe) and "stop this for everyone" (abort).

How Cancel Decides What to Do

When a user clicks Cancel, the system evaluates the situation in this order:

  1. Private Tasks: Abort. Since only the creator can see private tasks, cancelling means stopping the task.

  2. System Tasks: Abort. Only admins can see system tasks.

  3. Shared Tasks with Single Subscriber: Abort. The sole subscriber cancelling it stops the task.

  4. Shared Tasks with Multiple Subscribers: The user is unsubscribed, and the task continues for others.

  5. Admin with Force Abort: Stops the task for all subscribers. For admins on shared tasks with multiple subscribers, a "Force abort (stops task for all subscribers)" checkbox appears in the cancel modal. If the admin is not subscribed to the task, this checkbox is pre-checked and disabled (since aborting is the only sensible action—they can't unsubscribe from something they're not subscribed to). If the admin is subscribed, the checkbox is enabled and unchecked by default. When the admin is the sole subscriber, the checkbox is hidden since cancelling will automatically abort the task.
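The decision order above can be condensed into a small pure function. This is an illustrative sketch, not the PR's actual command code; the name `resolve_cancel_action` is hypothetical:

```python
from enum import Enum


class TaskScope(Enum):
    PRIVATE = "private"
    SHARED = "shared"
    SYSTEM = "system"


def resolve_cancel_action(scope, subscriber_count, is_admin=False, force_abort=False):
    """Return "abort" or "unsubscribe" for a Cancel request."""
    if scope in (TaskScope.PRIVATE, TaskScope.SYSTEM):
        return "abort"  # rules 1 and 2: only the creator (or admins) can see the task
    if is_admin and force_abort:
        return "abort"  # rule 5: admin force abort stops the task for everyone
    if subscriber_count <= 1:
        return "abort"  # rule 3: the sole subscriber cancelling stops the task
    return "unsubscribe"  # rule 4: the task continues for the remaining subscribers
```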

The Last Subscriber Rule

When the final subscriber unsubscribes from a shared task, the task is automatically aborted since no users remain to receive the results. The user clicks Cancel, and if they are the last subscriber, the task stops.

Abortability and In-Progress Tasks

Not all in-progress tasks can be aborted. A task is only abortable if it has registered an abort handler (via ctx.on_abort()):

  • Pending tasks: Always abortable—they simply won't start
  • In-progress with abort handler: Transition to ABORTING state, notify the running task
  • In-progress without abort handler: Reject the abort request with an error

This ensures developers explicitly opt in to abort support by providing abort-handling logic. If no abort handler is provided, the cancel action is disabled in the list view for in-progress tasks, with an explanatory tooltip:

(Screenshot: tooltip on the disabled Cancel action)
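The abortability rules can be expressed as a simple predicate (an illustrative sketch; the real framework tracks is_abortable in the task's properties):

```python
from enum import Enum


class TaskStatus(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    SUCCESS = "success"


def can_abort(status, has_abort_handler):
    """True if an abort request should be accepted for this task."""
    if status is TaskStatus.PENDING:
        return True  # never started, so it simply won't run
    if status is TaskStatus.IN_PROGRESS:
        return has_abort_handler  # requires a registered ctx.on_abort() handler
    return False  # terminal states cannot be aborted
```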

Timeout Handling

Tasks can be configured with a timeout that automatically triggers the abort flow when the duration is exceeded.

Setting Timeouts

# Decorator-level default timeout (5 minutes)
from superset_core.api.types import task, get_context

@task(timeout=300)
def process_data(dataset_id: int) -> None:
    ctx = get_context()

    @ctx.on_abort
    def handle_abort():
        # Called when timeout expires or user cancels
        pass

    # ... task logic

# Override at runtime (TaskOptions comes from the framework's API module)
scheduled_task = process_data.schedule(
    dataset_id=123,
    options=TaskOptions(timeout=600)  # 10-minute override
)

Timeout Behavior

The timeout timer starts when the task transitions to IN_PROGRESS. When timeout expires:

  1. With abort handler registered:

    • Task transitions to ABORTING state
    • Abort handlers are triggered (same flow as user cancellation)
    • If handlers complete successfully → TIMED_OUT status
    • If handlers throw an exception → FAILURE status
  2. Without abort handler:

    • Warning is logged
    • Task continues running (cannot be forcibly stopped)
    • UI shows ⚠️ warning indicator
(Screenshot: warning indicator in the task list)
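A timeout watchdog with this behavior can be sketched with a timer thread. This is not the framework's scheduler code; the task dict here is a simplified stand-in for the task record:

```python
import threading


def start_timeout_watchdog(timeout_s, task):
    """Trigger the abort flow (or log a warning) when timeout_s elapses."""

    def on_timeout():
        if task.get("is_abortable"):
            task["status"] = "ABORTING"  # abort handlers run, then TIMED_OUT
        else:
            # Cannot be forcibly stopped; surface a warning instead
            task["warning"] = "timeout exceeded but no abort handler registered"

    timer = threading.Timer(timeout_s, on_timeout)
    timer.daemon = True
    timer.start()
    return timer  # cancel() this timer if the task finishes before the deadline
```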

ABORTED vs TIMED_OUT

Both flows use the same ABORTING intermediate state, but the terminal state differs based on cause:

| Cause | Terminal Status |
| --- | --- |
| User cancellation | ABORTED |
| Timeout expiration | TIMED_OUT |
| Handler exception | FAILURE |

This distinction allows users to immediately understand why a task stopped without reading error messages.

Abort Detection: Polling vs Pub/Sub

GTF supports two abort detection mechanisms:

1. Database Polling (Default)

# From superset/tasks/manager.py - _poll_for_abort()

while not stop_event.is_set():
    task = TaskDAO.find_one_or_none(uuid=task_uuid)
    if task and task.status in [TaskStatus.ABORTING, TaskStatus.ABORTED]:
        callback()  # Trigger abort handlers
        break
    stop_event.wait(timeout=interval)  # Default: 10 seconds

Pros: No infrastructure dependencies
Cons: Abort latency of up to the polling interval (default 10 seconds), plus periodic database queries

2. Redis Pub/Sub (When Configured)

When SIGNAL_CACHE_CONFIG is configured, GTF uses Redis pub/sub for instant abort notification:

# Abort request publishes to per-task channel
# From superset/tasks/manager.py - publish_abort()

channel = f"gtf:abort:{task_uuid}"
redis.publish(channel, "abort")

# Task's abort listener subscribes and receives instantly
# From superset/tasks/manager.py - _listen_pubsub()

pubsub = redis.pubsub()
pubsub.subscribe(f"gtf:abort:{task_uuid}")

while not stop_event.is_set():
    message = pubsub.get_message(timeout=1.0)
    if message and message.get("type") == "message":
        callback()  # Instant abort notification
        break

| Aspect | Polling | Pub/Sub |
| --- | --- | --- |
| Latency | Up to 10s (configurable) | ~milliseconds |
| DB Load | Periodic queries | None for abort detection |
| Infrastructure | None | Requires Redis |

Concurrency & Performance

GTF uses two complementary strategies to handle concurrent operations, optimized for their different usage patterns and performance requirements.

Why Two Strategies?

GTF uses atomic SQL wherever possible—it adds minimal overhead and provides the same safety guarantees as locking when the operation can be expressed as a conditional update. Distributed locks are reserved for operations that require read-then-write semantics.

| Strategy | Used For | Why |
| --- | --- | --- |
| Atomic SQL | Status transitions, progress updates | Minimal overhead; conditional WHERE clauses (e.g., WHERE status = 'IN_PROGRESS') provide race-safe updates in a single query. |
| Distributed Locks | Submit, Cancel | Read-then-write semantics: the command must check current state before deciding what to do. With Redis, locks are 2 cache operations; without, 3 metastore queries. |

Operations Reference

| Operation | Concurrency | DB Queries (KV Lock) | DB Queries (Redis Lock) | If Race Detected |
| --- | --- | --- | --- | --- |
| Submit (new task) | Lock | 6: 1S + 3I + 2D | 3: 1S + 2I (+2 Redis) | N/A (lock held) |
| Submit (join existing) | Lock | 5: 1S + 2I + 2D | 2: 1S + 1I (+2 Redis) | N/A (lock held) |
| Cancel (abort) | Lock | 5: 1S + 1I + 1U + 2D | 2: 1S + 1U (+2 Redis) | N/A (lock held) |
| Cancel (unsubscribe) | Lock | 5: 1S + 1I + 3D | 2: 1S + 1D (+2 Redis) | N/A (lock held) |
| Start execution | Atomic | 1: 1U | 1: 1U | Task was aborted → skip |
| Progress update | Atomic | 1: 1U (throttled) | 1: 1U (throttled) | N/A (unconditional) |
| Mark success | Atomic | 1: 1U | 1: 1U | Abort won → finally handles |
| Mark aborted | Atomic | 1: 1U | 1: 1U | Already terminal → no-op |
| Abort polling | None | 1: 1S per interval (10s) | 0 (uses pub/sub) | N/A |

Lock overhead for KV backend: DELETE expired + INSERT to acquire, DELETE to release (3 metastore queries, reduced from 4 by eliminating a redundant SELECT). With SIGNAL_CACHE_CONFIG, locking uses Redis SET NX EX + DELETE (2 cache operations), moving lock operations to the cache and freeing the metastore entirely.

When atomic updates return 0 rows affected (race detected), the executor accepts the concurrent outcome gracefully—no retries or errors. Cleanup handlers always run via the finally block regardless of which transition "won."
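The conditional-WHERE pattern behind atomic transitions can be demonstrated with SQLite (a sketch, not Superset's DAO code): the UPDATE only matches if the task is still in the expected state, and a rowcount of 0 signals that a concurrent transition won:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (uuid TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO tasks VALUES ('t1', 'IN_PROGRESS')")


def mark_success(conn, task_uuid):
    """Race-safe, single-query transition to SUCCESS."""
    cur = conn.execute(
        "UPDATE tasks SET status = 'SUCCESS' "
        "WHERE uuid = ? AND status = 'IN_PROGRESS'",
        (task_uuid,),
    )
    return cur.rowcount == 1  # 0 rows affected: another transition (e.g. abort) won
```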

Progress Update Throttling

update_task() calls are throttled to a configurable minimum interval (default: 2 seconds) between database writes:

  • In-memory caches are updated immediately for every call
  • DB writes are batched to prevent overly eager tasks from overloading the metastore
  • A deferred flush timer ensures pending updates are persisted within the throttle window

Configure via TASK_PROGRESS_UPDATE_THROTTLE_INTERVAL (set to 0 to disable).
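The throttling behavior can be sketched as follows (illustrative only; the deferred flush timer mentioned above is omitted, and the injectable clock exists purely to make the sketch testable):

```python
import time


class ThrottledWriter:
    """Coalesce frequent update_task() calls into throttled DB writes."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.pending = {}  # in-memory cache, reflects every call immediately
        self.last_flush = float("-inf")
        self.db_writes = 0

    def update(self, **fields):
        self.pending.update(fields)
        now = self.clock()
        if now - self.last_flush >= self.interval:
            self.db_writes += 1  # stand-in for the actual DB write
            self.last_flush = now
```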

Implementation Details

Lock ordering: Submit and Cancel acquire the lock before opening a database transaction. This prevents holding a DB connection while waiting for the lock. DAOs perform pure data operations and assume the caller holds the lock.

Post-commit notifications: When aborting a task, publish_abort() is called after the transaction commits. This ensures the ABORTING state is visible in the database before any pub/sub listener queries it.

When to Leverage GTF

Recommended for:

  • Long-running tasks (>1 second) where users benefit from progress visibility
  • Tasks that should be deduplicated (e.g., report generation, cache warming)
  • Operations that users may want to cancel
  • Shared tasks where multiple users need visibility

May be unnecessary for:

  • High-frequency, sub-second tasks
  • Fire-and-forget background jobs
  • Tasks where the overhead exceeds the execution time

Deduplication Strategy

Deduplication uses a hashed dedup_key column with a unique index. The composite key is hashed using the configured HASH_ALGORITHM (default: SHA-256) to produce a fixed-length key that avoids performance and compatibility issues with long task_key values:

# From superset/tasks/utils.py - get_active_dedup_key()

# Build composite key based on scope
match scope:
    case TaskScope.PRIVATE:
        composite_key = f"{scope.value}|{task_type}|{task_key}|{user_id}"
    case TaskScope.SHARED:
        composite_key = f"{scope.value}|{task_type}|{task_key}"
    case TaskScope.SYSTEM:
        composite_key = f"{scope.value}|{task_type}|{task_key}"

# Hash to fixed length (64 chars for SHA-256, 32 chars for MD5)
return hash_from_str(composite_key)

Unified join semantics: When a task with matching dedup_key exists, the framework adds the user as a subscriber (if not already subscribed) and returns the existing task. This applies uniformly to all scopes—private tasks naturally have only one subscriber since their dedup_key includes the user_id.

Sync join-and-wait: When a sync caller (direct function call, not .schedule()) joins an existing active task, it blocks until the task completes rather than returning immediately. This uses the same notification mechanism as abort detection (Redis pub/sub if configured, otherwise database polling).
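The polling fallback for join-and-wait can be sketched like this (`fetch_status` stands in for the status lookup; the real framework uses pub/sub when available):

```python
import time

TERMINAL_STATES = {"SUCCESS", "FAILURE", "ABORTED", "TIMED_OUT"}


def wait_for_completion(fetch_status, interval=0.01, timeout=5.0):
    """Block a sync joiner until the task reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(interval)
    raise TimeoutError("task did not reach a terminal state in time")
```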

When a task completes, its dedup_key is changed to its UUID (36 chars, no hashing needed), freeing the slot:

# From superset/tasks/utils.py - get_finished_dedup_key()

def get_finished_dedup_key(task_uuid: str) -> str:
    return task_uuid  # Frees up the composite key for new tasks
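Assuming hash_from_str is a thin hashlib wrapper (an assumption; the actual helper lives in Superset's utilities), the active-key construction can be sketched end to end:

```python
import hashlib


def hash_from_str(value, algorithm="sha256"):
    """Assumed stand-in for Superset's hashing helper."""
    return hashlib.new(algorithm, value.encode("utf-8")).hexdigest()


def get_active_dedup_key(scope, task_type, task_key, user_id=None):
    """Private keys include the user; shared/system keys are global per task_key."""
    if scope == "private":
        composite = f"{scope}|{task_type}|{task_key}|{user_id}"
    else:
        composite = f"{scope}|{task_type}|{task_key}"
    return hash_from_str(composite)
```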

Ambient Context Pattern

Tasks access execution context via get_context() using Python's contextvars:

# From superset/tasks/ambient_context.py

_current_context: ContextVar[TaskContext | None] = ContextVar("task_context", default=None)

@contextmanager
def use_context(ctx: TaskContext):
    token = _current_context.set(ctx)
    try:
        yield
    finally:
        _current_context.reset(token)

# From superset/tasks/scheduler.py - execute_task()

ctx = TaskContext(task_uuid=task_uuid)
with use_context(ctx):
    executor_fn(*args, **kwargs)  # Task can call get_context()

This eliminates the need to pass context through function signatures—tasks call get_context() wherever needed.
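Putting the pieces together, a self-contained version of the pattern (with an illustrative get_context, which the excerpt above omits) behaves like this:

```python
from contextlib import contextmanager
from contextvars import ContextVar
from dataclasses import dataclass


@dataclass
class TaskContext:
    task_uuid: str


_current_context: ContextVar = ContextVar("task_context", default=None)


@contextmanager
def use_context(ctx):
    token = _current_context.set(ctx)
    try:
        yield
    finally:
        _current_context.reset(token)


def get_context():
    ctx = _current_context.get()
    if ctx is None:
        raise RuntimeError("get_context() called outside a task")
    return ctx


def deeply_nested_helper():
    # No context parameter needed anywhere in the call chain
    return get_context().task_uuid
```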


REST API Endpoints

The Task API provides read-only access to task data and cancellation capabilities. Tasks are created programmatically through the @task decorator and cleaned up automatically by scheduled prune jobs.

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/v1/task/ | List tasks (paginated, filterable) |
| GET | /api/v1/task/{uuid} | Get task details by UUID |
| GET | /api/v1/task/{uuid}/status | Lightweight status polling endpoint |
| POST | /api/v1/task/{uuid}/cancel | Cancel task (abort or unsubscribe) |
| GET | /api/v1/task/related/created_by | Get creators for filter dropdown |
| GET | /api/v1/task/related/subscribers | Get subscribers for filter dropdown |

Why No DELETE or CREATE Endpoints?

  • No CREATE: Tasks are created programmatically when code calls my_task.schedule() or my_task(). The framework handles task creation, deduplication, and subscriber management internally.

  • No DELETE: Tasks are cleaned up by a scheduled prune job rather than manual deletion. This prevents accidental deletion of in-progress tasks and maintains audit trails. Configure the prune job in your Celery beat schedule (see superset/config.py for an example).
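A beat schedule entry for the prune job might look like the sketch below. The task name is an assumption for illustration; consult superset/config.py in this PR for the canonical example:

```python
# superset_config.py (hypothetical entry)
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    "prune_tasks": {
        "task": "prune_tasks",  # assumed task name
        "schedule": crontab(minute=0, hour=3),  # nightly at 03:00
    },
}
```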


ADDITIONAL INFORMATION

Configuration

Enabling GTF

GTF is disabled by default and must be enabled via the GLOBAL_TASK_FRAMEWORK feature flag:

# In your superset_config.py
FEATURE_FLAGS = {
    "GLOBAL_TASK_FRAMEWORK": True,
}

When GTF is disabled:

  • The Task List UI menu item is hidden
  • The /api/v1/task/* endpoints return 404
  • Calling or scheduling a @task-decorated function raises GlobalTaskFrameworkDisabledError

This flag controls all GTF functionality. In the future, enabling this flag will also switch built-in features (thumbnails, alerts & reports, etc.) to use GTF-based tasks instead of legacy Celery tasks.

Configuration Options

| Setting | Default | Description |
| --- | --- | --- |
| GLOBAL_TASK_FRAMEWORK | False | Feature flag to enable GTF. Must be set to True to use GTF features. |
| SIGNAL_CACHE_CONFIG | None | Redis config for distributed locking and pub/sub notifications (abort, completion). Improves performance and reduces metastore load. In the future, GLOBAL_ASYNC_QUERIES_CACHE_BACKEND will be consolidated into this cache. |
| DISTRIBUTED_LOCK_DEFAULT_TTL | 30 | Default lock TTL in seconds (30 for backwards compatibility). Locks auto-expire to prevent deadlocks. GTF task operations use a shorter 10-second TTL. |
| TASKS_ABORT_CHANNEL_PREFIX | "gtf:abort:" | Redis channel prefix for abort messages |
| TASK_ABORT_POLLING_DEFAULT_INTERVAL | 10 | Abort polling interval in seconds (when Redis pub/sub is not configured) |
| TASK_PROGRESS_UPDATE_THROTTLE_INTERVAL | 2 | Minimum interval in seconds between update_task() DB writes. Set to 0 to disable throttling. |
| SHOW_STACKTRACE | True (dev) | Expose stack traces in API responses |

Database Schema

New tables:

  • tasks - Task metadata, status, and properties
  • task_subscribers - Many-to-many relationship for shared task subscriptions

The migration includes indexes for deduplication lookups, scope-based filtering in the list view, and efficient pruning of old completed tasks.

Relationship Loading Strategy

To avoid N+1 query issues when listing tasks with subscriber information, the models use SQLAlchemy eager loading:

  • Task.subscribers: Uses lazy="selectin" - when listing N tasks, fires 2 queries total (1 for tasks + 1 IN-clause query for all subscribers) instead of N+1
  • TaskSubscriber.user: Uses lazy="joined" - fetches user info (first_name, last_name) via JOIN in the same query as subscribers

This ensures the API list endpoint remains efficient regardless of task count or subscriber count.

Properties JSON Column

The tasks table uses a properties JSON column to store runtime state and execution configuration. This design provides schema flexibility for incrementally adding new features without requiring database migrations.

Current properties structure:

| Field | Type | Description |
| --- | --- | --- |
| execution_mode | Literal["async", "sync"] | "async" for scheduled (Celery) execution, "sync" for inline execution |
| timeout | int | Timeout in seconds |
| is_abortable | bool | Whether the task has registered an abort handler |
| progress_percent | float | Progress 0.0-1.0 |
| progress_current | int | Current iteration count |
| progress_total | int | Total iterations (if known) |
| error_message | str | Human-readable error message |
| exception_type | str | Exception class name (e.g., "ValueError") |
| stack_trace | str | Full formatted traceback |

Why JSON over columns:

  1. No migrations for new fields: Adding timeout support, retry policies, or other execution config only requires code changes
  2. Non-filterable data: These fields are displayed but not used in WHERE clauses, so indexing isn't needed
  3. API flexibility: Frontend receives a typed properties dict with consistent shape
  4. Forward compatibility: Unknown fields are preserved during serialization/deserialization

TypedDict-based implementation:

Properties use a TypedDict with total=False (all fields optional) for type safety without runtime overhead:

# superset-core/src/superset_core/api/tasks.py
class TaskProperties(TypedDict, total=False):
    # Execution config - set at task creation
    execution_mode: Literal["async", "sync"]
    timeout: int

    # Runtime state - set by framework during execution
    is_abortable: bool
    progress_percent: float
    progress_current: int
    progress_total: int

    # Error info - set when task fails
    error_message: str
    exception_type: str
    stack_trace: str

Access pattern (sparse dict):

# Reading - always use .get() since keys may be absent
task.properties_dict.get("is_abortable")  # Returns None if not set
task.properties_dict.get("progress_percent", 0.0)  # With default

# Writing - pass a dict with fields to update (merge semantics)
task.update_properties({"is_abortable": True})
task.update_properties(progress_update((50, 100)))  # Helper function

Helper functions (in superset/tasks/utils.py):

# Build progress update dict
progress_update((50, 100))  # Returns {"progress_current": 50, "progress_total": 100, "progress_percent": 0.5}
progress_update(0.5)        # Returns {"progress_percent": 0.5}
progress_update(42)         # Returns {"progress_current": 42}

# Build error update dict from exception
error_update(exception)     # Returns {"error_message": "...", "exception_type": "...", "stack_trace": "..."}
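Based on the documented return values, progress_update plausibly dispatches on the argument type; this sketch reproduces that behavior and is not the actual utility code:

```python
def progress_update(progress):
    """Build a properties update dict from a (current, total) tuple, float, or int."""
    if isinstance(progress, tuple):
        current, total = progress
        update = {"progress_current": current, "progress_total": total}
        if total:
            update["progress_percent"] = current / total
        return update
    if isinstance(progress, float):
        return {"progress_percent": progress}
    return {"progress_current": progress}
```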

Breaking Changes

None - this is a new feature.

TODO / Future Work

  • Bulk Cancellation: Add ability to cancel multiple tasks at once (deferred due to UI complexity)
  • Generic Pub/Sub Service: Extract the Redis pub/sub logic from GTF into a reusable service/manager that can be leveraged by any feature requiring real-time notifications (e.g., cache invalidation, dashboard refresh signals, collaborative editing events)
  • Migrate existing async functionality to GTF:
    • Thumbnails generation
    • Alerts & Reports execution
    • SQL Lab query execution
    • Global Async Queries
    • Other async operations as identified, including table pruning tasks

During migration, abort handlers will be added to tasks that currently don't support termination, and deduplication support will be added where relevant to prevent duplicate task execution. New properties will also likely be added to the task decorator signature and the TaskOptions data class as needed (retries, max queue times, etc.).


TESTING INSTRUCTIONS

Unit Tests

pytest tests/unit_tests/daos/test_tasks.py -v
pytest tests/unit_tests/tasks/ -v

Integration Tests

pytest tests/integration_tests/tasks/ -v

Manual Testing

  1. Create and execute a task:

    import time

    from superset_core.api.types import task, get_context

    @task
    def test_task():
        ctx = get_context()
        for i in range(10):
            ctx.update_task(progress=(i + 1, 10))
            time.sleep(1)

    # Async: schedule on a Celery worker
    scheduled = test_task.schedule()

    # Sync: run inline (same deduplication and tracking)
    result = test_task()
  2. Test abort via UI: Navigate to Task List, click Cancel on an in-progress task

  3. Test Redis pub/sub: Configure SIGNAL_CACHE_CONFIG and verify instant abort response

ADDITIONAL INFORMATION

  • Has associated issue: [SIP-143] Global Async Task Framework #29839
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@github-actions github-actions bot added risk:db-migration PRs that require a DB migration api Related to the REST API labels Dec 2, 2025
@villebro villebro moved this from To Do to In Progress in Superset Extensions Dec 2, 2025
@villebro villebro self-assigned this Dec 2, 2025
@github-actions github-actions bot removed the api Related to the REST API label Dec 19, 2025
@codecov

codecov bot commented Dec 19, 2025

Codecov Report

❌ Patch coverage is 70.58824% with 530 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.67%. Comparing base (76d897e) to head (9d8dc67).
⚠️ Report is 3687 commits behind head on master.

Files with missing lines Patch % Lines
superset/tasks/manager.py 44.26% 121 Missing and 15 partials ⚠️
superset/tasks/decorators.py 18.18% 117 Missing ⚠️
superset/tasks/context.py 69.70% 53 Missing and 20 partials ⚠️
superset/tasks/api.py 77.34% 28 Missing and 1 partial ⚠️
superset/tasks/scheduler.py 62.66% 23 Missing and 5 partials ⚠️
superset/daos/tasks.py 83.07% 13 Missing and 9 partials ⚠️
superset/models/tasks.py 85.98% 12 Missing and 3 partials ⚠️
superset/tasks/utils.py 70.00% 9 Missing and 6 partials ⚠️
superset/commands/tasks/cancel.py 87.09% 6 Missing and 6 partials ⚠️
superset/commands/distributed_lock/acquire.py 71.05% 10 Missing and 1 partial ⚠️
... and 16 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #36368      +/-   ##
==========================================
+ Coverage   60.48%   66.67%   +6.18%     
==========================================
  Files        1931      671    -1260     
  Lines       76236    51593   -24643     
  Branches     8568     5770    -2798     
==========================================
- Hits        46114    34399   -11715     
+ Misses      28017    15810   -12207     
+ Partials     2105     1384     -721     
Flag Coverage Δ
hive 41.50% <34.51%> (-7.66%) ⬇️
javascript ?
mysql 64.59% <70.53%> (?)
postgres 64.66% <70.53%> (?)
presto 41.52% <34.51%> (-12.29%) ⬇️
python 66.46% <70.58%> (+2.95%) ⬆️
sqlite 64.26% <67.59%> (?)
superset-extensions-cli 96.49% <ø> (?)
unit 100.00% <100.00%> (+42.36%) ⬆️

Flags with carried forward coverage won't be shown.

@github-actions github-actions bot added api Related to the REST API doc Namespace | Anything related to documentation labels Jan 5, 2026
@netlify

netlify bot commented Jan 7, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit ca1aa46
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/698a03c5c617d5000828a772
😎 Deploy Preview https://deploy-preview-36368--superset-docs-preview.netlify.app

@villebro villebro force-pushed the villebro/gatf branch 2 times, most recently from 434a7d8 to 63a7755 Compare January 14, 2026 02:57
@villebro villebro force-pushed the villebro/gatf branch 2 times, most recently from d0d323d to 0948773 Compare January 23, 2026 01:15
@villebro villebro changed the title feat: add global async task framework feat: add global task framework Jan 23, 2026
@villebro villebro marked this pull request as ready for review January 23, 2026 04:24

@michael-s-molina michael-s-molina left a comment


Thanks for the latest changes @villebro. Second-pass review:

@villebro villebro force-pushed the villebro/gatf branch 2 times, most recently from 6fb1c4a to 1b0fdcd Compare February 7, 2026 16:50

villebro commented Feb 7, 2026

Thanks for the re-review @michael-s-molina ! Please check the latest commit with the following changes (these should address the issues you raised + some other improvements):

  • Task signatures, including API routes, have been simplified to be UUID-only. The Task model was also updated to use a native UUID type instead of a string-based one (slightly improving efficiency and performance)
  • A few lingering datetime.now(timezone.utc) fixes
  • Fixed all remaining instances of if TYPE_CHECKING: pass (there were unrelated ones, too, so I fixed them also)
  • Updated '/api/v1/task/{task_uuid}/status' endpoint to do a targeted select on the status column only to avoid pulling in the entire task object
  • Removed a few redundant endpoints that were pulled in by RouteMethod.REST_MODEL_VIEW_CRUD_SET (I noticed these when checking Swagger UI)
  • Added non-unique index to task_key
  • Made tasks load subscribers eagerly to avoid N+1 queries when listing tasks

@msyavuz msyavuz mentioned this pull request Feb 9, 2026
9 tasks
Review thread on a frontend copy-to-clipboard handler:

    setTimeout(() => setCopied(false), 2000);
    })
    .catch(() => {
      // Failed to copy, ignore
Member

Perhaps we could display an error toast instead of failing silently?

Member Author

Great idea - done

Member

@michael-s-molina michael-s-molina left a comment


Thank you for this great PR @villebro and for addressing hundreds of comments 😱

My only comment left is a non-blocking one about the log messages on tasks manager which I think should be debug instead of info. My main concern is with the volume of messages as you might have multiple entries for each scheduled task.


@mistercrunch
Member

mistercrunch commented Feb 9, 2026

I did some testing to see if supporting direct Celery scheduling with the new task decorator would be possible, but it added a lot of dangerous complexity to the already complex internals of the feature, so ultimately I decided against it.

Sounds reasonable. Main thing I'm slightly worried about is multiple execution paths for extended periods of time on the repo. Would love to have all tasks go through the same general codepaths/decorators/abstractions, even if some tasks don't provide the same guarantees (dedup on/off, database tracking on/off, ...). Doesn't have to be in this PR necessarily. Not sure how complex that would be either, would have to spend more time reviewing, but some sort of @global_task_framework(mode="legacy") decorator to wire all async/celery task could be nice, even if some modes/configs are no-op for now.

@villebro
Member Author

villebro commented Feb 9, 2026

Sounds reasonable. Main thing I'm slightly worried about is multiple execution paths for extended periods of time on the repo. Would love to have all tasks go through the same general codepaths/decorators/abstractions, even if some tasks don't provide the same guarantees (dedup on/off, database tracking on/off, ...). Doesn't have to be in this PR necessarily. Not sure how complex that would be either, would have to spend more time reviewing, but some sort of @global_task_framework(mode="legacy") decorator to wire all async/celery task could be nice, even if some modes/configs are no-op for now.

The majority of logic should be shared. I'll be updating the actual functional code to have optional abort handlers that will simply be undefined for the legacy paths. But I'll give it another go when this is merged and I start working in earnest on the first migration.

@villebro villebro merged commit 59dd2fa into apache:master Feb 9, 2026
73 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in Superset Extensions Feb 9, 2026
@villebro villebro deleted the villebro/gatf branch February 9, 2026 18:46

Labels

api Related to the REST API change:backend Requires changing the backend change:frontend Requires changing the frontend dependencies:npm doc Namespace | Anything related to documentation packages risk:db-migration PRs that require a DB migration size/XXL

7 participants