Task Lifecycle and State Management

In this codebase, task lifecycle and state management are governed by a strict precedence system and a set of built-in states that track a task from submission to completion. The system is designed to handle both automatic transitions managed by the worker and manual updates triggered by task authors.

Task States and Precedence

Task states are represented by the state class in celery.states, which is a subclass of str. This class implements comparison operators (<, >, etc.) based on a predefined hierarchy. This hierarchy ensures that state transitions follow a logical progression and prevents late-arriving updates from overwriting final results.

Built-in States

The core states defined in celery.states include:

PENDING: The default state. The task is waiting for execution, or its ID is unknown to the result backend.
STARTED: The task has been picked up by a worker (requires track_started to be enabled).
SUCCESS: The task completed successfully.
FAILURE: The task failed with an exception.
RETRY: The task failed but is being retried.
REVOKED: The task was cancelled or terminated.

Precedence Rules

The PRECEDENCE list in celery/states.py defines the order of importance. A lower index in this list indicates higher precedence:

PRECEDENCE = [
    'SUCCESS',
    'FAILURE',
    None,  # Placeholder for custom states
    'REVOKED',
    'STARTED',
    'RECEIVED',
    'REJECTED',
    'RETRY',
    'PENDING',
]

Custom states (like 'PROGRESS') fall into the None category. This means:

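SUCCESS and FAILURE always have the highest precedence.
Custom states rank below SUCCESS and FAILURE but above every other built-in state, including REVOKED, STARTED, and PENDING.
Once SUCCESS has been stored for a task, the result backend ignores later state updates for it, which guards against race conditions (see State Persistence Rules).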

Example of state comparison:

from celery.states import state, SUCCESS, STARTED

assert state(SUCCESS) > state(STARTED)  # True
assert state('PROGRESS') > state(STARTED)  # True
assert state('PROGRESS') < state(SUCCESS)  # True
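
To make the ordering concrete, here is a minimal standalone sketch of how a precedence list like the one above can drive the comparison operators. It uses only the standard library and is a simplified illustration, not a copy of celery/states.py; the State class and the highest helper are names chosen for this example.

```python
# Standalone sketch: a str subclass whose ordering comes from a precedence list.
# Illustrative only; it reuses the list shown above but is not Celery's source.
PRECEDENCE = [
    'SUCCESS', 'FAILURE', None, 'REVOKED', 'STARTED',
    'RECEIVED', 'REJECTED', 'RETRY', 'PENDING',
]
LOOKUP = {name: index for index, name in enumerate(PRECEDENCE)}


def precedence(name):
    """Return the index for a state; unknown (custom) states share the None slot."""
    return LOOKUP.get(name, LOOKUP[None])


class State(str):
    """String that compares by precedence: a lower index ranks higher."""

    def __gt__(self, other):
        return precedence(self) < precedence(other)

    def __lt__(self, other):
        return precedence(self) > precedence(other)

    def __ge__(self, other):
        return precedence(self) <= precedence(other)

    def __le__(self, other):
        return precedence(self) >= precedence(other)


def highest(states):
    """Pick the state with the highest precedence from an iterable of names."""
    return min(states, key=precedence)
```

Because every unknown name maps to the None slot, two different custom states compare as equal in rank even though they are different strings.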

The Task Lifecycle

The lifecycle of a task typically follows this flow:

Submission: When apply_async() or delay() is called, the task message is sent to the broker. Until a worker reports a different state, the task is reported as PENDING.
Execution: When a worker starts the task, it may transition to STARTED if the Task.track_started attribute is set to True.
Completion: The task ends in a "ready" state, which is one of SUCCESS, FAILURE, or REVOKED.
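
The split between in-flight and finished states can be sketched with two sets. The snippet below is a standalone illustration that does not import Celery; celery.states provides comparable READY_STATES and UNREADY_STATES sets, while the is_ready helper is hypothetical.

```python
# Standalone sketch of the ready / unready split described above.
READY_STATES = frozenset({'SUCCESS', 'FAILURE', 'REVOKED'})
UNREADY_STATES = frozenset({'PENDING', 'RECEIVED', 'STARTED', 'REJECTED', 'RETRY'})


def is_ready(state_name):
    """Hypothetical helper: True once a task has reached a final state."""
    return state_name in READY_STATES


# Walk one possible lifecycle and collect the states that count as finished.
lifecycle = ['PENDING', 'STARTED', 'SUCCESS']
finished_at = [name for name in lifecycle if is_ready(name)]
```

A custom state such as 'PROGRESS' belongs to neither set, so this helper treats it as not ready.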

Enabling Execution Tracking

To reduce result backend traffic, Celery does not report the STARTED state by default. It can be enabled for all tasks with the task_track_started setting, or per task by setting track_started on the task class:

from celery import Task

class MyTask(Task):
    track_started = True

Manual State Management

Tasks can manually update their state using the update_state method. This is most commonly used for progress tracking in long-running tasks.

Updating Progress

The update_state method accepts a state string and a meta dictionary. The metadata is stored in the result backend and can be retrieved by the client.

# Example from t/unit/tasks/test_tasks.py
@app.task(bind=True)
def long_running_task(self):
    for i in range(100):
        # Perform work...
        self.update_state(
            state='PROGRESS',
            meta={'current': i, 'total': 100}
        )

When update_state is called without a task_id, it defaults to the current task's ID found in self.request.id.
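
On the client side, the stored state and meta dictionary can be turned into a progress report. The sketch below is a pure-Python helper that assumes the 'PROGRESS' state name and the 'current' and 'total' keys from the example above. describe_progress is a hypothetical function; in a real client its two arguments would typically come from the state and info attributes of an AsyncResult.

```python
def describe_progress(state, info):
    """Hypothetical helper: turn a task state and its meta into a short message."""
    if state == 'PROGRESS' and isinstance(info, dict):
        current = info.get('current', 0)
        total = info.get('total') or 1  # avoid dividing by zero
        percent = 100 * current // total
        return f'{percent}% ({current}/{total})'
    if state == 'SUCCESS':
        return 'done'
    # Any other state (PENDING, STARTED, FAILURE, ...) is reported as-is.
    return state.lower()
```

Checking the state name before reading the meta matters because the stored value has a different shape in other states: a return value for SUCCESS, an exception for FAILURE.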

State Persistence Rules

For key-value result backends, the BaseKeyValueStoreBackend class in celery/backends/base.py enforces the finality of the SUCCESS state. In the _store_result method, the backend checks the current state before applying an update:

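# Simplified excerpt of the logic in celery/backends/base.py
def _store_result(self, task_id, result, state,
                  traceback=None, request=None, **kwargs):
    # Look up whatever is currently stored for this task.
    current_meta = self._get_task_meta_for(task_id)

    # Once SUCCESS is stored, any later update is ignored.
    if current_meta['status'] == states.SUCCESS:
        return result

    # Otherwise, proceed with storing the new state (arguments elided).
    self._set_with_state(...)
    return result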

This mechanism is critical for handling network partitioning or worker failures, ensuring that a "lost" worker cannot overwrite a successful result with an older state (like STARTED or a custom PROGRESS state).
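
The effect of this rule can be reproduced with a small in-memory stand-in. The sketch below is not a Celery backend: InMemoryBackend and its method names are hypothetical, and it keeps only the check-before-write behaviour described above.

```python
class InMemoryBackend:
    """Hypothetical dict-backed store that keeps SUCCESS final."""

    def __init__(self):
        self._meta = {}

    def get_task_meta(self, task_id):
        # Unknown task ids are reported as PENDING with no result.
        return self._meta.get(task_id, {'status': 'PENDING', 'result': None})

    def store_result(self, task_id, result, state):
        if self.get_task_meta(task_id)['status'] == 'SUCCESS':
            return result  # a stored SUCCESS is never overwritten
        self._meta[task_id] = {'status': state, 'result': result}
        return result


backend = InMemoryBackend()
backend.store_result('abc', {'current': 50, 'total': 100}, 'PROGRESS')
backend.store_result('abc', 42, 'SUCCESS')
# A late update from a "lost" worker arrives after the success was stored.
backend.store_result('abc', None, 'STARTED')
final = backend.get_task_meta('abc')
```

Note that the check in this sketch is a plain read followed by a write, so it is not atomic; it illustrates the rule rather than a concurrency-safe implementation.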

Retries and State

When a task is retried via Task.retry(), its state transitions to RETRY. This state is considered "unready," meaning the task is still active in the system. The retry method internally creates a new signature and sends it to the broker, while the current execution instance raises a Retry exception to signal the worker to update the state accordingly.
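
The state sequence produced by retries can be illustrated with an in-process simplification. The sketch below does not use Celery or a broker: the Retry class is a local stand-in rather than Celery's own exception, and run_with_retries is a hypothetical loop that re-runs the function directly instead of sending a new message.

```python
class Retry(Exception):
    """Stand-in for a retry signal; not Celery's own exception class."""


def run_with_retries(func, max_retries=3):
    """Hypothetical worker loop: record RETRY and re-run until func succeeds."""
    history = ['PENDING']
    for attempt in range(max_retries + 1):
        try:
            result = func(attempt)
        except Retry:
            if attempt == max_retries:
                history.append('FAILURE')  # retries exhausted
                return None, history
            history.append('RETRY')  # still unready: another attempt follows
            continue
        history.append('SUCCESS')
        return result, history


def flaky(attempt):
    # Fails on the first two attempts, then succeeds.
    if attempt < 2:
        raise Retry()
    return 'ok'


result, history = run_with_retries(flaky)
```

The history shows why RETRY counts as unready: each RETRY entry is followed by another attempt, and only the final entry is a ready state.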
