Event Debugging with Dumper
To debug Celery events in real time, you can use the Dumper utility to write each event, formatted as one line of text, to standard output or a custom stream. This provides a "tcpdump-like" view of everything happening in your Celery cluster.
Debugging via the Command Line
The most common way to use the dumper is through the Celery CLI. This starts a process that captures all events and prints them to the console.
celery events --dump
This command invokes _run_evdump in celery/bin/events.py, which calls the high-level evdump function.
Programmatic Event Capture
You can start the event dumper programmatically within your own scripts using the evdump function from celery.events.dumper.
from celery import Celery
from celery.events.dumper import evdump
app = Celery('my_app')
# This will block and print events to sys.stdout
evdump(app=app)
The evdump function handles connection management, including automatic reconnection logic if the broker connection is lost.
Using the Dumper Class for Custom Streams
If you need to redirect event output to something other than stdout (e.g., a file or a memory buffer), use the Dumper class directly.
import io
from celery.events.dumper import Dumper
# Capture events into a string buffer
output_buffer = io.StringIO()
dumper = Dumper(out=output_buffer)
# Example event dictionary
event = {
    'hostname': 'worker1.example.com',
    'timestamp': 1704110400.0,
    'type': 'worker-online',
    'sw_ident': 'py-celery',
    'sw_ver': '5.3.0',
}
# Process the event
dumper.on_event(event)
print(output_buffer.getvalue())
# Output: worker1.example.com [2024-01-01 12:00:00+00:00] started: sw_ident=py-celery, sw_ver=5.3.0
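The out argument does not have to be a file or StringIO. Since the dumper is described as printing to the stream and then calling flush(), any object with write() and flush() methods should work. The sketch below shows a hypothetical TeeStream helper (not part of Celery) that duplicates output to several streams. It is exercised here with plain print() so it runs without Celery installed.

```python
import io


class TeeStream:
    """Hypothetical helper: forwards every write to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


# Stand-alone check: print() only needs write() on the target object.
memory = io.StringIO()
log = io.StringIO()
tee = TeeStream(memory, log)
print("worker1.example.com [...] started", file=tee)
tee.flush()

# With Celery installed, the same object could presumably be passed as
# Dumper(out=TeeStream(sys.stdout, open("events.log", "a"))).
```

Both underlying buffers end up with the same line, so the console view and a log file would stay in sync.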
How Event Formatting Works
The Dumper performs several transformations to make raw events human-readable:
- Type Humanization: Internal event types are mapped to friendly names via humanize_type. For example, worker-online becomes started and worker-offline becomes shutdown.
- Task Metadata Tracking: The dumper uses a global LRUCache named TASK_NAMES (defined in celery/events/dumper.py with a limit of 4095 entries) to remember task names and arguments.
  - When a task-received or task-sent event arrives, the dumper stores the task name, UUID, args, and kwargs in the cache.
  - Subsequent events for that task (like task-started or task-succeeded) retrieve this metadata from the cache so the output remains descriptive.
- Automatic Flushing: The Dumper.say method calls self.out.flush() after every message. This ensures that if you pipe the output to another utility (like grep), the data appears immediately.
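The transformations above can be approximated in a few lines of standard-library Python. This is a simplified illustration, not Celery's source: TaskNameCache and describe are hypothetical names, an OrderedDict stands in for LRUCache, the timestamp is omitted, and the fallback of replacing dashes with spaces for unmapped types is an assumption.

```python
from collections import OrderedDict

# Only the two mappings mentioned in the text; the fallback is an assumption.
HUMAN_TYPES = {"worker-online": "started", "worker-offline": "shutdown"}


def humanize_type(event_type):
    """Return a friendly name, else the type with dashes turned into spaces."""
    event_type = event_type.lower()
    return HUMAN_TYPES.get(event_type, event_type.replace("-", " "))


class TaskNameCache:
    """Tiny LRU stand-in for TASK_NAMES (not Celery's LRUCache)."""

    def __init__(self, limit):
        self.limit = limit
        self._data = OrderedDict()

    def remember(self, uuid, description):
        self._data[uuid] = description
        self._data.move_to_end(uuid)
        while len(self._data) > self.limit:
            self._data.popitem(last=False)  # drop the least recently used

    def lookup(self, uuid):
        if uuid in self._data:
            self._data.move_to_end(uuid)
        return self._data.get(uuid, "")


def describe(event, cache):
    """Format one event dict roughly the way the text describes."""
    event = dict(event)  # copy, so the caller's dict is left untouched
    event_type = event.pop("type")
    hostname = event.pop("hostname")
    task = ""
    if event_type.startswith("task-"):
        uuid = event.pop("uuid")
        if event_type in ("task-received", "task-sent"):
            task = "{}({}) args={} kwargs={}".format(
                event.pop("name"), uuid, event.pop("args"), event.pop("kwargs"))
            cache.remember(uuid, task)
        else:
            task = cache.lookup(uuid)
    fields = ", ".join(f"{key}={event[key]}" for key in sorted(event))
    parts = (hostname, humanize_type(event_type), task, fields)
    return " ".join(part for part in parts if part)


cache = TaskNameCache(limit=4095)
received = describe({"type": "task-received", "hostname": "w1", "uuid": "abc",
                     "name": "tasks.add", "args": "(2, 2)", "kwargs": "{}"}, cache)
started = describe({"type": "task-started", "hostname": "w1", "uuid": "abc"}, cache)
```

The task-started event carries no name or arguments of its own, yet started still contains tasks.add(abc) because the description was cached when the task-received event was processed.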
Troubleshooting and Limitations
Missing Task Names and Arguments
If task events appear without the task name or arguments, it is usually for one of two reasons:
- Cache Eviction: The TASK_NAMES cache has reached its 4095-entry limit, and the metadata for that task was evicted.
- Missed Events: The dumper was not running when the task-received or task-sent event occurred, so it never captured the metadata.
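Both failure modes come down to a cache lookup that finds nothing. The sketch below reproduces them with a deliberately tiny limit of 2 instead of 4095. The remember helper and the OrderedDict cache are illustrative stand-ins, not Celery's LRUCache.

```python
from collections import OrderedDict

LIMIT = 2  # deliberately tiny; the text cites 4095 for TASK_NAMES
names = OrderedDict()


def remember(uuid, name):
    """Store a task name and evict the oldest entry once over the limit."""
    names[uuid] = name
    names.move_to_end(uuid)
    if len(names) > LIMIT:
        names.popitem(last=False)


for uuid, name in [("id-1", "tasks.add"), ("id-2", "tasks.mul"), ("id-3", "tasks.div")]:
    remember(uuid, name)

evicted = names.get("id-1", "")      # cache eviction: pushed out by id-3
never_seen = names.get("id-9", "")   # missed event: never stored at all
still_cached = names.get("id-3", "")
```

From the dumper's point of view the two cases are indistinguishable: in both, the lookup falls back to an empty description.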
In-place Dictionary Modification
The Dumper.on_event method is destructive. It uses .pop() to extract fields like timestamp, type, and hostname from the event dictionary.
# WARNING: This will modify the 'event' dictionary
dumper.on_event(event)
# 'type' and 'timestamp' are now missing from 'event'
assert 'type' not in event
If you need to use the event dictionary after passing it to the dumper, pass a copy instead: dumper.on_event(event.copy()).
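If several handlers share the same events, the copy can be made once in a small wrapper rather than at every call site. The following is a minimal sketch: preserving is a hypothetical helper, and destructive_handler is a stand-in that pops fields the way on_event is described to, so the example runs without Celery.

```python
def preserving(handler):
    """Wrap a handler so it receives a shallow copy of each event."""
    def wrapper(event):
        return handler(dict(event))
    return wrapper


def destructive_handler(event):
    """Stand-in for a handler that pops fields from the event it is given."""
    return event.pop("type"), event.pop("timestamp"), event.pop("hostname")


event = {
    "type": "worker-online",
    "timestamp": 1704110400.0,
    "hostname": "worker1.example.com",
}
result = preserving(destructive_handler)(event)
```

A shallow copy is enough here because only top-level keys are popped. With Celery installed, preserving(dumper.on_event) could be used wherever the original handler would be registered.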
Connection Errors
When using evdump, connection errors are caught and reported to the output stream. The utility will attempt to reconnect indefinitely, using humanize_seconds to display the retry interval.
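The general pattern of retrying forever while reporting each wait can be sketched as follows. This is not the code used by evdump: connect_with_retry, its interval parameters, and the message text are assumptions chosen for illustration, and the demonstration replaces sleeping with a no-op so it finishes immediately.

```python
import time


def connect_with_retry(connect, report, interval_start=2, interval_step=2,
                       interval_max=30, sleep=time.sleep):
    """Call connect() until it succeeds, reporting each failure and wait."""
    interval = interval_start
    while True:
        try:
            return connect()
        except ConnectionError as exc:
            report(f"-> Cannot connect: {exc}. Trying again in {interval} seconds")
            sleep(interval)
            # Back off gradually, but never wait longer than interval_max.
            interval = min(interval + interval_step, interval_max)


# Demonstration with a fake connection that fails twice before succeeding.
attempts = {"count": 0}


def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("broker unreachable")
    return "connected"


messages = []
result = connect_with_retry(flaky_connect, messages.append, sleep=lambda _: None)
```

Each failure is reported together with the upcoming wait, which is the information the dumper surfaces in its own output stream.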