Confirmation Modes & Error Handling
When processing background tasks, exceptions and failures are inevitable. Repid gives you granular control over how errors are handled and how messages are acknowledged back to the message broker.
Confirmation Mode and Errors
Repid allows you to control exactly when a message is acknowledged (ack) back
to the broker using the confirmation_mode setting on the decorator.
The default mode is "auto".
"auto": Acknowledges the message if the actor completes successfully. If an exception is raised, it handles it based on theon_errorpolicy."always_ack": Always acknowledges the message, even if the actor raises an exception."ack_first": Acknowledges the message before the actor even runs (Fire-and-forget)."manual": Never automatically acknowledges. You must inject theMessagedependency and manuallyawait message.ack()."manual_explicit": Same as"manual", but enforces explicitly returning an acknowledgment status.
When using "auto", the on_error policy defaults to "nack". You can change
this to "reject" to simply return the message to the queue, or "ack" to
acknowledge it even on failure:
@router.actor(
channel="tasks",
confirmation_mode="auto",
on_error="reject" # Can be "nack", "reject", or "ack"
)
async def my_actor():
pass
Alternatively, you can provide a callable that inspects the exception dynamically to decide the outcome:
def handle_error(exc: Exception) -> Literal["reject", "nack", "ack"]:
# Reject ValueError so they are re-queued, Nack everything else
if isinstance(exc, ValueError):
return "reject"
return "nack"
@router.actor(channel="tasks", on_error=handle_error)
async def my_smart_actor():
pass
Infinite retries
Ensure your message broker is configured with delivery limits (max retries) to prevent infinite reprocessing of failed messages.
manual_explicit confirmation mode
The "manual_explicit" mode works exactly like "manual", but ensures that you
explicitly return one of the literals "ack", "nack", "reject", or "no_action".
The runner will act on the returned value immediately after execution. This
helps to prevent accidentally leaving a message without a confirmation.
from typing import Literal
@router.actor(channel="tasks", confirmation_mode="manual_explicit")
async def my_explicit_actor() -> Literal["ack", "nack", "reject", "no_action"]:
try:
# Do some work
return "ack"
except Exception:
# You must explicitly return a confirmation action
return "nack"
always_ack confirmation mode
The "always_ack" mode is essentially the same as setting
confirmation_mode="auto" combined with on_error="ack".
Note that when using "always_ack" or "ack_first" modes, you cannot provide
an on_error parameter to the decorator, as the behavior is already strictly
defined by the mode itself.
Manual modes and validation errors
If you are using confirmation_mode="manual" or
confirmation_mode="manual_explicit", you must handle your message's acknowledgment.
However, if a message fails to parse (e.g. pydantic.ValidationError when
reading arguments), your actor code will never be executed.
In manual and manual_explicit modes, the easiest way to avoid leaving the
message unacknowledged is to use on_error (alternatively, you can catch the
exception and act on the message in a middleware).
By default, in manual and manual_explicit modes on_error is set to
"no_action". You can override this to acknowledge or reject the message
directly:
@router.actor(
confirmation_mode="manual_explicit",
on_error="nack"
)
async def my_actor(payload: MyPydanticModel) -> Literal["ack", "nack", "reject", "no_action"]:
# the actor body won't run if MyPydanticModel fails validation
return "ack"
Retries
Repid does not provide built-in retry policies or delay mechanics out of the box. Because retry semantics (e.g., dead-lettering, exponential backoff, delaying messages) are highly dependent on the underlying message broker and the specific needs of the user, Repid delegates this responsibility.
If an actor raises an exception, the worker catches it and instructs the broker
based on the actor's on_error configuration.
If you need to implement retries, you have two main options:
- Broker-Level Retries: Configure your message broker with a Dead Letter Exchange (DLX) to automatically re-route failed messages. You can either leave messages in the DLX for manual inspection and processing, or configure message TTLs to automatically re-queue them after a delay for retry attempts.
- Application-Level Retries: Wrap your actor function with error handling
logic (e.g., using a library like
tenacity, or writing a custom decorator) that catches exceptions, sleeps, and re-attempts the logic internally before ever yielding an error back to Repid. Note that from the broker's perspective, processing of a single message takes much longer, which can lead to worse concurrency.
Poison Message Handling
A "poison message" is a message that consistently fails to process, potentially causing an infinite loop of errors and retries that clogs your queue.
How Repid handles poison messages depends on where the failure occurs: during routing, or during execution.
Unrouted Messages
If a message is pulled from a channel but Repid cannot find a matching actor to route it
to, Repid will initially reject (requeue) the message. However, to prevent infinite
loops, Repid maintains an internal counter of how many times it has seen the same unrouted
message.
If an unrouted message is encountered 10 times (by default), Repid classifies it as a
poison message. It logs an actor.route.poison_message error and will nack the message
instead of requeuing it, allowing the broker to drop it or move it to a Dead Letter Queue
(DLQ).
Actor Execution Failures
Because Repid delegates execution retry logic to the underlying message broker, handling poison messages that crash inside your actor code largely depends on your infrastructure:
- Broker-Level Delivery Limits (Recommended): Most mature brokers support configuring maximum delivery attempts. Once a message exceeds this limit, the broker automatically routes it to a Dead Letter Queue (DLQ) or drops it. This prevents the message from being infinitely re-queued and keeps your application logic clean.
- Application-Level Handling: You can also handle poison messages explicitly within
your actor. By catching exceptions and explicitly acknowledging (
ack) the failing message—perhaps after logging it or saving the payload to a separate database table—you remove it from the queue and stop the retry loop.