# ADR-0005: Use exceptions for task execution failures

**Date**: 2026-05-27
**Status**: accepted
**Deciders**: Perago maintainers

## Context

Perago task functions already use their return value as the business `Result Output`, which is written under Conductor `outputData.result` only when the task completes successfully. A user reported that returning a business payload such as `{"status": "FAIL"}` still allowed the next workflow node to run. That behavior matches the current contract, because any returned value counts as a successful result, but it exposed an API gap: task authors need a clear way to declare execution failures without overloading business result fields.

Conductor distinguishes retryable `FAILED` task results from non-retryable `FAILED_WITH_TERMINAL_ERROR` task results. Conductor also accepts `outputData` on failed task results, but Perago intentionally does not expose business output from failed tasks to downstream nodes.

## Decision

Perago uses task return values for successful business results and exceptions for task execution failures.

Task authors should raise `TaskFailed("...")` for execution failures where retrying the same input may succeed. Perago reports those attempts as Conductor `FAILED`.

Task authors should raise `TaskTerminalError("...")` for detectable execution failures where retrying the same input has no value. Perago reports those attempts as Conductor `FAILED_WITH_TERMINAL_ERROR`.
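The two exception types can be illustrated with a minimal sketch. The ADR does not show Perago's import path or task registration API, so the classes below are local stand-ins that only share the names `TaskFailed` and `TaskTerminalError`, and `fetch_report` and its parameters are hypothetical.

```python
class TaskFailed(Exception):
    """Stand-in for Perago's retryable failure (real import path not shown in this ADR)."""


class TaskTerminalError(Exception):
    """Stand-in for Perago's non-retryable failure (real import path not shown in this ADR)."""


def fetch_report(report_id: str, upstream_available: bool = True) -> dict:
    """Hypothetical task body: returns business data or raises an execution failure."""
    if not report_id.isdigit():
        # Retrying the same malformed input can never succeed.
        raise TaskTerminalError(f"report_id {report_id!r} is not numeric")
    if not upstream_available:
        # A later attempt with the same input may succeed.
        raise TaskFailed("report service unavailable")
    # A plain return means the task completed successfully.
    return {"report_id": int(report_id), "pages": 3}
```

In this sketch, a malformed ID is terminal because no retry can fix it, while an unavailable upstream is retryable because the same input may succeed later.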

Business-recoverable outcomes that should be handled by workflow logic, such as prompt policy rejection or missing user-provided information, remain successful task results. The task returns a structured `Result Output`, and WorkflowDef branching handles the business state.
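As a rough illustration of a business-recoverable outcome, the sketch below returns a structured result for a prompt policy rejection instead of raising. The `ModerationOutput` model, its field names, the `moderate_prompt` task, and the blocked-term list are all hypothetical. A standard-library dataclass is used to keep the example self-contained; a Pydantic output model would play the same role.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModerationOutput:
    """Hypothetical Result Output: the business state lives in ordinary fields."""
    decision: str  # "approved" or "rejected"
    rejection_reason: Optional[str] = None


BLOCKED_TERMS = ("password",)  # illustrative policy only


def moderate_prompt(prompt: str) -> ModerationOutput:
    """Hypothetical task: a policy rejection is a successful result, not an exception."""
    lowered = prompt.lower()
    for term in BLOCKED_TERMS:
        if term in lowered:
            # The task still completes; the workflow branches on `decision`.
            return ModerationOutput("rejected", f"prompt mentions {term!r}")
    return ModerationOutput("approved")
```

Under this pattern, a workflow branch would read `decision` from the task's successful output and route the rejected case to a recovery path.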

Failure reasons are plain strings; Perago does not put structured JSON into failed task output. Perago caps the text written to Conductor `reasonForIncompletion` at a configurable `PERAGO_FAILURE_REASON_MAX_LENGTH` limit and records truncation details in worker JSONL logs.
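The capping behavior might look roughly like the sketch below. The helper name `cap_failure_reason`, the log field names, and the choice to count characters rather than bytes are assumptions; only the `PERAGO_FAILURE_REASON_MAX_LENGTH` setting and the JSONL log destination come from this ADR, and how the setting is read is not shown here.

```python
import json
from typing import Optional, Tuple


def cap_failure_reason(reason: str, max_length: int) -> Tuple[str, Optional[str]]:
    """Return the capped reason and, if truncated, one JSONL log line.

    `max_length` stands in for the PERAGO_FAILURE_REASON_MAX_LENGTH value.
    """
    if len(reason) <= max_length:
        return reason, None
    # Hypothetical log schema: record what was cut without storing it in Conductor.
    log_line = json.dumps(
        {
            "event": "failure_reason_truncated",
            "original_length": len(reason),
            "max_length": max_length,
        }
    )
    return reason[:max_length], log_line
```

The capped string is what would reach `reasonForIncompletion`, while the log line would be appended to the worker's JSONL log.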

## Alternatives Considered

### Alternative 1: Return a failure object from the task function

- **Pros**: Lets task authors express success and failure without Python exceptions.
- **Cons**: Makes the return annotation represent both success data and runtime control flow, complicates Pydantic output models and TaskDef output schemas, and conflicts with the existing `Result Output` glossary.
- **Why not**: Perago keeps success data and runtime failure control separate. `return Output(...)` means the task completed.

### Alternative 2: Treat business status fields such as `status="FAIL"` as Conductor failures

- **Pros**: Matches some business payload conventions.
- **Cons**: Requires Perago to understand arbitrary business schemas, makes common fields like `status` reserved or ambiguous, and breaks typed task contracts.
- **Why not**: Business schemas belong to the task author and workflow. Perago should not infer Conductor lifecycle state from business result fields.

### Alternative 3: Put structured JSON in failed `outputData`

- **Pros**: Could carry machine-readable failure metadata.
- **Cons**: Failed tasks normally do not feed downstream business nodes, Conductor `reasonForIncompletion` is the primary operator-facing field, and Perago would need another schema contract for failed outputs.
- **Why not**: The MVP only needs a reason string for Conductor failure state. Structured business recovery data belongs in successful `Result Output` and workflow branches.

## Consequences

### Positive

- Task authors have explicit APIs for retryable and terminal execution failures.
- `Result Output` remains a successful business result instead of a union of success data and runtime control signals.
- WorkflowDef branching remains responsible for business-recoverable outcomes.
- Workspace publication stays fail-closed: failed task attempts do not stage or publish local workspace changes.

### Negative

- Task authors must decide, for each unsuccessful outcome, whether to return a business branch output or raise an execution failure exception.
- Terminal errors become a public API concept, and documentation must explain when automatic retry is inappropriate.
- The runtime must preserve ordinary unhandled exceptions as retryable `FAILED` results while mapping explicit terminal errors separately.
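The mapping the runtime has to preserve can be sketched as follows. This is not Perago's actual runtime code: `run_attempt` is a hypothetical helper, the exception classes are local stand-ins, and using `str(exc)` as the reason text is an assumption. The status names and the `outputData.result` location follow this ADR and Conductor's task result statuses.

```python
from typing import Any, Callable, Dict


class TaskFailed(Exception):
    """Stand-in for Perago's retryable failure exception."""


class TaskTerminalError(Exception):
    """Stand-in for Perago's non-retryable failure exception."""


def run_attempt(task_fn: Callable[[], Any]) -> Dict[str, Any]:
    """Run one attempt and build a Conductor-style task result dictionary."""
    try:
        result = task_fn()
    except TaskTerminalError as exc:
        # Explicit terminal errors are mapped separately and are not retried.
        return {
            "status": "FAILED_WITH_TERMINAL_ERROR",
            "reasonForIncompletion": str(exc),
        }
    except Exception as exc:
        # TaskFailed and ordinary unhandled exceptions both stay retryable.
        return {"status": "FAILED", "reasonForIncompletion": str(exc)}
    # Any returned value, whatever its fields say, is a successful result.
    return {"status": "COMPLETED", "outputData": {"result": result}}
```

Note that in this sketch a task returning `{"status": "FAIL"}` still yields `COMPLETED`, which is the behavior described in the Context section, and that failed results carry only a status and a reason string.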

### Risks

- **Risk**: Authors may overuse `TaskTerminalError` for cases that the product workflow should recover from.
  **Mitigation**: Documentation uses the failure quadrant and examples to separate business branches from execution failures.
- **Risk**: Authors may expect structured failed outputs.
  **Mitigation**: Perago limits failed results to `status` plus `reasonForIncompletion`; structured state should be returned only in successful business branch outputs.