
Building a Laravel AI Agent with Human-in-the-Loop Approval

A Laravel AI agent that can process refunds, delete records, or send external emails is making irreversible decisions on your behalf. The model is probabilistic. Its tool parameters derive from natural language interpretation. At real volume, a human-in-the-loop (HITL) approval layer is not a limitation on what a Laravel AI agent can do; it is the architectural pattern that makes automation safe enough to run against live business data. One misread parameter on a financial operation is a support incident. One misinterpreted delete is a data loss event.

This article is part of the Module 1 AI Architecture track, which covers the full spectrum of production agent design in Laravel. The governance patterns here align directly with the contracts and telemetry framework described in the production-grade AI architecture guide for Laravel. If you have already built tool-calling agents with Prism PHP in Laravel, this is the approval layer that makes those agents safe to ship.

Why autonomous agents are a production risk

The Prism PHP guide flags a specific production pitfall: an LLM that can trigger a live Stripe refund without a confirmation step is a liability, not a feature. The failure mode is not dramatic. It is subtle. An agent misreads “refund the last order” as applying to the wrong customer record. A tool parameter is slightly off because the user’s phrasing was ambiguous. A retry loop fires twice because a timeout was mishandled upstream. Each of these is recoverable in isolation. At scale, they compound into a support backlog and, in the financial case, chargeback exposure.

The architecture answer is not to restrict what agents can do. It is to separate the decision phase from the execution phase. Let the model determine what action to take. Then route that decision through a human review step before anything irreversible happens. The agent’s capability is unchanged. The execution path now has a gate.

[Production Pitfall] For teams running agentic workflows against payment processors, the number to measure before going live is the gap between the model’s stated confidence and its actual accuracy. An LLM that returns process_refund with high apparent confidence is still wrong at a measurable rate under real-world input. Measure that rate on a labelled sample before removing the gate.

Three human-in-the-loop patterns

Choose based on the risk profile of the operation your agent is automating.

Pattern | Risk level | Human involvement | Appropriate use cases
Confirmation gate | High | Pre-execution | Financial transactions, deletions, external communications
Audit trail with rollback | Medium | Post-execution | Bulk classification, data enrichment, lower-stakes updates
Confidence threshold routing | Variable | Conditional | Support ticket routing, content tagging, classification pipelines

The confirmation gate has the widest applicability and the clearest correctness guarantee: nothing executes until a human says it should. The confidence threshold pattern appears later as a lighter alternative for workflows where full pre-execution review adds too much latency.

Building the human-in-the-loop confirmation gate

[Figure: Confirmation gate, the Laravel AI agent human-in-the-loop approval flow. The flowchart shows how a Prism PHP agent intercepts a tool call, creates a PendingApproval record, and notifies an operator, then branches on the operator’s decision: approved actions run via a queued job, while rejected actions return a signal to the agent.]

The flow has four stages: the agent proposes an action, the application creates a pending approval record, an operator approves or rejects it, and on approval a queued job executes the actual operation. The agent’s holding response, not the tool outcome, is what the user sees immediately. Closing the loop with the user post-execution is a separate concern covered in the job section below.

The PendingApproval migration

Model all state transitions in the enum up front. Adding expired now avoids a schema migration the moment the expiry command is introduced.

Schema::create('pending_approvals', function (Blueprint $table) {
    $table->id();
    $table->foreignId('user_id')->constrained()->cascadeOnDelete();
    $table->string('tool_name');
    $table->json('tool_parameters');
    // text, not string: model-generated reasoning can exceed 255 characters.
    $table->text('agent_reasoning')->nullable();
    $table->enum('status', ['pending', 'approved', 'rejected', 'expired'])
          ->default('pending');
    $table->timestamp('expires_at');
    $table->timestamp('resolved_at')->nullable();
    $table->timestamps();

    $table->index(['status', 'expires_at']);
});

The composite index on status and expires_at is not optional. The expiry command queries both columns on every scheduled run. Without it, that query becomes a full table scan as the approvals backlog grows.
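The status values in the migration can also be mirrored in application code so that the allowed transitions live in one place. The following is a minimal sketch in plain PHP 8.1+, assuming a hypothetical ApprovalStatus enum that is not part of the code above; it encodes only the rule that a pending record may be resolved once.

```php
<?php

declare(strict_types=1);

// Hypothetical enum mirroring the status column of pending_approvals.
enum ApprovalStatus: string
{
    case Pending  = 'pending';
    case Approved = 'approved';
    case Rejected = 'rejected';
    case Expired  = 'expired';

    /** Only a pending record may move to another state. */
    public function canTransitionTo(self $next): bool
    {
        return $this === self::Pending && $next !== self::Pending;
    }

    /** Approved, rejected, and expired are terminal states. */
    public function isResolved(): bool
    {
        return $this !== self::Pending;
    }
}
```

If you cast the status column to an enum like this in the Eloquent model, the string comparisons in the controller and job shown in this article would need to compare against enum cases instead.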

Wiring the approval gate into Prism PHP

The tool closure does not execute the action. It creates the pending approval record and returns a holding response to the agent. Any operator notification logic (Slack webhook, email, Reverb broadcast) belongs inside the ApprovalRequested listener, not the closure itself. Keep the closure thin.

// Older Prism releases exposed this facade as EchoLabs\Prism\Facades\Tool.
use Prism\Prism\Facades\Tool;
use App\Events\ApprovalRequested;
use App\Models\PendingApproval;

// $userId is resolved from the authenticated session before the agent call.
$refundTool = Tool::as('process_refund')
    ->for(
        'Propose a customer order refund. '
        . 'All refunds require human approval before processing.'
    )
    ->withStringParameter('email', 'Customer email address.')
    ->withStringParameter('order_id', 'The order UUID to refund.')
    ->withStringParameter('reason', 'Reason for the refund.')
    ->using(function (string $email, string $order_id, string $reason) use ($userId): string {
        $approval = PendingApproval::create([
            'user_id'         => $userId,
            'tool_name'       => 'process_refund',
            'tool_parameters' => [
                'email'    => $email,
                'order_id' => $order_id,
                'reason'   => $reason,
            ],
            'agent_reasoning' => $reason,
            'expires_at'      => now()->addHours(24),
        ]);

        event(new ApprovalRequested($approval));

        return "Refund proposal submitted for review (ID: {$approval->id}). "
             . "An operator will confirm before any funds are moved.";
    });

[Architect’s Note] The tool description is a prompt. Write it precisely. “Propose a refund” communicates intent without overpromising execution. If the description says “Process a refund,” the model may treat the holding response as confirmation that the refund completed, which produces incorrect downstream reasoning.

Pair this gate with schema validation on tool parameters before the record is written. An approval gate is not useful if the order_id stored in the pending record is already a hallucinated value. Validate inputs at the closure boundary first.
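As a rough illustration of validating at the closure boundary, the sketch below checks the shape of the three refund parameters in plain PHP. The function name validateRefundProposal and the 500-byte limit are assumptions made for this example, not part of Prism or Laravel.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns a list of validation errors (empty when valid).
 *
 * @return list<string>
 */
function validateRefundProposal(string $email, string $orderId, string $reason): array
{
    $errors = [];

    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        $errors[] = 'email is not a valid address';
    }

    // Shape check only: a well-formed UUID can still be a hallucinated one.
    $uuidPattern = '/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i';
    if (preg_match($uuidPattern, $orderId) !== 1) {
        $errors[] = 'order_id is not a well-formed UUID';
    }

    $length = strlen(trim($reason));
    if ($length === 0 || $length > 500) {
        $errors[] = 'reason must be non-empty and at most 500 bytes';
    }

    return $errors;
}
```

Inside the tool closure, a non-empty result would be returned to the model as a plain error string before any PendingApproval record is written. A shape check does not prove the order exists, so a lookup confirming that the order belongs to the customer is still worth adding in a real application.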

You can also apply token tracking middleware at the outer agent call to log the inference cost of proposals that operators subsequently reject. A high rejection rate on a specific tool is a signal that the tool description is ambiguous and needs tightening, not that the gate should be removed.

The ApprovalController

The controller handles the approve and reject actions. Status is set atomically before the job is dispatched, so the job can guard against double-execution by checking whether the record is still in the expected state.

namespace App\Http\Controllers;

use App\Jobs\ExecuteApprovedToolJob;
use App\Models\PendingApproval;
use Illuminate\Http\JsonResponse;
use Illuminate\Support\Facades\DB;

class ApprovalController extends Controller
{
    public function approve(PendingApproval $approval): JsonResponse
    {
        DB::transaction(function () use ($approval): void {
            $locked = PendingApproval::lockForUpdate()->findOrFail($approval->id);

            abort_if($locked->status !== 'pending', 409, 'Approval already resolved.');
            abort_if($locked->expires_at->isPast(), 410, 'Approval has expired.');

            $locked->update([
                'status'      => 'approved',
                'resolved_at' => now(),
            ]);

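            // Dispatch only after the transaction commits, so a fast worker
            // cannot read the record while it still appears as 'pending'.
            ExecuteApprovedToolJob::dispatch($locked)->afterCommit();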
        });

        return response()->json(['message' => 'Approved. Queued for execution.']);
    }

    public function reject(PendingApproval $approval): JsonResponse
    {
        DB::transaction(function () use ($approval): void {
            $locked = PendingApproval::lockForUpdate()->findOrFail($approval->id);

            abort_if($locked->status !== 'pending', 409, 'Approval already resolved.');

            $locked->update([
                'status'      => 'rejected',
                'resolved_at' => now(),
            ]);
        });

        return response()->json(['message' => 'Rejected.']);
    }
}

[Edge Case Alert] The pessimistic lock via lockForUpdate() inside the transaction prevents two operators from approving the same record simultaneously and dispatching the job twice. Without it, concurrent requests pass the status !== 'pending' check at the same instant and both proceed. This is not a theoretical edge case on any approval UI that shows a list of pending items to multiple operators.

The ExecuteApprovedToolJob

Each supported tool_name gets its own branch in the match expression. Unknown names throw rather than silently no-op. The status guard at the top of handle() protects against the (unlikely but possible) case where the job is retried after the record has already been marked resolved by a separate path.

namespace App\Jobs;

use App\Models\PendingApproval;
use App\Services\RefundService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ExecuteApprovedToolJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries   = 3;
    public int $backoff = 30;

    public function __construct(
        public readonly PendingApproval $approval
    ) {}

    public function handle(RefundService $refundService): void
    {
        if ($this->approval->fresh()?->status !== 'approved') {
            return;
        }

        try {
            match ($this->approval->tool_name) {
                'process_refund' => $refundService->process(
                    email:   $this->approval->tool_parameters['email'],
                    orderId: $this->approval->tool_parameters['order_id'],
                    reason:  $this->approval->tool_parameters['reason'],
                ),
                default => throw new \UnexpectedValueException(
                    "No executor registered for tool: {$this->approval->tool_name}"
                ),
            };
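        } catch (\UnexpectedValueException $e) {
            // An unknown tool name cannot succeed on retry: fail permanently.
            report($e);
            $this->fail($e);
        } catch (\Throwable $e) {
            // Rethrow so the queue retries according to $tries and $backoff.
            // RefundService::process() must be idempotent for this to be safe.
            throw $e;
        }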
    }
}

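Note the use of $this->approval->fresh() rather than the in-memory model. SerializesModels reloads the record when the worker picks up the job, but another path, such as the expiry command, can still change the row before the guard runs. Fetching a fresh copy from the database ensures the status check reflects the current state.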

[Production Pitfall] Register this job on a dedicated approvals queue, not default. If your default queue is saturated with inference jobs, an operator-approved action should not wait behind a backlog of unrelated work. Configure the queue priority as described in Laravel Horizon for AI queue workloads, for example by giving the approvals queue its own supervisor with dedicated worker processes.
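One way to isolate the approvals queue is a dedicated Horizon supervisor. The fragment below is a sketch of config/horizon.php under that assumption; the supervisor names and process counts are illustrative and should be sized for your own workload.

```php
<?php

// config/horizon.php (fragment). Supervisor names and counts are illustrative.
return [
    'environments' => [
        'production' => [
            // Dedicated workers: approved actions never queue behind inference jobs.
            'supervisor-approvals' => [
                'connection'   => 'redis',
                'queue'        => ['approvals'],
                'balance'      => 'simple',
                'maxProcesses' => 2,
                'tries'        => 3,
            ],
            'supervisor-default' => [
                'connection'   => 'redis',
                'queue'        => ['default'],
                'balance'      => 'auto',
                'maxProcesses' => 10,
                'tries'        => 1,
            ],
        ],
    ],
];
```

The job then needs to be sent to that queue, for instance by chaining ->onQueue('approvals') on the dispatch call in the controller.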

[Architect’s Note] The job executes the tool and stops. The agent’s conversation turn ended when it returned the holding response. If your product requires the agent to close the loop with the user after approval (“Your refund has been processed”), the job should dispatch a follow-up event or write to a record the UI subscribes to via a Reverb broadcast or polling endpoint. This is a product decision, not an architecture flaw. Design for it explicitly rather than discovering it after the gate is built.

Expiry handling

Approvals without a TTL are a support liability. An operator who never sees a notification leaves a pending record open indefinitely. The scheduled command handles cleanup. Since Laravel 11, schedules are registered in routes/console.php, not a Kernel class.

// routes/console.php
use Illuminate\Support\Facades\Schedule;

Schedule::command('approvals:expire')->hourly();

// app/Console/Commands/ExpirePendingApprovalsCommand.php
namespace App\Console\Commands;

use App\Models\PendingApproval;
use Illuminate\Console\Command;

class ExpirePendingApprovalsCommand extends Command
{
    protected $signature   = 'approvals:expire';
    protected $description = 'Expire pending approvals that have passed their TTL.';

    public function handle(): void
    {
        $count = PendingApproval::where('status', 'pending')
            ->where('expires_at', '<', now())
            ->update([
                'status'      => 'expired',
                'resolved_at' => now(),
            ]);

        $this->info("Expired {$count} pending approval(s).");
    }
}

The bulk update() avoids the N+1 write overhead of hydrating Eloquent models one by one. Because a bulk update fires no Eloquent model events, dispatch an ApprovalExpired event explicitly (collecting the affected IDs before the update) if the original user needs to be notified that their requested action did not complete within the window. See the Laravel scheduling documentation for schedule frequency options and overlap prevention.

The confidence threshold pattern

Not every agent action warrants a full pre-execution review. For classification and routing workflows, a confidence score from the model can route high-confidence decisions automatically and send low-confidence decisions for human review. The PendingApproval table handles both patterns without schema changes.

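use App\Models\PendingApproval;
use App\Models\SupportTicket; // assumed model location

// Assumes the same Tool facade import and $userId as the refund example.
$classifyTool = Tool::as('classify_support_ticket')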
    ->for(
        'Classify a support ticket by category. '
        . 'High-confidence classifications are applied immediately. '
        . 'Low-confidence classifications are queued for human review.'
    )
    ->withStringParameter('ticket_id', 'The ticket UUID.')
    ->withStringParameter('category', 'Inferred category.')
    ->withNumberParameter('confidence', 'Confidence score from 0.0 to 1.0.')
    ->using(function (string $ticket_id, string $category, float $confidence) use ($userId): string {
        if ($confidence >= 0.85) {
            $ticket = SupportTicket::find($ticket_id);

            if (! $ticket) {
                return "Ticket {$ticket_id} not found.";
            }

            $ticket->update(['category' => $category]);

            return "Classified as {$category} (confidence: {$confidence}).";
        }

        $approval = PendingApproval::create([
            'user_id'         => $userId,
            'tool_name'       => 'classify_support_ticket',
            'tool_parameters' => compact('ticket_id', 'category', 'confidence'),
            'agent_reasoning' => "Low confidence classification: {$category} ({$confidence})",
            'expires_at'      => now()->addHours(48),
        ]);

        // Notify operators, as the refund tool does.
        event(new \App\Events\ApprovalRequested($approval));

        return "Classification queued for review (confidence {$confidence} is below threshold).";
    });

[Word to the Wise] The 0.85 threshold is a starting point. Calibrate it against a labelled sample of your actual input distribution before treating it as fixed. A threshold too high sends everything to review and defeats the automation. A threshold too low lets misclassifications through. Measure your error rate at several thresholds on real data and set the value deliberately.
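To make the calibration step concrete, here is a small plain-PHP sketch that compares candidate thresholds against a labelled sample. The function name evaluateThresholds and the sample rows are invented for illustration; in practice the sample would come from real tickets with human-verified categories.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical calibration helper.
 *
 * @param list<array{confidence: float, correct: bool}> $sample     Labelled predictions.
 * @param list<float>                                   $thresholds Candidate thresholds.
 * @return array<string, array{auto_rate: float, auto_error_rate: float}>
 */
function evaluateThresholds(array $sample, array $thresholds): array
{
    // Safe division that always yields a float.
    $ratio = fn (int $part, int $whole): float => $whole > 0 ? $part / $whole : 0.0;

    $report = [];
    foreach ($thresholds as $threshold) {
        // Rows at or above the threshold would be applied without review.
        $auto  = array_filter($sample, fn (array $row): bool => $row['confidence'] >= $threshold);
        $wrong = array_filter($auto, fn (array $row): bool => ! $row['correct']);

        $report[sprintf('%.2f', $threshold)] = [
            'auto_rate'       => $ratio(count($auto), count($sample)),
            'auto_error_rate' => $ratio(count($wrong), count($auto)),
        ];
    }

    return $report;
}

// Illustrative data only.
$sample = [
    ['confidence' => 0.95, 'correct' => true],
    ['confidence' => 0.90, 'correct' => true],
    ['confidence' => 0.80, 'correct' => false],
    ['confidence' => 0.70, 'correct' => true],
];

$report = evaluateThresholds($sample, [0.75, 0.85]);
```

On this toy sample, lowering the threshold from 0.85 to 0.75 automates more tickets but lets one misclassification through unreviewed. That trade-off between automation rate and unreviewed error rate is what the threshold should be chosen on.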

The Laravel AI integration architecture guide is relevant here: confidence score format and range vary by provider and model. Wrap provider-specific confidence extraction behind a shared interface so the threshold logic does not need to change when you switch from one model to another.
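A shared interface for confidence extraction might look like the sketch below. Every name here (ConfidenceExtractor, the two adapters, and the payload keys) is hypothetical and does not correspond to any real provider SDK; the point is only that the threshold check sees a normalised 0.0 to 1.0 value regardless of the source.

```php
<?php

declare(strict_types=1);

// Hypothetical contract: implementations normalise to the range 0.0 to 1.0.
interface ConfidenceExtractor
{
    public function extract(array $providerPayload): float;
}

// Adapter for a provider assumed to report confidence already on a 0-1 scale.
final class UnitScaleExtractor implements ConfidenceExtractor
{
    public function extract(array $providerPayload): float
    {
        $score = (float) ($providerPayload['confidence'] ?? 0.0);

        return max(0.0, min(1.0, $score));
    }
}

// Adapter for a provider assumed to report a 0-100 percentage.
final class PercentScaleExtractor implements ConfidenceExtractor
{
    public function extract(array $providerPayload): float
    {
        $percent = (float) ($providerPayload['score_percent'] ?? 0.0);

        return max(0.0, min(1.0, $percent / 100));
    }
}

// Threshold logic depends only on the interface, not on the provider.
function requiresReview(ConfidenceExtractor $extractor, array $payload, float $threshold = 0.85): bool
{
    return $extractor->extract($payload) < $threshold;
}
```

A missing score is treated as 0.0, so a malformed provider response falls through to human review rather than being applied automatically.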

Using the audit trail for compliance reporting

The pending_approvals table is an audit log by design. Every proposed action is on record: what the agent wanted to do, the parameters it supplied, the operator’s decision, and the timestamps. No separate audit infrastructure is needed.

Query patterns for compliance and operational reporting:

// Approvals pending more than 4 hours — potential notification delivery failure
PendingApproval::where('status', 'pending')
    ->where('created_at', '<', now()->subHours(4))
    ->with('user')
    ->get();

// Rejection rate by tool — identifies imprecise tool descriptions
PendingApproval::selectRaw(
    "tool_name, COUNT(*) as total, SUM(CASE WHEN status = 'rejected' THEN 1 ELSE 0 END) as rejections"
)
->whereIn('status', ['approved', 'rejected'])
->groupBy('tool_name')
->get();

// Average time to resolution per tool: identifies operator workflow bottlenecks.
// TIMESTAMPDIFF is MySQL/MariaDB syntax; adapt the expression for other databases.
PendingApproval::selectRaw(
    'tool_name, AVG(TIMESTAMPDIFF(SECOND, created_at, resolved_at)) as avg_seconds'
)
->whereNotNull('resolved_at')
->groupBy('tool_name')
->get();

A high rejection rate on a specific tool is the signal that its description is imprecise and the model is misapplying it. Fix the description before turning off the gate. The audit data tells you when you can trust the agent enough to raise the confidence threshold or reduce the review scope, and that is the data-driven path to increasing automation over time.

[Efficiency Gain] If approval volume grows to the point where it affects query performance on the live table, move resolved records (approved, rejected, expired) to an archive table on a nightly schedule. The hot table stays small. Operational queries remain fast. See the Laravel queuing documentation for dispatching the archival job reliably.

What comes next

The implementation here gives you a working confirmation gate, a pessimistic-lock controller, a retryable approval job, an expiry command, and a confidence threshold variant that reuses the same schema. The natural extensions are connecting the ApprovalRequested event to a real notification channel, building the operator UI that renders pending approvals with Approve and Reject controls, and extending the ExecuteApprovedToolJob match expression as new tools are added to the agent.

Build the gate first. Reach for observability on top of it once it is running in production.


Frequently Asked Questions

How do I handle multiple tool types in a single approval system?

Extend the match expression in ExecuteApprovedToolJob with a branch for each tool name. Inject the relevant service via the handle() method signature for each. The tool_name column is the dispatch key. If the number of tools grows large, extract the match logic into a dedicated ToolExecutorRegistry class that maps tool names to executor callables, and resolve it from the Service Container.
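A registry of that kind could be sketched as follows. This is a plain-PHP illustration that assumes each executor is a callable receiving the stored tool_parameters array; in a Laravel application the registry would typically be bound as a singleton and populated in a service provider.

```php
<?php

declare(strict_types=1);

// Hypothetical registry mapping tool names to executor callables.
final class ToolExecutorRegistry
{
    /** @var array<string, callable> */
    private array $executors = [];

    public function register(string $toolName, callable $executor): void
    {
        $this->executors[$toolName] = $executor;
    }

    public function execute(string $toolName, array $parameters): mixed
    {
        // Unknown names throw rather than silently doing nothing.
        if (! isset($this->executors[$toolName])) {
            throw new UnexpectedValueException("No executor registered for tool: {$toolName}");
        }

        return ($this->executors[$toolName])($parameters);
    }
}

// Illustrative executor; a real one would call RefundService.
$registry = new ToolExecutorRegistry();
$registry->register(
    'process_refund',
    fn (array $params): string => "refunded {$params['order_id']}"
);
```

The job's handle() method would then shrink to the status guard plus a single call to execute() with the record's tool_name and tool_parameters.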

What happens if the agent needs the tool result to complete its response?

The agent’s turn ends when it returns the holding response. It does not wait for the job. If your product requires the agent to surface the final outcome to the user — “Your refund has been processed” — the ExecuteApprovedToolJob must trigger a downstream action after execution: a database write the frontend polls, a Reverb broadcast the UI subscribes to, or a follow-up notification. Design this requirement explicitly before building the operator UI.

How should I notify operators about pending approvals?

Dispatch the channel-specific notification from inside the ApprovalRequested event listener. Keep the tool closure agnostic to the notification channel. Common choices are a Slack webhook (fast, visible), a Filament admin dashboard that polls or broadcasts pending counts, or a Reverb broadcast to a dedicated operator view. The channel is a deployment decision and should not couple to the gate logic itself.

Can I set different TTLs for different tool types?

Yes. The expires_at value is set per record inside the tool closure. Pass the TTL as a configuration value keyed by tool name rather than hardcoding it. A destructive operation might warrant a 2-hour window. A classification review might be comfortable with 48 hours. Externalise the TTL map to config/approvals.php so it can be adjusted without touching the closure code.
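A TTL map of this kind might be shaped as below. The array keys, the default value, and the ttlHoursFor helper are assumptions for illustration; the per-tool hours simply reuse the figures mentioned in the answer above.

```php
<?php

declare(strict_types=1);

// Hypothetical contents of config/approvals.php (the file would return this array).
$approvalsConfig = [
    'default_ttl_hours' => 24,
    'ttl_hours'         => [
        'process_refund'          => 2,
        'classify_support_ticket' => 48,
    ],
];

// Looks up the TTL for a tool, falling back to the default for unlisted tools.
function ttlHoursFor(array $config, string $toolName): int
{
    return $config['ttl_hours'][$toolName] ?? $config['default_ttl_hours'];
}
```

Inside a tool closure, the same lookup could be expressed with Laravel's config() helper and its default argument, with the result passed to now()->addHours().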

How do I prevent the approvals table from growing indefinitely?

The approvals:expire command resolves stale pending records. For completed records (approved, rejected, expired), add a scheduled archival command that moves rows older than your retention window to a pending_approvals_archive table. Alternatively, soft-delete resolved records and run a pruning command using Laravel’s built-in MassPrunable trait on the model.

Dewald Hugo

A software architect with 15+ years of experience in the PHP and Laravel ecosystem. Dewald created Origin Main to provide the engineering rigour required to integrate AI into professional, high-concurrency production systems. He writes for developers who care less about "getting it to work" and more about "getting it to last".
