Laravel AI Governance Layer: Beyond a Config File

🕒 6 Minute Read 📅 Date Published: July 4, 2026

You got AI working in your Laravel app. The prompt runs, the response streams back, the demo looked great in standup. None of that tells you what happens in month three, when someone edits a prompt directly in a hosted dashboard and nobody notices until the output quality drops. A Laravel AI governance layer exists to answer the question that shows up after the demo: does this integration have a floor, or is it held together by the fact that nothing has gone wrong yet? This sits underneath the architectural thinking laid out in our production-grade AI architecture guide, and it’s part of the broader AI architecture module we maintain here.

Three things tend to go wrong once an AI feature reaches real users, and none of them look like bugs at first.

Prompt Drift Is a Deployment Problem, Not a Config Problem

Most Laravel teams start prompt iteration the same way: a string in a controller, then a config file, then eventually a database row someone edits through an admin panel because that felt more flexible. The problem isn’t the flexibility. It’s that none of those layers give you the thing every other part of your stack already has: a diff, a rollback, and an environment boundary.

Treat a prompt the way you treat a schema. A prompt is a versioned artifact that ships with your code, not a runtime value someone can quietly change in production. AI Governor’s approach is to generate prompt definitions as PHP files under version control:

php artisan make:prompt SummarizeArticle

That command produces a timestamped class in app/Prompts with the system message, the user template, the model, and the sampling parameters all declared together, not scattered across a config array and a database seed. A sync command then writes the definition to the database deterministically, scoped to the current environment, so staging and production never accidentally share a prompt version.

php artisan ai:prompts:sync --dry-run
php artisan ai:prompts:sync

[Architect’s Note] The dry-run flag matters more than it looks. Running sync blind in a deploy script means the first time you see a diff is after it’s live. Preview it, then commit to it: the same discipline you’d apply to a migration.

Rollback follows the same logic as any other versioned artifact: revert the file in Git, redeploy, run sync again. No manual database edits, no “who changed this and when” archaeology three weeks later. This is the same argument we make in more depth in our piece on treating prompts as version-controlled migrations, and it’s worth reading if prompt drift is the part of this article that stung.
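To make "deterministic sync with a dry run" concrete, here is a minimal sketch in plain PHP. It is not AI Governor's implementation: promptFingerprint(), syncPrompts(), and the array shapes are hypothetical names chosen for illustration, and the "database" is just an array of stored fingerprints for a single environment.

```php
<?php

// Hypothetical sketch: none of these names come from AI Governor itself.

/** Stable fingerprint of a flat prompt definition; key order does not affect it. */
function promptFingerprint(array $definition): string
{
    ksort($definition);
    return hash('sha256', json_encode($definition, JSON_THROW_ON_ERROR));
}

/**
 * Compare definitions on disk with the fingerprints stored for one environment.
 *
 * @param array<string, array>  $definitions prompt name => definition
 * @param array<string, string> $stored      prompt name => fingerprint
 * @return array{plan: array<string, string>, stored: array<string, string>}
 */
function syncPrompts(array $definitions, array $stored, bool $dryRun): array
{
    $plan = [];
    foreach ($definitions as $name => $definition) {
        $fingerprint = promptFingerprint($definition);
        if (!isset($stored[$name])) {
            $plan[$name] = 'create';
        } elseif ($stored[$name] !== $fingerprint) {
            $plan[$name] = 'update';
        } else {
            $plan[$name] = 'unchanged';
        }
        if (!$dryRun) {
            $stored[$name] = $fingerprint; // only a real run writes anything
        }
    }
    return ['plan' => $plan, 'stored' => $stored];
}

$definitions = [
    'summarize' => ['model' => 'example-model', 'system' => 'Summarize the article.', 'temperature' => 0.2],
];

$preview = syncPrompts($definitions, [], dryRun: true);   // reports, writes nothing
$applied = syncPrompts($definitions, [], dryRun: false);  // reports and writes
$rerun   = syncPrompts($definitions, $applied['stored'], dryRun: false);
```

The property worth noticing is that a second run against unchanged files reports "unchanged" for every prompt. That idempotence is what makes "revert in Git, redeploy, sync again" a safe rollback path.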

Token Spend Without Pre-Dispatch Enforcement Is a Liability, Not a Cost

Here’s the pattern that catches teams off guard: token usage gets logged after the API call returns, dashboards get built on top of that log, and everyone feels informed. Then a runaway loop, a compromised API key, or a single user hammering an expensive endpoint blows through a month’s budget in an afternoon, and the logging did nothing to stop it. Observability after the fact is accounting. It is not a safeguard.

The distinction that matters is pre-dispatch enforcement: checking the budget before the provider is called, not after. Any Eloquent model (a User, a Team, a Tenant) can carry a token budget:

use AiGovernor\Traits\HasAiBudget;
use Illuminate\Foundation\Auth\User as Authenticatable;

class User extends Authenticatable
{
    use HasAiBudget;
}

// Hard limit: throws before the provider is called
$user->setAiBudget(limit: 100_000, period: 'monthly');

// Soft limit: logs a warning, does not block
$user->setAiBudget(limit: 10_000, period: 'daily', hard: false);

Passing the owning model into the executor means the check happens as a precondition, not a side effect:

try {
    $result = app(GovernedExecutor::class)->run(
        prompt:    $prompt,
        variables: ['text' => $article->body],
        owner:     $user,
        scope:     'summarize',
    );
} catch (BudgetExceededException $e) {
    // No tokens consumed, no cost incurred.
    // Handle as you would any other authorization failure.
    report($e);
    return response()->json(['message' => 'AI budget exhausted.'], 429);
}

That’s the whole difference. A BudgetExceededException fires before the HTTP call to OpenAI or Anthropic ever leaves your infrastructure. A middleware layer that only tracks usage after the fact is a legitimate and complementary pattern, and we cover it separately in our guide to AI middleware for token tracking and rate limiting, but tracking and enforcement solve different problems.
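If the ordering is easier to see in isolation, the following plain-PHP sketch shows the shape of a pre-dispatch check. Everything in it is an assumption for illustration: SketchBudget, SketchBudgetExceeded, and governedCall() are hypothetical names, and checking against an estimated token count is one possible policy, not necessarily the one the package uses.

```php
<?php

// Hypothetical names throughout; this is not AI Governor's implementation.

final class SketchBudgetExceeded extends RuntimeException {}

final class SketchBudget
{
    private int $used = 0;

    public function __construct(private int $limit) {}

    /** Throws if the estimated cost would cross the limit. */
    public function assertCanSpend(int $estimatedTokens): void
    {
        if ($this->used + $estimatedTokens > $this->limit) {
            throw new SketchBudgetExceeded('Budget would be exceeded.');
        }
    }

    public function record(int $actualTokens): void
    {
        $this->used += $actualTokens;
    }
}

/** Check first, call the provider second, record usage third. */
function governedCall(SketchBudget $budget, int $estimatedTokens, callable $provider): string
{
    $budget->assertCanSpend($estimatedTokens);   // precondition: may throw
    [$content, $tokensUsed] = $provider();       // only reached if the check passed
    $budget->record($tokensUsed);
    return $content;
}

// A fake provider that counts how often it is actually invoked.
$providerCalls = 0;
$provider = function () use (&$providerCalls): array {
    $providerCalls++;
    return ['summary text', 60];
};

$budget = new SketchBudget(limit: 100);
$first  = governedCall($budget, 60, $provider);  // allowed: 0 + 60 <= 100

$blocked = false;
try {
    governedCall($budget, 60, $provider);        // refused: 60 + 60 > 100
} catch (SketchBudgetExceeded $e) {
    $blocked = true;
}
```

The second call is refused and the fake provider's call counter stays at one. After-the-fact logging cannot give you that guarantee.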

[Production Pitfall] Watch the boundary if you’re wrapping Prism PHP’s agentic loops rather than a single call. A tool-calling loop can make several provider round trips per user turn, and a budget check that only fires once per run() invocation misses every intermediate call inside that loop.
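One way to close that gap is to guard every round trip rather than the outer invocation. The sketch below is plain PHP with hypothetical names (LoopBudget, runAgentLoop()); it does not use Prism's API, and the round trip is a stand-in callable that returns the tokens it used and whether the loop is finished.

```php
<?php

// Hypothetical sketch; the loop and budget classes are illustrative only.

final class LoopBudgetExceeded extends RuntimeException {}

final class LoopBudget
{
    public int $used = 0;

    public function __construct(public readonly int $limit) {}

    public function guard(): void
    {
        if ($this->used >= $this->limit) {
            throw new LoopBudgetExceeded("Used {$this->used} of {$this->limit} tokens.");
        }
    }
}

/**
 * Runs up to $maxSteps provider round trips, guarding each one.
 * $roundTrip returns [tokensUsed, isFinished].
 */
function runAgentLoop(LoopBudget $budget, callable $roundTrip, int $maxSteps): int
{
    $steps = 0;
    while ($steps < $maxSteps) {
        $budget->guard();                        // checked before every round trip
        [$tokens, $finished] = $roundTrip($steps);
        $budget->used += $tokens;
        $steps++;
        if ($finished) {
            break;
        }
    }
    return $steps;
}

$budget = new LoopBudget(limit: 100);
$neverFinishes = fn (int $step): array => [40, false]; // simulates a runaway loop

$stoppedEarly = false;
try {
    runAgentLoop($budget, $neverFinishes, maxSteps: 10);
} catch (LoopBudgetExceeded $e) {
    $stoppedEarly = true;
}
```

With a limit of 100 and 40 tokens per round trip, the loop is cut off before the fourth call instead of running all ten. This simple guard can still overshoot by one call (usage ends at 120 here); estimating cost before each call tightens that at the price of more bookkeeping.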

Provider Switching Without Abstraction Is Technical Debt From Day One

If your controller calls the OpenAI SDK directly, switching providers means touching every file that made that call. That’s not a hypothetical; it’s the default outcome of skipping an abstraction layer on day one, and it’s the kind of debt that’s invisible until a pricing change or an outage forces the migration under pressure.

The fix is a single contract that provider calls go through, so switching is a config change:

// config/ai-governor.php
'adapter' => \AiGovernor\Adapters\AnthropicAdapter::class,

class MyCustomAdapter implements AiProviderAdapter
{
    public function execute(PromptVersion $prompt, string $rendered): AdapterResult
    {
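        // Call your provider, return a normalised result.
        // $responseBody, $usage, and $latencyMs below stand in for values
        // taken from your provider's response.
        // Throw ProviderException for semantic failures.
        // Throw Illuminate\Http\Client\RequestException for network failures.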
        return new AdapterResult(
            content:          $responseBody,
            promptTokens:     $usage['input'],
            completionTokens: $usage['output'],
            latencyMs:        $latencyMs,
            model:            $prompt->model,
        );
    }
}

[Word to the Wise] This only pays off if you actually keep application code calling the executor, not the adapter directly. The first time someone reaches past the abstraction “just for this one endpoint,” the abstraction stops protecting you.
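A stripped-down illustration of why the executor boundary matters follows. SketchAdapter, the two fake adapters, SketchExecutor, and makeExecutor() are hypothetical stand-ins written for this example; they mirror the shape of the AiProviderAdapter contract above but are not the package's classes, and no real provider is called.

```php
<?php

// Hypothetical stand-ins; the real contract is AiProviderAdapter shown above.

interface SketchAdapter
{
    public function execute(string $rendered): string;
}

final class FakeOpenAiAdapter implements SketchAdapter
{
    public function execute(string $rendered): string
    {
        return 'openai:' . $rendered;
    }
}

final class FakeAnthropicAdapter implements SketchAdapter
{
    public function execute(string $rendered): string
    {
        return 'anthropic:' . $rendered;
    }
}

/** Application code talks to this class only, never to an adapter. */
final class SketchExecutor
{
    public function __construct(private SketchAdapter $adapter) {}

    public function run(string $template, array $variables): string
    {
        $rendered = strtr($template, $variables); // naive placeholder substitution
        return $this->adapter->execute($rendered);
    }
}

/** The adapter class name comes from config, as in the config snippet above. */
function makeExecutor(array $config): SketchExecutor
{
    $adapterClass = $config['adapter'];
    return new SketchExecutor(new $adapterClass());
}

$template  = 'Summarize: {text}';
$variables = ['{text}' => 'hello'];

$viaOpenAi    = makeExecutor(['adapter' => FakeOpenAiAdapter::class])->run($template, $variables);
$viaAnthropic = makeExecutor(['adapter' => FakeAnthropicAdapter::class])->run($template, $variables);
```

The calling code is identical in both cases; only the config value differs. A call site that instantiates an adapter directly loses exactly that property.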

Where This Sits Next to Prism PHP and laravel/ai

This is worth addressing directly, because it’s a fair question if you land on the package README without this context: is a governance layer competing with Prism PHP?

No, and the distinction is scope, not preference. Laravel 13’s first-party laravel/ai SDK and Prism PHP’s agentic extensions both answer “how do I call a model, chain tool use, or stream a response.” A governance layer answers a different question: “given that a call is about to happen, should it, and under what version of the prompt?” The adapter contract shown above doesn’t care whether the underlying call came from a raw SDK client or a Prism tool-calling loop; it just needs something on the other side implementing execute(). That means a governance layer sits in front of either, as a policy checkpoint, not as a competing execution engine.

The same separation of concerns applies whether you’re comparing OpenAI, Gemini, and Claude integration patterns broadly, which we cover in our production-ready architecture guide across all three providers, or building the provider-agnostic service layer this governance model assumes, detailed in our guide to a provider-agnostic AI service layer.

[Edge Case Alert] Anthropic’s Messages API expects an explicit anthropic-version header. AI Governor defaults this to 2023-06-01, which is still the correct default as of this writing, but pin it explicitly in your .env if you’re building a custom adapter. Silent version drift on an HTTP header is exactly the kind of thing a governance layer is supposed to catch, not cause.

A Minimal Governed Prompt, Start to Finish

composer require dewaldhugo/laravel-ai-governor
php artisan vendor:publish --tag=ai-governor-config
php artisan migrate

OPENAI_API_KEY=sk-...

With the package installed, generate and sync a prompt, then run it through the executor with an owner attached for enforcement:

$prompt = PromptVersion::resolve('summarize');

try {
    $result = app(GovernedExecutor::class)->run(
        prompt:    $prompt,
        variables: ['text' => $article->body],
        owner:     $user,
        scope:     'summarize',
    );

    Log::info('AI call completed', [
        'tokens'  => $result->totalTokens(),
        'latency' => $result->latencyMs,
    ]);
} catch (BudgetExceededException $e) {
    return response()->json(['message' => 'Budget exhausted.'], 429);
} catch (ProviderException $e) {
    report($e);
    return response()->json(['message' => 'AI provider error.'], 502);
}

[Efficiency Gain] totalTokens() and latencyMs are the observability half of this article’s title. To be precise about what that means here: this is call-level telemetry (prompt version, tokens, latency), not distributed tracing or a full APM integration. If you need span-level tracing across a multi-step agentic workflow, that’s a separate concern layered on top, not something this package claims to replace.

For route-level protection without touching controller logic, register the alias in bootstrap/app.php, consistent with Laravel 13’s middleware configuration:

// bootstrap/app.php
->withMiddleware(function (Middleware $middleware) {
    $middleware->alias([
        'ai.budget' => \AiGovernor\Http\Middleware\EnforceAiBudget::class,
    ]);
})

// In your routes file
Route::post('/summarize', SummarizeController::class)
     ->middleware(['auth:sanctum', 'ai.budget:summarize']);

Order matters here. The budget middleware passes unauthenticated requests through without a check, so it needs to run after an auth middleware in the stack, not before.

What This Actually Buys You

None of this replaces good judgment about which model to call or how to structure a prompt. What it changes is what happens when something goes wrong: a bad prompt edit is a Git revert instead of a production incident, a runaway loop hits a 429 instead of a bill, and a provider outage is a config change instead of a rewrite. If you’re running governed calls through queued jobs at volume, it’s worth pairing this with the queue reliability patterns in our guide to running Horizon under AI workloads. A budget exception inside a retried job needs the same care as any other domain exception you don’t want silently swallowed by a retry loop.
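The retry concern is easy to demonstrate without a queue at all. The sketch below is plain PHP with hypothetical names (runWithRetries(), JobBudgetExceeded, TransientProviderError). It is not Laravel's queue worker, only the decision a job has to make: retry transient failures, but treat a budget failure as final and visible.

```php
<?php

// Hypothetical sketch of retry classification; not Laravel's queue API.

final class JobBudgetExceeded extends RuntimeException {}
final class TransientProviderError extends RuntimeException {}

/**
 * Retries transient failures up to $maxAttempts; a budget failure is final.
 *
 * @return array{status: string, attempts: int}
 */
function runWithRetries(callable $job, int $maxAttempts): array
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            $job($attempt);
            return ['status' => 'done', 'attempts' => $attempt];
        } catch (JobBudgetExceeded $e) {
            // Retrying cannot help until the budget resets: fail once, visibly.
            return ['status' => 'failed_budget', 'attempts' => $attempt];
        } catch (TransientProviderError $e) {
            // Swallow and fall through to the next attempt.
        }
    }
    return ['status' => 'failed_retries', 'attempts' => $maxAttempts];
}

// Fails twice with a transient error, then succeeds on the third attempt.
$flaky = function (int $attempt): void {
    if ($attempt < 3) {
        throw new TransientProviderError('timeout');
    }
};

// Always over budget, no matter how often it is retried.
$overBudget = function (int $attempt): void {
    throw new JobBudgetExceeded('budget exhausted');
};

$flakyResult  = runWithRetries($flaky, 5);
$budgetResult = runWithRetries($overBudget, 5);
```

The flaky job succeeds on its third attempt, while the over-budget job stops after one. In a real queued job, the equivalent move is to mark the job as failed rather than releasing it back onto the queue.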

Frequently Asked Questions

Does this replace Prism PHP or the laravel/ai SDK?

No. It’s a policy layer that sits in front of whichever execution layer you’re using. Prism and laravel/ai handle the call itself, including tool use and streaming; the governance layer decides whether that call should happen and under which prompt version.

What happens when a hard budget limit is exceeded mid-request?

A BudgetExceededException is thrown before the provider is called. No tokens are consumed and no cost is incurred, so it’s safe to treat this the same way you’d treat any other authorization failure.

Can budgets be scoped to a single feature instead of a whole user?

Yes. The scope parameter on both setAiBudget() and the executor’s run() call lets you track and enforce limits per feature, so a summarize budget and a chat budget on the same user are independent.

Does this support Gemini today?

Not yet. Shipped adapters cover OpenAI and Anthropic. A custom adapter implementing the AiProviderAdapter contract is the path to Gemini support in the meantime.

How do prompt rollbacks actually work in production?

Revert the prompt definition file in Git, redeploy, and run the sync command again. There’s no manual database editing involved, which is the entire point of treating prompts as versioned code rather than runtime config.

Dewald Hugo

A software architect with 15+ years of experience in the PHP and Laravel ecosystem. Dewald created Origin Main to provide the engineering rigour required to integrate AI into professional, high-concurrency production systems. He writes for developers who care less about "getting it to work" and more about "getting it to last".

Laravel AI Observability: Why Your AI Integration Needs a Governance Layer