Most “getting started with AI” guides hand you a .env file and a curl command and call it a day. That works for a proof of concept. It falls apart the moment you hit an async job, a streaming response, a PHP version mismatch between your laptop and your Forge server, or a rate limit you never planned for.
This guide builds a Laravel AI development stack that actually reflects production. What you test locally is what runs when you push. No surprises at 2am.
What You Need to Build a Laravel AI Development Stack
This walkthrough assumes macOS with Laravel Herd installed. If you’re still on Valet or a Docker-based local setup, Herd is worth the switch. It’s faster, zero-config, and gives you per-project PHP version switching — which matters specifically for AI work, as you’ll see in the next section.
You’ll also need:
- PHP 8.3 or 8.4 (latest stable)
- Composer 2.x
- A Laravel 12 project
- API keys from Anthropic and/or OpenAI
- A Laravel Forge account if you’re deploying to a managed VPS
PHP Version Parity: The Silent Killer
Here’s the failure mode nobody puts in their tutorial. You build locally on PHP 8.4, install openai-php/laravel or anthropics/anthropic-sdk-php, and everything runs clean. You deploy to a Forge server provisioned at PHP 8.2. The package installs fine. Then a subtle behavioural difference in named arguments, fibers, or enum handling surfaces under production load — and it’s invisible in your test suite because your test suite ran against 8.4. For the full production environment — Nginx config, Supervisor workers, OPcache, and zero-downtime deployments — the guide to deploying Laravel to production is the companion piece to this one.
Herd eliminates this. Right-click any site in Herd’s UI and switch the PHP version per project. Match it to whatever your Forge server is running. No Nginx config edits. No Docker context switching.
To check your server’s PHP version: Forge dashboard → your server → PHP tab. Set that version locally before you touch composer require.
Creating the Project
```bash
composer create-project laravel/laravel my-ai-app
cd my-ai-app
```
If you’re using Herd, the site is available at http://my-ai-app.test automatically once the project lands inside your Herd sites directory (~/Herd by default).
Installing AI Packages
For Claude (Anthropic):
```bash
composer require anthropics/anthropic-sdk-php
```
For OpenAI:
```bash
composer require openai-php/laravel
php artisan vendor:publish --provider="OpenAI\Laravel\ServiceProvider"
```
[Word to the Wise] Publishing the `openai-php/laravel` config generates `config/openai.php`, which reads from `OPENAI_API_KEY` directly. If you also build a custom `config/ai.php` (recommended below), you now have two config files referencing the same credential. That’s not a problem — until someone rotates the key and only updates one of them. Pick one as the canonical source and have the other reference it via `config('ai.openai.api_key')`, not the env variable directly.
Both packages integrate with Laravel’s Service Container cleanly. You inject clients through the container — you don’t instantiate them by hand.
Environment Configuration
Be explicit in your .env. Vague key names cause confusion on teams and make secret rotation harder than it needs to be.
```env
# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_DEFAULT_MODEL=claude-sonnet-4-6
ANTHROPIC_MAX_TOKENS=8192

# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_DEFAULT_MODEL=gpt-4o
OPENAI_MAX_TOKENS=4096

# AI behaviour
AI_TIMEOUT=30
AI_RETRY_ATTEMPTS=3
```
Centralise all of this in config/ai.php:
```php
<?php

return [
    'anthropic' => [
        'api_key' => env('ANTHROPIC_API_KEY'),
        'model' => env('ANTHROPIC_DEFAULT_MODEL', 'claude-sonnet-4-6'),
        'max_tokens' => (int) env('ANTHROPIC_MAX_TOKENS', 8192),
    ],

    'openai' => [
        'api_key' => env('OPENAI_API_KEY'),
        'model' => env('OPENAI_DEFAULT_MODEL', 'gpt-4o'),
        'max_tokens' => (int) env('OPENAI_MAX_TOKENS', 4096),
    ],

    'timeout' => (int) env('AI_TIMEOUT', 30),
    'retry_attempts' => (int) env('AI_RETRY_ATTEMPTS', 3),
];
```
Reference model strings from config, never hardcode them in service classes. Anthropic ships new model versions regularly. When they do, you change one line in .env and redeploy — you don’t grep through service classes hoping you caught every instance.
Service Architecture for Your Laravel AI Development Stack
Don’t let AI calls bleed into controllers. A dedicated service class per provider decouples your application logic from API specifics and keeps testing tractable.
```php
<?php

namespace App\Services\AI;

use Anthropic\Client;
use Anthropic\Exceptions\ErrorException;
use Illuminate\Support\Facades\Log;
use RuntimeException;

class ClaudeService
{
    public function __construct(
        private readonly Client $client
    ) {}

    public function complete(string $prompt, array $options = []): string
    {
        try {
            $response = $this->client->messages()->create([
                'model' => config('ai.anthropic.model'),
                'max_tokens' => $options['max_tokens'] ?? config('ai.anthropic.max_tokens'),
                'messages' => [
                    ['role' => 'user', 'content' => $prompt],
                ],
            ]);

            return $response->content[0]->text;
        } catch (ErrorException $e) {
            // Anthropic returns 429 (rate limit) and 529 (overloaded) as ErrorException
            Log::channel('ai')->error('Anthropic API error', [
                'status' => $e->getCode(),
                'message' => $e->getMessage(),
            ]);

            if (in_array($e->getCode(), [429, 529], true)) {
                throw new RuntimeException('AI provider rate limit reached. Retry later.', $e->getCode(), $e);
            }

            throw $e;
        }
    }
}
```
Bind it in AppServiceProvider::register():
```php
use App\Services\AI\ClaudeService;

$this->app->singleton(ClaudeService::class, function () {
    $client = \Anthropic::factory()
        ->withApiKey(config('ai.anthropic.api_key'))
        ->make();

    return new ClaudeService($client);
});
```
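With the binding in place, the service resolves through constructor injection anywhere the container builds the class. A hypothetical controller sketch (the class and route names are illustrative, not part of the guide):

```php
<?php

namespace App\Http\Controllers;

use App\Services\AI\ClaudeService;
use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;

class SummaryController extends Controller
{
    public function __construct(
        private readonly ClaudeService $claude
    ) {}

    public function store(Request $request): JsonResponse
    {
        $validated = $request->validate([
            'text' => ['required', 'string', 'max:10000'],
        ]);

        // Synchronous call for brevity only — production code should
        // dispatch a queued job instead (see the queue section below)
        $summary = $this->claude->complete("Summarise: {$validated['text']}");

        return response()->json(['summary' => $summary]);
    }
}
```

Because the constructor type-hints `ClaudeService`, the singleton from `AppServiceProvider` is reused for every request the worker process handles.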
This is the foundation. For the governance layer — contracts, telemetry, and cost visibility across providers — Production-Grade AI Architecture in Laravel picks up exactly where this structure leaves off.
[Architect’s Note] The Laravel Way here is a singleton binding in `AppServiceProvider` — not a Facade, not a helper function. A singleton means the HTTP client is instantiated once per container lifecycle, not once per AI call. Under queue workers, that matters: a worker process handles many jobs, and you don’t want to reconstruct an API client on every job execution.
Queue Workers for Async AI Jobs
Most AI API calls should not run synchronously in a web request. Response times of 3–15 seconds are normal for complex prompts. That’s not a blocking HTTP call you want sitting in your request lifecycle.
Dispatch a job instead:
```php
<?php

namespace App\Jobs;

use App\Services\AI\ClaudeService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use RuntimeException;

class ProcessAIRequest implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries = 3;
    public int $timeout = 60;

    public function __construct(
        private readonly string $prompt,
        private readonly int $userId
    ) {}

    /**
     * Exponential backoff: wait 30s, 60s, 120s between retries.
     * This matters when retrying against a rate-limited AI provider.
     */
    public function backoff(): array
    {
        return [30, 60, 120];
    }

    public function handle(ClaudeService $claude): void
    {
        try {
            $result = $claude->complete($this->prompt);

            // Store result, notify user, trigger next step
        } catch (RuntimeException $e) {
            // Rate limit hit — re-queue with a fixed 30s delay.
            // Note: release() bypasses the backoff() schedule and still
            // counts toward $tries; rethrow instead if you want the
            // backoff() delays to apply.
            $this->release(30);
        }
    }
}
```
Dispatch from your controller:
```php
ProcessAIRequest::dispatch($prompt, auth()->id());
```
[Production Pitfall] Omitting `backoff()` means Laravel retries failed jobs immediately, back to back. If the failure reason is a 429 from Anthropic, three immediate retries will all fail — and you’ve now burned your retry budget and potentially triggered a longer cooldown on the API side. Always define `backoff()` on AI jobs. See the Laravel Queue documentation for the full retry options.
Local Queue Setup with Herd
Run the worker in a terminal tab during development:
```bash
php artisan queue:work
```
Herd doesn’t manage queue workers for you — that’s fine locally. What’s not fine: QUEUE_CONNECTION=sync in your .env. Sync mode hides timing problems, timeout issues, and job serialization bugs that will only surface in production. Run the real driver.
Forge Queue Workers
On Forge, configure a persistent worker per server under Server → Queue Workers:
| Setting | Value |
|---|---|
| Connection | redis (strongly preferred) |
| Queue | ai-jobs (named queue for priority isolation) |
| Timeout | 120 |
| Maximum Tries | 3 |
Forge writes the Supervisor configuration automatically. Workers survive server restarts without intervention.
Use Redis. The database queue driver serialises jobs to a MySQL table and polls it. Under any meaningful AI workload, you will notice the lag and the table growth. Redis is the correct choice here — provision it via Forge → Server → Network.
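The worker settings above assume jobs actually land on that connection and queue. A minimal `.env` sketch — the host and port are local defaults, adjust to match your server:

```env
QUEUE_CONNECTION=redis
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
```

To route jobs onto the named queue the Forge worker watches, dispatch with `ProcessAIRequest::dispatch($prompt, auth()->id())->onQueue('ai-jobs');` — a job dispatched without `onQueue()` goes to the `default` queue and a worker watching only `ai-jobs` will never pick it up.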
A Note on Shared Hosting
Persistent queue workers are not available on shared hosting. Full stop. Your host controls the process supervisor and will not run a long-lived Artisan worker for you. Cron-triggered queue runs, serverless workarounds, and third-party queue bridges all introduce latency and reliability problems that compound fast. If your application depends on async AI jobs — and it should — shared hosting is the wrong environment.
Using Tinker for Prompt Development
Tinker is one of the most underused tools in an AI development workflow. It gives you a live REPL against your full service container. You can iterate on prompts and inspect raw responses without writing a route, a controller, or a test.
```bash
php artisan tinker
>>> $claude = app(\App\Services\AI\ClaudeService::class);
>>> $claude->complete("Summarise the following in one sentence: Laravel is a PHP web framework.");
```
You see the raw response. Adjust the prompt. Re-run. Compare. For early prompt iteration this loop is faster than any browser-based workflow — and it gives you a real sense of latency and token usage before you’ve written a single UI component.
Local Database for AI Metadata
AI applications accumulate data that needs persistence: token counts, prompt versions, response caches, job audit trails. Set up a local database that mirrors your production schema from day one.
For Forge deployments, MySQL or PostgreSQL are both standard. Herd ships with both. Don’t develop against SQLite and deploy to MySQL. The query behaviour differences are subtle and will cost you time at the worst moment.
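As a starting point, the metadata above can live in a single table. A hypothetical migration sketch — the table and column names are illustrative, not prescribed by this guide:

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        Schema::create('ai_requests', function (Blueprint $table) {
            $table->id();
            $table->foreignId('user_id')->constrained();
            $table->string('model');                    // e.g. claude-sonnet-4-6
            $table->string('prompt_hash', 32)->index(); // md5 of the prompt
            $table->unsignedInteger('input_tokens')->default(0);
            $table->unsignedInteger('output_tokens')->default(0);
            $table->unsignedInteger('duration_ms')->nullable();
            $table->string('status')->default('pending'); // pending|completed|failed
            $table->timestamps();
        });
    }

    public function down(): void
    {
        Schema::dropIfExists('ai_requests');
    }
};
```

Indexing the prompt hash makes response caching and duplicate-request detection cheap later; the token columns feed directly into cost reporting.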
```bash
php artisan migrate
```
If you’re building prompt versioning features — and if you’re serious about AI in production, you will be — Prompt Migrations: Bringing Determinism to AI in Laravel has the full implementation pattern.
Logging AI Responses
Add a dedicated log channel in config/logging.php:
```php
'ai' => [
    'driver' => 'daily',
    'path' => storage_path('logs/ai.log'),
    'level' => 'debug',
    'days' => 7,
],
```
Use it in your services:
```php
Log::channel('ai')->info('Claude response', [
    'model' => config('ai.anthropic.model'),
    'prompt_hash' => md5($prompt),
    'tokens_used' => $response->usage->outputTokens,
    'duration_ms' => $durationMs,
]);
```
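The `$durationMs` value above has to be measured by the caller. One way to capture it is to time the API call with PHP's monotonic `hrtime()` clock — a minimal, framework-free sketch in which the closure stands in for the real client call:

```php
<?php

// Sketch of a timing helper — not part of any SDK.
// Wraps a callable and returns [result, elapsed milliseconds].
function timed(callable $fn): array
{
    $start = hrtime(true); // monotonic clock, nanoseconds

    $result = $fn();

    $durationMs = (int) round((hrtime(true) - $start) / 1_000_000);

    return [$result, $durationMs];
}

// Usage with a stand-in for the Anthropic request:
[$text, $durationMs] = timed(function (): string {
    usleep(50_000); // simulate ~50ms of API latency
    return 'response text';
});
```

Inside `ClaudeService::complete()`, wrap the `$this->client->messages()->create()` call the same way and pass `$durationMs` into the log context.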
Separate AI logs keep them from drowning your application log during heavy development sessions. This same channel config works on Forge without modification.
If you want to enforce per-user token budgets on top of this — and in any multi-tenant application, you should — the Laravel AI Middleware: Token Tracking & Rate Limiting guide builds the full rate-limiting layer that slots directly onto this service architecture.
Forge Deployment Checklist
Before the first deploy to a Forge-managed server:
- PHP version parity — Forge → Server → PHP. Match it locally before you push.
- Environment variables — set `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, and all `AI_*` values in Forge’s environment editor, not in a committed `.env` file.
- Queue worker configured — Server → Queue Workers, timeout at 120, named queue `ai-jobs`.
- Redis installed — Forge → Server → Network. Don’t use the database queue driver for AI workloads under any load.
- Deployment script — run `php artisan config:cache` and `php artisan queue:restart` after every deploy. Without `queue:restart`, your workers are running against stale config.
Next Steps
The complete service architecture — streaming responses, token accounting, multi-provider contracts, and production telemetry — is covered in The Complete Guide to Integrating Claude API with Laravel. It picks up directly from the stack you’ve just built.
Once your environment is solid, Building Agentic Laravel Apps with Prism PHP is the cleanest path to multi-provider LLM support and tool calling without managing multiple SDK integrations in parallel.
Questions about this setup? Post in the Developer Q&A or reach out directly.
Senior Laravel Developer and AI Architect with 10+ years in the trenches. Dewald writes about building resilient, cost-aware AI integrations and modernizing the Laravel developer workflow for the 2026 ecosystem.

