Laravel AI Service Layer: Building a Provider-Agnostic Architecture for OpenAI, Gemini, and Claude

Last reviewed: April 2026

Introduction: The Rewrite Nobody Budgets For

Most teams do not plan to rewrite their AI integration. It happens through accumulation.

A controller starts with one call to the OpenAI SDK. A second feature adds another. Six months later, the codebase has three providers spread across a dozen files, inconsistent error handling, and every response format parsed differently wherever it appears. When the business decides to move to Claude for cost reasons, or Gemini because of a longer context window, the rewrite is not a configuration change. It is an archaeology expedition.

This is the problem a Laravel AI service layer solves. Not by wrapping an SDK in a convenience class. By drawing a hard architectural boundary that forces all provider-specific logic into one place and keeps it out of every other place.

This article is a systems design piece, not a tutorial. We will not walk through SDK setup for any individual provider; those guides exist and are linked contextually below. What this article defines is the abstraction layer that makes OpenAI, Gemini, and Claude interchangeable from your application’s perspective. If you want the broader architectural context before diving into the service layer specifically, start with our production-ready Laravel AI integration guide, which covers provider comparison, cost governance, and system-level decision frameworks.

Why Direct AI SDK Integration Fails at Scale

Direct SDK integration is not a shortcut. It is a deferred cost.

Vendor lock-in at the code level. When your application calls $openai->chat()->create([...]) in ten different places, OpenAI is not just a dependency — it is load-bearing infrastructure embedded in your business logic. Replacing it requires finding every call site, understanding what each one expects, and rewriting output handling individually. That is not a refactor. That is a migration, and it compounds with every feature added before the refactor happens.

Business logic leakage. Services and controllers that call the AI SDK directly accumulate prompt logic, response parsing, and retry handling alongside the actual business logic they are supposed to implement. We have seen this cause real production incidents: a prompt template embedded in a service method, updated for one feature, quietly breaking two others that happened to share the same method.

Testing complexity. Without an abstraction layer, testing any feature that calls an AI provider means mocking HTTP clients, faking SDK internals, or hitting live APIs in your test suite. Mocking a vendor SDK’s concrete classes ties your tests to the SDK’s internal structure. When the SDK updates, and it will, your tests break even if your logic did not change.

Cost tracking and observability gaps. OpenAI, Gemini, and Claude return token usage in different shapes. Without normalisation, building consistent cost attribution or usage logging requires per-provider handling in every feature. The cost of this compounds fast. Our complete Laravel OpenAI integration guide and Claude API integration guide both document provider-specific response structures in detail, but the point here is that normalising those differences should not be each feature’s responsibility.

Inconsistent error handling. Rate limit responses, context window errors, and transient 500s look different across providers. Handling them in-place in every service means your retry logic drifts out of sync as each developer who touches the code makes slightly different assumptions.

None of these failures are catastrophic in isolation. Together, they define an application that is expensive to change, difficult to test, and fragile under operational pressure.

The Case for a Service Layer Abstraction

The service layer pattern is not specific to AI. It is the same separation of concerns that makes payment gateway abstractions, notification channels, and repository patterns maintainable in large Laravel applications. Applied to AI, the principle is this: your business logic should not know or care which AI provider is active. It calls a contract, receives a standardised response, and proceeds.

What this architectural decision delivers:

Provider independence. Switching from gpt-4o to claude-sonnet-4-5 to gemini-2.0-flash becomes a single environment variable change, not an engineering sprint.

Genuine testability. You swap in a fake implementation that returns deterministic output. No HTTP, no API keys, no network latency in your test suite.

Centralised governance. Rate limiting, cost tracking, prompt logging, and retry logic live in one layer. Every provider call passes through it. Observability is guaranteed by structure, not by remembering to add it per feature.

Additive extensibility. A new provider is a new adapter class implementing an existing interface; nothing else in the application requires modification.

Laravel’s Service Container is designed for exactly this. The binding mechanism, combined with service providers, gives you a clean resolution path from interface to concrete implementation with zero friction.

The Proposed Architecture

The architecture has four components. They are deliberately minimal. Resist the temptation to add complexity until the problem demands it.

The Provider Interface

This is the contract. Every AI provider must implement it. Note that the interface below aligns with the contract established in our site’s broader AI architecture work, specifically the generate signature. Extensions for streaming and embeddings are implemented as separate, optional interfaces (StreamableProviderInterface, EmbeddingProviderInterface) to avoid forcing every adapter to implement capabilities it does not support.

<?php

namespace App\AI\Contracts;

use App\AI\DTOs\AIResponse;

interface AIProviderInterface
{
    /**
     * Generate a text response from the AI provider.
     *
     * @param  string               $prompt
     * @param  array<string, mixed> $options
     */
    public function generate(string $prompt, array $options = []): AIResponse;
}

Keep this contract stable. Every breaking change to this interface is a breaking change to every adapter, every fake, and every mock across the application and test suite. The interface should evolve slowly and deliberately. Refer to the Laravel Service Container documentation for the binding patterns used throughout this section.
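The optional capability interfaces mentioned above might look like the sketch below. The interface names come from the text; the method signatures are illustrative assumptions, not an established contract:

```php
<?php

namespace App\AI\Contracts;

use App\AI\DTOs\AIResponse;

// Optional capability: adapters that can stream partial output.
// The callback receives each text chunk as the provider emits it;
// the full normalised response is still returned at the end.
interface StreamableProviderInterface
{
    public function stream(string $prompt, callable $onChunk, array $options = []): AIResponse;
}

// Optional capability: adapters that can produce embedding vectors.
interface EmbeddingProviderInterface
{
    /** @return array<int, float> */
    public function embed(string $input, array $options = []): array;
}
```

Adapters opt in by implementing these alongside AIProviderInterface, and callers check `$provider instanceof StreamableProviderInterface` before attempting to stream. This keeps the core contract small while still giving capable adapters a typed extension point.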

The Normalised Response DTO

Before writing any adapter, define the AIResponse DTO. Every provider maps its response into this structure. This is the normalisation layer, and it is what makes the service layer genuinely provider-agnostic.

<?php

namespace App\AI\DTOs;

readonly class AIResponse
{
    public function __construct(
        public string $content,
        public string $provider,
        public int    $inputTokens,
        public int    $outputTokens,
        public array  $raw = [],
    ) {}

    public function totalTokens(): int
    {
        return $this->inputTokens + $this->outputTokens;
    }
}

The readonly modifier enforces immutability. The raw field preserves the original provider response for debugging without polluting the normalised fields. The totalTokens() helper is small but meaningful: it gives cost tracking logic a single, consistent surface to work with regardless of which provider produced the response.

The Provider Adapters

Each adapter implements AIProviderInterface. The implementation handles all provider-specific SDK calls, response mapping, and error handling. None of this leaks outside the adapter class.

<?php

namespace App\AI\Providers;

use App\AI\Contracts\AIProviderInterface;
use App\AI\DTOs\AIResponse;
use App\AI\Exceptions\AIProviderException;
use Illuminate\Support\Facades\Log;

class OpenAIProvider implements AIProviderInterface
{
    public function __construct(private readonly \OpenAI\Client $client) {}

    public function generate(string $prompt, array $options = []): AIResponse
    {
        try {
            $result = $this->client->chat()->create([
                'model'    => $options['model'] ?? config('services.ai.providers.openai.model'),
                'messages' => [['role' => 'user', 'content' => $prompt]],
            ]);

            return new AIResponse(
                content:      $result->choices[0]->message->content,
                provider:     'openai',
                inputTokens:  $result->usage->promptTokens,
                outputTokens: $result->usage->completionTokens,
                raw:          $result->toArray(),
            );
        } catch (\OpenAI\Exceptions\ErrorException $e) {
            Log::error('OpenAI provider error', [
                'message' => $e->getMessage(),
                'prompt'  => substr($prompt, 0, 200),
            ]);

            throw new AIProviderException(
                "OpenAI generation failed: {$e->getMessage()}",
                previous: $e
            );
        }
    }
}

ClaudeProvider and GeminiProvider follow the same structure — same interface, same return type, different SDK internals. The rest of the application never interacts with those internals. Refer to the Anthropic Messages API reference for the exact response shape the Claude adapter maps from.
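For illustration, a ClaudeProvider in the same shape might look like the sketch below. It calls the Anthropic Messages API directly via Laravel's Http client rather than assuming any particular SDK; the response fields it maps (text content blocks, input_tokens and output_tokens under usage) follow Anthropic's documented response shape, but verify them against whatever client you actually install:

```php
<?php

namespace App\AI\Providers;

use App\AI\Contracts\AIProviderInterface;
use App\AI\DTOs\AIResponse;
use App\AI\Exceptions\AIProviderException;
use Illuminate\Support\Facades\Http;

class ClaudeProvider implements AIProviderInterface
{
    public function generate(string $prompt, array $options = []): AIResponse
    {
        $response = Http::withHeaders([
            'x-api-key'         => config('services.ai.providers.claude.api_key'),
            'anthropic-version' => '2023-06-01',
        ])->post('https://api.anthropic.com/v1/messages', [
            'model'      => $options['model'] ?? config('services.ai.providers.claude.model'),
            'max_tokens' => $options['max_tokens'] ?? 1024,
            'messages'   => [['role' => 'user', 'content' => $prompt]],
        ]);

        if ($response->failed()) {
            // Normalise every provider failure into the shared exception type.
            throw new AIProviderException("Claude generation failed: {$response->status()}");
        }

        $data = $response->json();

        // Map Anthropic's response shape into the shared DTO.
        return new AIResponse(
            content:      $data['content'][0]['text'] ?? '',
            provider:     'claude',
            inputTokens:  $data['usage']['input_tokens'] ?? 0,
            outputTokens: $data['usage']['output_tokens'] ?? 0,
            raw:          $data,
        );
    }
}
```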

The Service Resolver

The factory resolves the correct adapter from configuration. In Laravel 11 and 12, this belongs in a dedicated service provider; the config/app.php providers array from earlier versions is gone from the default skeleton.

<?php

namespace App\Providers;

use App\AI\Contracts\AIProviderInterface;
use App\AI\Providers\ClaudeProvider;
use App\AI\Providers\GeminiProvider;
use App\AI\Providers\OpenAIProvider;
use Illuminate\Support\ServiceProvider;

class AIServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        $this->app->bind(AIProviderInterface::class, function ($app) {
            $provider = config('services.ai.default', 'openai');

            return match ($provider) {
                'openai' => $app->make(OpenAIProvider::class),
                'gemini' => $app->make(GeminiProvider::class),
                'claude' => $app->make(ClaudeProvider::class),
                default  => throw new \InvalidArgumentException(
                    "AI provider [{$provider}] is not configured."
                ),
            };
        });
    }
}

Register the provider in bootstrap/providers.php:

// bootstrap/providers.php
return [
    App\Providers\AppServiceProvider::class,
    App\Providers\AIServiceProvider::class,
];

Your business logic now depends exclusively on the interface:

use App\AI\Contracts\AIProviderInterface;

class DocumentSummariser
{
    public function __construct(private readonly AIProviderInterface $ai) {}

    public function summarise(string $text): string
    {
        $response = $this->ai->generate(
            prompt: "Summarise the following document concisely:\n\n{$text}",
        );

        return $response->content;
    }
}

DocumentSummariser does not know and cannot know which provider is active. That is exactly correct. Switch from gpt-4o to claude-sonnet-4-5 to gemini-2.0-flash in .env, and this class continues to work without modification.

Handling Response Normalisation

[Production Pitfall] The most common failure point in otherwise solid service layer implementations is skipping the DTO. Developers implement the interface correctly, write working adapters, and then return raw strings or plain arrays. The normalisation collapses at the seams the first time you need cross-provider token tracking or audit logging.

The AIResponse DTO defined above is the enforcement mechanism. Every adapter is required to map its provider’s response into that structure before returning. No raw arrays escape the adapter boundary.

This matters most for cost observability. Because every provider’s response passes through AIResponse, you have one consistent surface for token usage logging, regardless of which provider produced it. The totalTokens() method gives downstream middleware a single call to instrument. That plugs directly into the architecture covered in our Laravel AI Middleware: Token Tracking and Rate Limiting article, which shows how to attach cost attribution to every AI call without modifying individual adapters.
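One way to centralise that instrumentation without touching any adapter is a decorator that wraps whichever implementation is bound. This is a sketch; the log channel name and field names are illustrative:

```php
<?php

namespace App\AI\Providers;

use App\AI\Contracts\AIProviderInterface;
use App\AI\DTOs\AIResponse;
use Illuminate\Support\Facades\Log;

// Decorator: adds usage logging to any adapter behind the same interface.
class LoggingProviderDecorator implements AIProviderInterface
{
    public function __construct(private readonly AIProviderInterface $inner) {}

    public function generate(string $prompt, array $options = []): AIResponse
    {
        $response = $this->inner->generate($prompt, $options);

        // One logging surface for every provider, because every
        // response has already been normalised into AIResponse.
        Log::info('ai.usage', [
            'provider'      => $response->provider,
            'input_tokens'  => $response->inputTokens,
            'output_tokens' => $response->outputTokens,
            'total_tokens'  => $response->totalTokens(),
        ]);

        return $response;
    }
}
```

Wiring it up is a one-line change in the service provider binding, e.g. `return new LoggingProviderDecorator($app->make(OpenAIProvider::class));`, and the rest of the application is unaware the decorator exists.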

One practical addition worth considering early: a metadata array on AIResponse for provider-specific fields that do not map cleanly to the standard fields. Things like finish_reason, model version as returned by the API, or safety rating metadata from Gemini. Keep these out of the primary typed fields so the core contract stays stable, but do not discard them.

readonly class AIResponse
{
    public function __construct(
        public string $content,
        public string $provider,
        public int    $inputTokens,
        public int    $outputTokens,
        public array  $raw      = [],
        public array  $metadata = [], // provider-specific extras
    ) {}

    public function totalTokens(): int
    {
        return $this->inputTokens + $this->outputTokens;
    }
}

This is a small addition that prevents you from expanding the primary contract prematurely.

Configuration Strategy

Provider selection belongs in config/services.php, driven by an environment variable:

// config/services.php
'ai' => [
    'default' => env('AI_PROVIDER', 'openai'),

    'providers' => [
        'openai' => [
            'model'   => env('OPENAI_MODEL', 'gpt-4o'),
            'api_key' => env('OPENAI_API_KEY'),
        ],
        'gemini' => [
            'model'   => env('GEMINI_MODEL', 'gemini-2.0-flash'),
            'api_key' => env('GEMINI_API_KEY'),
        ],
        'claude' => [
            'model'   => env('CLAUDE_MODEL', 'claude-sonnet-4-5'),
            'api_key' => env('ANTHROPIC_API_KEY'),
        ],
    ],
],

Switching providers in any environment is now AI_PROVIDER=claude in .env. No code changes. No deployment blocked on a refactor.

Be deliberate about runtime provider switching. Resolving a different provider per request based on user input or feature flags is a legitimate pattern, but it introduces risk that is easy to underestimate. The Service Container binding resolves once per request lifecycle by default. If you need per-request resolution (say, routing premium users to claude-sonnet-4-5 and standard users to gemini-2.0-flash), implement a dedicated AIProviderResolver that accepts a runtime argument and wraps the factory. Do not rebind the container mid-request. We have seen that pattern create subtle state contamination in singleton-scoped dependencies that was extremely difficult to reproduce and debug.
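A minimal resolver along those lines might look like this. The class name comes from the text; the tier-routing rule and the `isPremium()` helper are assumed examples, not part of the contract:

```php
<?php

namespace App\AI;

use App\AI\Contracts\AIProviderInterface;
use App\AI\Providers\ClaudeProvider;
use App\AI\Providers\GeminiProvider;
use App\Models\User;
use Illuminate\Contracts\Container\Container;

class AIProviderResolver
{
    public function __construct(private readonly Container $container) {}

    // Resolve a provider for this call only; the container binding
    // for AIProviderInterface is never touched, so no mid-request rebinding.
    public function forUser(User $user): AIProviderInterface
    {
        // isPremium() is a hypothetical tier check on your User model.
        return $user->isPremium()
            ? $this->container->make(ClaudeProvider::class)
            : $this->container->make(GeminiProvider::class);
    }
}
```

Callers that need per-request routing inject the resolver instead of the interface; everything else keeps depending on AIProviderInterface and the default binding.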

For Gemini-specific configuration, particularly around the GeminiManager and multi-model failover patterns, our Laravel Gemini integration guide covers the provider-level setup in full, including how to handle the laravel/ai SDK’s driver resolution.

Testing Strategy for the AI Service Layer

The testing story is where the architecture pays off most visibly. Without the service layer, every test that touches AI logic is fighting the SDK. With it, the test is fast, deterministic, and completely offline.

The mechanism is a fake implementation:

<?php

namespace Tests\Fakes;

use App\AI\Contracts\AIProviderInterface;
use App\AI\DTOs\AIResponse;

class FakeAIProvider implements AIProviderInterface
{
    public function __construct(
        private readonly string $fixedResponse = 'Controlled test response.'
    ) {}

    public function generate(string $prompt, array $options = []): AIResponse
    {
        return new AIResponse(
            content:      $this->fixedResponse,
            provider:     'fake',
            inputTokens:  10,
            outputTokens: 5,
        );
    }
}

Register it once in your test setup:

// In TestCase::setUp() or per test class
protected function setUp(): void
{
    parent::setUp();

    $this->app->bind(
        AIProviderInterface::class,
        fn () => new FakeAIProvider('This is a deterministic test response.')
    );
}

Your feature tests now run without any API calls, at full speed, with responses you control entirely. You are testing your business logic, not the provider’s API availability or response variability.

Error path testing matters just as much as the happy path in AI-heavy systems. Providers go down. Rate limits hit. Context windows get exceeded. Create a FailingAIProvider that throws AIProviderException on demand, bind it in specific tests, and verify your application responds correctly at every failure mode. This is test discipline that direct SDK integration makes genuinely difficult to implement cleanly.
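A FailingAIProvider for those error-path tests can be as small as the sketch below, following the same pattern as FakeAIProvider above:

```php
<?php

namespace Tests\Fakes;

use App\AI\Contracts\AIProviderInterface;
use App\AI\DTOs\AIResponse;
use App\AI\Exceptions\AIProviderException;

// Test double that simulates a provider outage on every call.
class FailingAIProvider implements AIProviderInterface
{
    public function __construct(
        private readonly string $message = 'Simulated provider outage.'
    ) {}

    public function generate(string $prompt, array $options = []): AIResponse
    {
        throw new AIProviderException($this->message);
    }
}
```

Bind it in the specific tests that exercise failure handling, then assert that your application degrades gracefully: queued retries fire, users see a sensible error, and nothing half-written is persisted.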

The same principle extends to more complex retrieval pipelines. When you introduce vector search and retrieval-augmented generation, the fake provider pattern keeps your embedding pipeline testable in isolation. Our Laravel embeddings and RAG implementation guide covers how to structure and test those retrieval chains with this kind of provider abstraction already in place.

Common Mistakes

These are the patterns we see most frequently when auditing Laravel AI codebases. Worth naming directly.

API calls in controllers. A controller orchestrates HTTP — it handles request validation, delegates to services, and shapes the HTTP response. It does not make AI API calls. If your controller is calling $this->ai->generate(), the business logic belongs one layer down. The controller should not know anything about the AI operation’s mechanics.

Hardcoding provider logic. if ($provider === 'openai') blocks scattered through application code are a maintenance disaster. Each conditional is a place the logic can drift as providers evolve at different rates. This is exactly the problem the interface and factory resolve at the architectural level.

No abstraction layer at all. Some codebases have a single AIService class that wraps an SDK call without an interface or DTO. This is better than nothing, but it makes the binding non-swappable, the response non-normalised, and the test setup significantly harder than it needs to be.

Prompt logic embedded in services. Prompts are configuration, not code. Embedding them in the service or domain class that calls the AI makes them invisible to non-engineers, impossible to version cleanly, and difficult to audit. They belong in dedicated prompt classes or managed prompt templates. The Prompt Migrations article covers a rigorous approach to this.
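A dedicated prompt class in that spirit might look like the sketch below; the class name, version string, and template are illustrative:

```php
<?php

namespace App\AI\Prompts;

class SummariseDocumentPrompt
{
    // Versioned explicitly so prompt changes are reviewable and
    // auditable, much like a schema migration.
    public const VERSION = 'v1';

    public function render(string $document): string
    {
        return "Summarise the following document concisely:\n\n{$document}";
    }
}
```

The service then calls `$this->ai->generate((new SummariseDocumentPrompt())->render($text))`, and the prompt text itself lives in exactly one reviewable place.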

Ignoring the streaming question during initial design. If your application will ever need streaming responses — and most AI-facing products eventually do — the interface design matters from day one. A synchronous generate contract retrofitted with streaming six months later produces awkward, inconsistent abstractions. Our Laravel AI streaming transport guide covers how Livewire, SSE, and WebSockets each interact with a service layer, and how to design the interface so streaming is additive rather than disruptive.

Conclusion

AI in Laravel is not an API problem. It is an architecture problem.

Teams that treat it as an API problem ship fast and iterate slowly. Every provider change, model upgrade, cost optimisation, or feature addition requires touching code that should not know providers exist. The cognitive overhead accumulates until the codebase becomes resistant to change, exactly the wrong posture for a space that evolves as quickly as AI infrastructure.

The service layer described here is not complex. The interface is a handful of lines. The DTO is five typed properties. The factory is a match statement. The test fake is twenty lines. The complexity it eliminates — rewrite cycles, provider-specific bugs leaking into business logic, untestable AI features — is orders of magnitude larger.

Start with the interface. Implement one adapter. Write the fake for tests. Gemini and Claude are additive later, new adapter classes that implement the existing contract, with nothing in the application requiring modification. That is what correct abstractions feel like: composable, not disruptive.

Build the boundary early. Retrofitting it later always costs more than you estimate, and that estimate is always already too high.
