I’m trying to integrate the OpenAI PHP SDK into my Laravel 11 project to generate long-form SEO blog posts.
The integration works fine for short snippets, but when I send a prompt for a 2,000-word article, the request just hangs and eventually throws a 408 Request Timeout from my Nginx proxy. I know I shouldn’t run this directly in the controller, so I moved it to a Laravel Queue Job, but now I’m hitting a new wall.
My Problem:
Even inside the queue, the job is being marked as “failed” after 60 seconds because of the default retry_after setting. OpenAI sometimes takes 90+ seconds to stream a long response, and the worker thinks the job died and tries to restart it, causing a loop of half-finished API calls.
What I’ve tried:
Increased max_execution_time in php.ini (didn’t help the worker).
Tried using openai-php/client directly instead of the Laravel wrapper.
Set $timeout = 120 in my GenerateArticle job class, but the worker still kills it.
```php
public function handle(): void
{
    // This part takes forever...
    $response = OpenAI::chat()->create([
        'model' => 'gpt-4-turbo',
        'messages' => [['role' => 'user', 'content' => $this->prompt]],
    ]);

    $this->post->update(['content' => $response->choices[0]->message->content]);
}
```
How do I properly handle these long-running AI tasks without the worker timing out or me hitting a 504 on the frontend? Should I be using Laravel AI SDK streaming or Laravel Reverb to push the content back to the UI piece-by-piece?
You’re hitting a classic mismatch between how Laravel handles queues and how LLMs behave in the real world. By 2026 standards, GPT-4-Turbo is actually considered “slow” for 2,000-word outputs, so your architectural approach is more important than just bumping a timeout number.
Here is how you fix the loop and handle the UI.
1. The “Worker vs. Connection” Timeout Fix
Your $timeout = 120 likely isn't working because the retry_after setting on your queue connection in config/queue.php is lower than your job's $timeout.
If retry_after is 60 seconds and your job takes 90, the queue manager thinks the worker died and releases the job back to another worker while the first one is still running.
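Concretely, in config/queue.php the connection entry looks like this (database driver shown as an example; the redis driver uses the same key):

```php
// config/queue.php — excerpt
'connections' => [
    'database' => [
        'driver' => 'database',
        'table' => 'jobs',
        'queue' => 'default',
        // Must be HIGHER than any job's $timeout, or the queue
        // manager will re-dispatch jobs that are still running.
        'retry_after' => 360,
    ],
],
```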
In your Job class, set these explicitly:
```php
public $timeout = 300; // 5 minutes for safety
public $failOnTimeout = true;

public function handle(): void
{
    // Important: tell the HTTP client itself not to time out.
    // Note: the OpenAI API does NOT accept a 'timeout' request
    // parameter in create() — the timeout has to be configured on
    // the underlying Guzzle client instead.
    $client = \OpenAI::factory()
        ->withApiKey(config('openai.api_key'))
        ->withHttpClient(new \GuzzleHttp\Client(['timeout' => 300]))
        ->make();

    $response = $client->chat()->create([
        'model' => 'gpt-4-turbo',
        'messages' => [['role' => 'user', 'content' => $this->prompt]],
    ]);

    $this->post->update(['content' => $response->choices[0]->message->content]);
}
```
2. Update your Worker Command
When you run your worker, you need to make sure the global timeout flag allows for these long runs:
php artisan queue:work --timeout=305
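If your workers run under Supervisor, also make sure Supervisor itself doesn't SIGKILL the process mid-job. A sketch (program name and paths are placeholders for your setup):

```ini
; /etc/supervisor/conf.d/laravel-worker.conf — adjust paths to your project
[program:laravel-worker]
command=php /var/www/artisan queue:work --timeout=305
autostart=true
autorestart=true
; Give a long-running job time to finish before Supervisor kills it
stopwaitsecs=360
```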
3. The “2026 Way”: Streaming + Reverb
Waiting 90 seconds for a page to refresh is a terrible UX. Since you’re on Laravel 11/12+, you should absolutely be using Laravel Reverb (WebSockets) to stream the text to the UI as it’s generated.
Instead of a single ->create() call, use ->createStreamed():
– Job: Dispatches a PartialContentGenerated event every time a chunk of text comes in.
– Event: Broadcasts via Reverb to a private channel for that specific Post ID.
– Frontend: A simple Livewire or Vue component listens for the broadcast and appends the text to the screen in real-time.
Example Job Logic:
```php
$stream = OpenAI::chat()->createStreamed([
    'model' => 'gpt-4-turbo',
    'messages' => [['role' => 'user', 'content' => $this->prompt]],
]);

foreach ($stream as $response) {
    $text = $response->choices[0]->delta->content;

    if ($text) {
        // Broadcast this chunk via Reverb
        PartialContentGenerated::dispatch($this->post, $text);
    }
}
```
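The broadcast event itself is small. A minimal sketch, assuming a Post model and a per-post private channel (names are illustrative, not from your codebase):

```php
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcastNow;
use Illuminate\Foundation\Events\Dispatchable;

class PartialContentGenerated implements ShouldBroadcastNow
{
    use Dispatchable, InteractsWithSockets;

    public function __construct(
        public \App\Models\Post $post,
        public string $chunk,
    ) {}

    public function broadcastOn(): PrivateChannel
    {
        // One channel per post, so only that post's page receives chunks
        return new PrivateChannel('posts.'.$this->post->id);
    }
}
```

Implementing ShouldBroadcastNow (rather than ShouldBroadcast) matters here: it broadcasts synchronously instead of queueing a separate broadcast job for every chunk, which would flood your queue.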
Summary Checklist:
config/queue.php: Set retry_after to at least 360 (must be higher than your job timeout).
Job Class: Set public $timeout = 300.
OpenAI Client: Configure the timeout on the underlying HTTP (Guzzle) client — the API itself does not accept a timeout parameter in create().
UI: Use Reverb. Even if the job takes 2 minutes, the user sees progress immediately, which prevents them from clicking “refresh” and triggering even more API calls.
Have you checked your Nginx proxy_read_timeout? If you eventually move back to a synchronous approach for any reason, that’s usually the culprit for the 408/504 errors.
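For the synchronous case, the relevant Nginx directives look like this (values illustrative, socket path depends on your PHP version):

```nginx
# Inside the location block that proxies to PHP-FPM
location ~ \.php$ {
    fastcgi_pass unix:/var/run/php/php8.3-fpm.sock;
    fastcgi_read_timeout 300s;   # default is 60s
    # For a reverse proxy to another HTTP backend, the equivalent is:
    # proxy_read_timeout 300s;
}
```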
How are you handling the database connection during that long 90-second wait? If you have a low max_connections, long-running jobs can sometimes hog them.
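One common mitigation, sketched here inside handle(), is to release the connection before the slow API call and reconnect before writing (DB::disconnect() and DB::reconnect() are real Laravel facade methods; $client is whatever OpenAI client you built above):

```php
use Illuminate\Support\Facades\DB;

// Release the PDO connection before the slow OpenAI call...
DB::disconnect();

$response = $client->chat()->create([
    'model' => 'gpt-4-turbo',
    'messages' => [['role' => 'user', 'content' => $this->prompt]],
]); // 90+ seconds with no query activity

// ...then reconnect before touching the database again
// (Laravel will also reconnect lazily on the next query)
DB::reconnect();
$this->post->update(['content' => $response->choices[0]->message->content]);
```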
