I’m trying to integrate the OpenAI PHP SDK into my Laravel 11 project to generate long-form SEO blog posts.
The integration works fine for short snippets, but when I send a prompt for a 2,000-word article, the request just hangs and eventually throws a 408 Request Timeout from my Nginx proxy. I know I shouldn’t run this directly in the controller, so I moved it to a Laravel Queue Job, but now I’m hitting a new wall.
My Problem:
Even inside the queue, the job is being marked as “failed” after 60 seconds, which seems to come from the default retry_after / worker timeout settings. OpenAI sometimes takes 90+ seconds to stream a long response, and the worker thinks the job died and tries to restart it, causing a loop of half-finished API calls.
What I’ve tried:
Increased max_execution_time in php.ini (didn’t help the worker).
Tried using openai-php/client directly instead of the Laravel wrapper.
Set $timeout = 120 in my GenerateArticle job class, but the worker still kills it.
public function handle(): void
{
    // This part takes forever...
    $response = OpenAI::chat()->create([
        'model' => 'gpt-4-turbo',
        'messages' => [['role' => 'user', 'content' => $this->prompt]],
    ]);

    $this->post->update(['content' => $response->choices[0]->message->content]);
}
How do I properly handle these long-running AI tasks without the worker timing out or me hitting a 504 on the frontend? Should I be using the Laravel AI SDK’s streaming support, or Laravel Reverb to push the content back to the UI piece by piece?
The first answer is correct regarding the retry_after vs. $timeout mismatch, but there is a deeper architectural “gotcha” here that will cause your workers to crash even if you bump those numbers to 10 minutes.
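For completeness, the mismatch fix itself: retry_after in config/queue.php must be comfortably larger than the job’s $timeout, or the queue hands the job to a second worker while the first attempt is still streaming. A minimal sketch (the numbers are illustrative, not prescriptive):

// config/queue.php — the connection's retry_after must exceed any job's $timeout
'redis' => [
    'driver' => 'redis',
    // ...
    'retry_after' => 600,
],

// app/Jobs/GenerateArticle.php — hard-kill the job well before retry_after fires
public int $timeout = 300;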
The “Zombie Connection” Problem
When you run a synchronous OpenAI request that takes 90+ seconds, your PHP worker is sitting idle waiting for an I/O response. In many default configurations:
1. The Database Connection times out: If your wait_timeout in MySQL is low, the worker loses its connection while waiting on OpenAI. When the response finally arrives and the job calls $this->post->update(), you’ll get General error: 2006 MySQL server has gone away (see the reconnect sketch after this list).
2. The “Zombie” Job: If the worker is killed by an external process (like an OOM killer or a deployment script), and your retry_after is set too high, that job stays in “reserved” limbo and won’t be retried for a very long time.
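A cheap guard against gotcha #1 is to force a reconnect once the long wait is over, before the first write. A minimal sketch, placed inside handle() right after the OpenAI call returns:

use Illuminate\Support\Facades\DB;

// The connection may have idled past MySQL's wait_timeout while we
// waited on OpenAI, so drop and re-establish it before writing.
DB::reconnect();

$this->post->update(['content' => $response->choices[0]->message->content]);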
The “Best Practice” Architecture for 2026
Stop trying to make a single Job wait for the whole article. Instead, use a Job Chain or Batch approach (a dispatch sketch follows this list):
1. Dispatcher Job: This job calls OpenAI, but you should use Streaming (as mentioned in the first answer).
2. The “Chunk” Pattern: Instead of one massive prompt, break the SEO article into sections (Intro, Body, Conclusion).
– Dispatch a Batch of jobs.
– Each job handles one section.
– This keeps each job under 15–20 seconds, well within the “safe” zone for default Laravel workers.
3. The “Status” Table: Instead of hoping the job finishes, create a generation_tasks table. The UI should poll this (or use Reverb) to show a progress bar (e.g., “Intro generated…”, “Images being researched…”).
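Here is a rough sketch of that dispatcher. GenerateSection and the GenerationTask model are hypothetical names standing in for your own section jobs and the generation_tasks status table:

use Illuminate\Bus\Batch;
use Illuminate\Support\Facades\Bus;
use Throwable;

// The row the UI polls (or that Reverb broadcasts from)
$task = GenerationTask::create(['post_id' => $post->id, 'status' => 'queued']);

// Each GenerateSection job uses the Batchable trait and writes its
// finished section back to the task row as it completes.
Bus::batch([
    new GenerateSection($task, 'intro'),
    new GenerateSection($task, 'body'),
    new GenerateSection($task, 'conclusion'),
])
    ->then(fn (Batch $batch) => $task->update(['status' => 'done']))
    ->catch(fn (Batch $batch, Throwable $e) => $task->update(['status' => 'failed']))
    ->dispatch();

Note that batches need the job_batches table (created via the make:queue-batches-table artisan command in Laravel 11).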
A Note on the Laravel AI SDK
If you are already on Laravel 11/12, use a dedicated SDK such as openai-php/laravel (or the laravel/ai package, if you have adopted it) rather than raw Guzzle calls; the SDK handles the streaming interface much more cleanly. With openai-php/laravel, a streaming job looks like this:
// In your Job (using the openai-php/laravel facade)
public function handle(): void
{
    $buffer = '';

    // createStreamed() yields partial responses as OpenAI sends them
    $stream = OpenAI::chat()->createStreamed([
        'model' => 'gpt-4-turbo',
        'messages' => [['role' => 'user', 'content' => $this->prompt]],
    ]);

    foreach ($stream as $response) {
        $delta = $response->choices[0]->delta->content ?? '';
        $buffer .= $delta;

        // Cache the running article and broadcast the new chunk via Reverb
        Cache::put("post_{$this->post->id}_content", $buffer, 600);
        PostContentUpdated::dispatch($this->post, $delta);
    }

    $this->post->update(['content' => $buffer]);
}
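PostContentUpdated in that loop is a hypothetical event name; for Reverb to push it to the browser, the event has to implement ShouldBroadcast, roughly like so:

use App\Models\Post;
use Illuminate\Broadcasting\Channel;
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Foundation\Events\Dispatchable;
use Illuminate\Queue\SerializesModels;

class PostContentUpdated implements ShouldBroadcast
{
    use Dispatchable, InteractsWithSockets, SerializesModels;

    public function __construct(public Post $post, public string $chunk) {}

    public function broadcastOn(): Channel
    {
        // The frontend subscribes to this channel and appends each chunk
        return new Channel("posts.{$this->post->id}");
    }
}

On the client, Echo.channel(`posts.${id}`).listen('PostContentUpdated', e => append(e.chunk)) turns that stream into the piece-by-piece UI the question asks for.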
TL;DR: Bumping timeouts is a “band-aid.” For 2,000-word articles, stream the response and decouple your UI from the Job completion via Reverb.
