I have a Laravel 13 application where I recently implemented the new laravel/mcp package to expose internal tools and database schemas to our enterprise AI clients via HTTP transport.

In development, running php artisan mcp:serve --transport=stdio inside Cursor and Claude Desktop, everything works flawlessly. However, after moving to our staging environment, which uses the HTTP transport routed via routes/ai.php and runs on a traditional Nginx + PHP-FPM pool, we see severe performance degradation and frequent 504 Gateway Time-out errors when multiple agents invoke complex tools concurrently.

Our setup in routes/ai.php:

use App\Mcp\Servers\EnterpriseContextServer;
use Laravel\Mcp\Facades\Mcp;

Mcp::web('/mcp/v1', EnterpriseContextServer::class)
    ->middleware(['auth:api', 'throttle:60,1']);

One of our primary tools handles structured vector lookups and runs complex Eloquent operations. When an LLM client sends requests to the /mcp/v1 endpoint one at a time, it responds fine. But when 3 or 4 users interact with the client simultaneously, the PHP-FPM workers max out, response times spike past 30 seconds, and Nginx cuts the connection with a 504.
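To put rough numbers on the worker saturation, here is a back-of-the-envelope sketch using Little's law (average busy workers = arrival rate × average service time). The request rate, tool duration, and pool size are hypothetical placeholders rather than measurements from our staging pool, and the function names are only for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Little's law: average number of busy workers
 * = arrival rate (requests/second) * average service time (seconds).
 */
function busyWorkers(float $requestsPerSecond, float $avgSecondsPerRequest): float
{
    return $requestsPerSecond * $avgSecondsPerRequest;
}

/**
 * True when estimated demand meets or exceeds the pool size,
 * meaning new requests start queueing behind busy workers.
 */
function poolIsSaturated(float $requestsPerSecond, float $avgSecondsPerRequest, int $maxChildren): bool
{
    return busyWorkers($requestsPerSecond, $avgSecondsPerRequest) >= $maxChildren;
}

// Hypothetical values (assumptions, not measured):
$requestsPerSecond = 1.0; // e.g. 4 clients, each issuing a tool call every ~4 s
$avgSeconds        = 8.0; // assumed average duration of one heavy tool call
$maxChildren       = 5;   // assumed pm.max_children for the FPM pool

printf(
    "Estimated busy workers: %.1f of %d\n",
    busyWorkers($requestsPerSecond, $avgSeconds),
    $maxChildren
);
printf(
    "Pool saturated: %s\n",
    poolIsSaturated($requestsPerSecond, $avgSeconds, $maxChildren) ? 'yes' : 'no'
);
```

With numbers in this range, a handful of concurrent users is enough to keep every worker busy, so further requests wait in the FPM backlog until Nginx times out. That matches what we observe, although I have not confirmed the real figures.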

I noticed the documentation mentions a “Dedicated HTTP Server using a high-performance ReactPHP/Octane loop” as an alternative transport layer, but I can’t find a clear example of how to set it up for production HTTP traffic under high concurrency.

1. How should I safely decouple long-running tool execution or LLM streaming context from the synchronous HTTP request lifecycle within laravel/mcp?

2. Is it better to migrate the server to a standalone stateful transport layer (like ReactPHP or Laravel Octane), and if so, how do you handle persistent JSON-RPC session state and garbage collection across multiple workers?

Kimberly Powell Asked question