When Echo + Alpine Beats Livewire for AI Chat
The real-time AI UX learning path on this site covers every transport available for streaming AI output in Laravel. The transport selection guide establishes the decision criteria clearly: Livewire is the correct default when your interface is primarily server-rendered and Alpine handles minor interactivity. This article covers where that default breaks down.
Livewire owns the component state. Every user action triggers a server round-trip; the component re-renders before the browser sees anything. That model works well for forms, data tables, and dashboards. It becomes a liability when you need a Laravel Echo and Alpine.js AI chat interface where:
- Conversation history lives in the browser across multiple turns without round-tripping on every submission
- Multiple async streams may be active simultaneously (branching conversations, regeneration)
- The server should push AI output without the client holding an open HTTP response
- Additional bidirectional events (read receipts, presence, per-user channel metadata) are part of the interface design
In that scenario, conversation state belongs in Alpine.js, transport belongs to Laravel Echo over a private Reverb channel, and the HTTP controller’s job is to dispatch work and return. This article builds that full interface, from broadcast events through queue tuning.
The Livewire approach to real-time AI chat is covered separately in full detail. Read both. Choose by interface requirements, not by familiarity.
| Dimension | Livewire | Alpine.js + Echo |
|---|---|---|
| State ownership | Server-side component | Browser (Alpine reactive data) |
| HTTP connection during stream | Held open for stream duration | Released immediately after dispatch |
| PHP-FPM pressure | One worker tied up per active stream | Zero web-tier workers held during streaming |
| Conversation history | Managed in component $history property | Managed in Alpine messages array |
| Multi-turn without round-trip | No, each turn re-hydrates the component | Yes, history stays in browser |
| Concurrent streams | One per component instance | Multiple, tracked by serverId |
| Bidirectional events | Not supported natively | Native via Echo (presence, read receipts, branching) |
| Stream cancellation | Requires custom SSE sentinel or polling | Cache flag checked per chunk in queue worker |
| Infrastructure requirement | None beyond Laravel | Laravel Reverb + dedicated queue worker |
| Implementation complexity | Low | Medium |
| Best fit | Server-rendered interfaces, single-turn generation, forms | Client-driven chat, concurrent streams, multi-user presence |
Architecture Overview
The request lifecycle decouples the HTTP submission from the streaming response completely. The controller validates the request, generates a message ID, dispatches a queued job, and returns that ID. The job streams tokens from the LLM via Prism PHP and broadcasts each chunk to a private user-scoped Reverb channel. Alpine.js listens on that channel and appends tokens to conversation state as they arrive. A terminal event signals completion or failure. The cancel path writes a Cache flag that the job checks on every chunk iteration.
The token-by-token delivery mechanics of Reverb (connection keep-alive, chunk framing, and authentication handshake) are covered in the Reverb WebSocket token delivery guide. This article focuses on the full interface layer built on top of that transport.
This architecture carries one concrete production advantage over SSE-backed approaches: no HTTP connection is held open. The PHP-FPM worker that accepted the request is free the moment the job is dispatched. The blocking LLM call lives entirely in the queue worker. Teams running SSE at volume hit PHP-FPM worker exhaustion before they hit API rate limits. The WebSocket model pushes that pressure to queue workers, which scale horizontally without touching your web tier.
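The decoupling described above can be illustrated with a small in-memory simulation. This is only a sketch: FakeChannel, send, runWorker, and the msg-N ID format are hypothetical stand-ins invented for illustration, not Laravel, Reverb, or Echo APIs. It shows the request returning an ID before any token exists, and the tokens arriving later through a separate channel.

```javascript
// Hypothetical in-memory stand-ins: none of these names come from Laravel, Reverb, or Echo.
class FakeChannel {
  constructor() {
    this.listeners = {};
  }
  listen(event, handler) {
    if (!this.listeners[event]) this.listeners[event] = [];
    this.listeners[event].push(handler);
    return this; // chainable, so listeners can be registered in one expression
  }
  broadcast(event, payload) {
    (this.listeners[event] || []).forEach((handler) => handler(payload));
  }
}

const queue = [];                  // stands in for the ai-streaming queue
const channel = new FakeChannel(); // stands in for the private ai-chat.{userId} channel
let nextId = 1;

// "Controller": enqueue the work and return the ID. No tokens are produced here.
function send(message) {
  const messageId = `msg-${nextId++}`;
  queue.push({ messageId, message });
  return { message_id: messageId };
}

// "Queue worker": the slow generation happens here, after the HTTP response is gone.
function runWorker(generate) {
  while (queue.length > 0) {
    const job = queue.shift();
    const tokens = generate(job.message);
    tokens.forEach((token, tokenIndex) => {
      channel.broadcast('token.received', { messageId: job.messageId, token, tokenIndex });
    });
    channel.broadcast('stream.completed', { messageId: job.messageId, totalTokens: tokens.length });
  }
}

// "Alpine component": browser-held state, appended to as events arrive.
const content = {};
const completed = new Set();
channel
  .listen('token.received', (e) => {
    content[e.messageId] = (content[e.messageId] || '') + e.token;
  })
  .listen('stream.completed', (e) => completed.add(e.messageId));

const { message_id } = send('Hello');
const contentAtResponseTime = content[message_id]; // nothing has streamed yet
runWorker(() => ['Hi', ' there']);
```

The point of the sketch is the ordering: send() has already returned by the time runWorker() produces anything, which is the property that keeps web-tier workers free in the real architecture.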
Backend Setup: Reverb Channel, Broadcasting Events
Define the private channel in routes/channels.php. The authorisation closure enforces user-scoped access: no subscriber can join another user’s stream channel.
<?php
// routes/channels.php
use Illuminate\Support\Facades\Broadcast;
Broadcast::channel('ai-chat.{userId}', function ($user, $userId) {
return (int) $user->id === (int) $userId;
});
Three broadcast events cover the full stream lifecycle. Keep them lean; each event carries only what the client needs to act on it.
<?php
// app/Events/AiTokenReceived.php
namespace App\Events;
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Foundation\Events\Dispatchable;
use Illuminate\Queue\SerializesModels;
class AiTokenReceived implements ShouldBroadcast
{
use Dispatchable, InteractsWithSockets, SerializesModels;
public function __construct(
public readonly int $userId,
public readonly string $messageId,
public readonly string $token,
public readonly int $tokenIndex,
) {}
public function broadcastOn(): array
{
return [new PrivateChannel("ai-chat.{$this->userId}")];
}
public function broadcastAs(): string
{
return 'token.received';
}
}
<?php
// app/Events/AiStreamCompleted.php
namespace App\Events;
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Foundation\Events\Dispatchable;
use Illuminate\Queue\SerializesModels;
class AiStreamCompleted implements ShouldBroadcast
{
use Dispatchable, InteractsWithSockets, SerializesModels;
public function __construct(
public readonly int $userId,
public readonly string $messageId,
public readonly int $totalTokens,
) {}
public function broadcastOn(): array
{
return [new PrivateChannel("ai-chat.{$this->userId}")];
}
public function broadcastAs(): string
{
return 'stream.completed';
}
}
<?php
// app/Events/AiStreamFailed.php
namespace App\Events;
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Foundation\Events\Dispatchable;
use Illuminate\Queue\SerializesModels;
class AiStreamFailed implements ShouldBroadcast
{
use Dispatchable, InteractsWithSockets, SerializesModels;
public function __construct(
public readonly int $userId,
public readonly string $messageId,
public readonly string $error,
) {}
public function broadcastOn(): array
{
return [new PrivateChannel("ai-chat.{$this->userId}")];
}
public function broadcastAs(): string
{
return 'stream.failed';
}
}
[Architect’s Note] Broadcasting events dispatch on the default queue connection unless you configure otherwise. If your ai-streaming queue uses a dedicated Redis connection, route broadcast events to the same connection by declaring the connection and queue properties on each event class. Token delivery has timing sensitivity: a shared default queue under load introduces visible latency between chunks, and users perceive that as jank rather than a queue configuration issue.
The Streaming Job
The job owns the blocking LLM call and all broadcast dispatch. Two Prism PHP details matter here. First, the streaming terminator is asStream(), not stream(). Second, withMessages() requires typed UserMessage and AssistantMessage objects, not plain arrays. Passing raw arrays will throw a type error at runtime.
<?php
// app/Jobs/StreamAiResponseJob.php
namespace App\Jobs;
use App\Events\AiStreamCompleted;
use App\Events\AiStreamFailed;
use App\Events\AiTokenReceived;
use App\Models\AgentConversation;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Log;
use Prism\Prism\Enums\Provider;
use Prism\Prism\Facades\Prism;
use Prism\Prism\ValueObjects\Messages\AssistantMessage;
use Prism\Prism\ValueObjects\Messages\UserMessage;
class StreamAiResponseJob implements ShouldQueue
{
use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;
public int $tries = 2;
public int $timeout = 120;
public function __construct(
public readonly int $userId,
public readonly string $messageId,
public readonly string $userMessage,
public readonly array $conversationHistory,
) {}
public function handle(): void
{
$messages = collect($this->conversationHistory)
->map(fn (array $msg) => match ($msg['role']) {
'user' => new UserMessage($msg['content']),
'assistant' => new AssistantMessage($msg['content']),
default => throw new \InvalidArgumentException(
"Unexpected role: {$msg['role']}"
),
})
->all();
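// Append the current turn as the final message rather than combining
// withMessages() with withPrompt(): Prism versions that reject a prompt
// alongside messages would otherwise throw on every turn after the first.
$messages[] = new UserMessage($this->userMessage);
try {
$stream = Prism::text()
->using(Provider::Anthropic, 'claude-sonnet-4-6')
->withSystemPrompt('You are a helpful assistant.')
->withMessages($messages)
->withMaxTokens(4096)
->asStream();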
$tokenIndex = 0;
$fullContent = '';
foreach ($stream as $chunk) {
if (Cache::get("stream_cancel:{$this->messageId}")) {
Cache::forget("stream_cancel:{$this->messageId}");
return;
}
$token = $chunk->text ?? '';
if ($token === '') {
continue;
}
$fullContent .= $token;
AiTokenReceived::dispatch(
$this->userId,
$this->messageId,
$token,
$tokenIndex++,
);
}
AiStreamCompleted::dispatch(
$this->userId,
$this->messageId,
$tokenIndex,
);
AgentConversation::create([
'user_id' => $this->userId,
'session_id' => $this->messageId,
'role' => 'assistant',
'content' => $fullContent,
]);
} catch (\Throwable $e) {
AiStreamFailed::dispatch(
$this->userId,
$this->messageId,
'Response unavailable. Please try again.',
);
Log::error('AI stream job failed', [
'user_id' => $this->userId,
'message_id' => $this->messageId,
'error' => $e->getMessage(),
]);
}
}
public function failed(\Throwable $e): void
{
AiStreamFailed::dispatch(
$this->userId,
$this->messageId,
'Response unavailable. Please try again.',
);
}
}
[Production Pitfall] Setting $tries above 2 creates a duplicate broadcast problem that is painful to isolate. A job that streams 200 tokens before failing and then retries will re-broadcast those 200 tokens, producing garbled output in the client. The retry does not know where the previous attempt left off. Two tries is the ceiling for streaming jobs. The tokenIndex gap detection in the Alpine component (shown below) lets you identify Reverb delivery gaps independently of job retries, so you are not conflating two different failure modes.
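One way to soften the retry problem on the client is to treat a token with index 0 that arrives after other tokens as the start of a fresh attempt. The sketch below is an assumption-laden illustration, not part of the component in this article: applyToken, lastTokenIndex, and restarts are hypothetical names, and the reset-on-zero rule is a design choice rather than anything Echo or Reverb provides.

```javascript
// Hypothetical helper: applies a token event to a message object while
// tolerating a job retry that re-broadcasts from index 0.
function applyToken(msg, event) {
  const last = msg.lastTokenIndex ?? -1;

  if (event.tokenIndex === 0 && last >= 0) {
    // A second index 0 means a new attempt began: discard the partial text.
    msg.content = '';
    msg.restarts = (msg.restarts ?? 0) + 1;
  } else if (event.tokenIndex <= last) {
    // Duplicate or stale delivery of a token that was already applied.
    return false;
  }

  msg.content += event.token;
  msg.lastTokenIndex = event.tokenIndex;
  return true;
}

// Usage: a first attempt streams three tokens, then a retry starts over.
const msg = { content: '' };
applyToken(msg, { tokenIndex: 0, token: 'A' });
applyToken(msg, { tokenIndex: 1, token: 'B' });
applyToken(msg, { tokenIndex: 2, token: 'C' });
const beforeRetry = msg.content;
applyToken(msg, { tokenIndex: 0, token: 'X' });
applyToken(msg, { tokenIndex: 1, token: 'Y' });
const duplicateApplied = applyToken(msg, { tokenIndex: 1, token: 'Y' });
```

The rule has a known limit: a late duplicate of index 0 from the same attempt would also reset the text. Tagging each event with an attempt number on the server would remove that ambiguity, at the cost of a wider event payload.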
The Controller: Submit and Cancel
The controller has one job: validate, dispatch, and return the message ID. All streaming logic lives in the job.
<?php
// app/Http/Controllers/AiChatController.php
namespace App\Http\Controllers;
use App\Jobs\StreamAiResponseJob;
use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Str;
class AiChatController extends Controller
{
public function send(Request $request): JsonResponse
{
$validated = $request->validate([
'message' => ['required', 'string', 'max:2000'],
'history' => ['array', 'max:40'],
'history.*.role' => ['required', 'in:user,assistant'],
'history.*.content' => ['required', 'string', 'max:4000'],
]);
$messageId = (string) Str::uuid();
StreamAiResponseJob::dispatch(
userId: $request->user()->id,
messageId: $messageId,
userMessage: $validated['message'],
conversationHistory: $validated['history'] ?? [],
)->onQueue('ai-streaming');
return response()->json(['message_id' => $messageId]);
}
public function cancel(Request $request, string $messageId): JsonResponse
{
Cache::put("stream_cancel:{$messageId}", true, now()->addMinutes(5));
return response()->json(['cancelled' => true]);
}
}
<?php
// routes/api.php
use App\Http\Controllers\AiChatController;
use Illuminate\Support\Facades\Route;
Route::middleware('auth:sanctum')->group(function () {
Route::post('/chat/send', [AiChatController::class, 'send']);
Route::post('/chat/{messageId}/cancel', [AiChatController::class, 'cancel']);
});
[Architect’s Note] Conversation history arrives from the Alpine component to avoid a server round-trip on every submission. That is a deliberate tradeoff, not the only option. The validation rules constrain role and length but cannot verify the authenticity of the history payload. For applications where conversation integrity matters (moderated content, legal records, billing audit trails), reload history server-side from your Eloquent models inside the job’s handle() method rather than trusting the client-supplied array.
The Alpine.js Component: Full Stream State Management
The Alpine component manages conversation state, the Echo subscription, and all stream lifecycle transitions. Three details are easy to get wrong.
The most consequential: history must be snapshotted before the new user message is pushed to this.messages. Building history after the push includes the current user turn in the history array and then passes the same turn again as message, duplicating it in the LLM context. The model would see the user’s message twice on every request.
The second: each message uses a stable clientId (generated via crypto.randomUUID()) as the x-for key, alongside a separate serverId assigned when the server responds. If you use serverId as the key and it starts as null, Alpine assigns identical keys to every pending assistant message, and when the ID resolves, the DOM re-creates the element from scratch, losing any partially streamed content.
The third: a custom @destroy.window event listener is not a standard Alpine.js lifecycle pattern and will not fire reliably during page navigation. The beforeunload listener registered in init() is the correct mechanism for disconnecting the Echo subscription when the user leaves.
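import Echo from 'laravel-echo';
import Pusher from 'pusher-js';
// Echo's Reverb broadcaster relies on the Pusher protocol client being available globally.
window.Pusher = Pusher;
// This file is bundled as an ES module (import.meta.env requires one),
// so expose the factory globally for the x-data attribute to find.
window.aiChat = function aiChat(userId) {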
return {
messages: [],
input: '',
activeStreams: {},
echo: null,
init() {
this.echo = new Echo({
broadcaster: 'reverb',
key: import.meta.env.VITE_REVERB_APP_KEY,
wsHost: import.meta.env.VITE_REVERB_HOST,
wsPort: import.meta.env.VITE_REVERB_PORT,
forceTLS: false,
enabledTransports: ['ws', 'wss'],
});
this.echo
.private(`ai-chat.${userId}`)
.listen('.token.received', (e) => this.handleToken(e))
.listen('.stream.completed', (e) => this.handleComplete(e))
.listen('.stream.failed', (e) => this.handleFailed(e));
window.addEventListener('beforeunload', () => this.destroy());
},
async send() {
const text = this.input.trim();
if (!text || this.hasActiveStream()) return;
this.input = '';
// Snapshot history BEFORE pushing the new user message.
// Including the current turn here duplicates it in the LLM context
// because it is also passed as `message` in the request body.
const history = this.messages
.filter(m => m.role !== 'assistant' || m.state === 'complete')
.map(m => ({ role: m.role, content: m.content }))
.slice(-20);
this.messages.push({
role: 'user',
content: text,
clientId: crypto.randomUUID(),
serverId: null,
state: 'complete',
});
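this.messages.push({
role: 'assistant',
content: '',
clientId: crypto.randomUUID(), // stable x-for key
serverId: null, // assigned once server responds
state: 'thinking',
});
// Re-read the message through the reactive array: mutating the raw object
// that was pushed would bypass Alpine's proxy and skip re-rendering.
const assistantMessage = this.messages[this.messages.length - 1];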
try {
const response = await fetch('/api/chat/send', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'X-CSRF-TOKEN': document.querySelector(
'meta[name="csrf-token"]'
).content,
},
body: JSON.stringify({ message: text, history }),
});
if (!response.ok) {
throw new Error(`HTTP ${response.status}`);
}
const { message_id } = await response.json();
assistantMessage.serverId = message_id;
this.activeStreams[message_id] = true;
} catch {
assistantMessage.state = 'failed';
assistantMessage.content = 'Failed to send. Please try again.';
}
},
handleToken(e) {
const msg = this.messages.find(m => m.serverId === e.messageId);
// Ignore unknown messages and tokens still in flight after a cancel.
if (!msg || msg.state === 'cancelled') return;
// Gap detection: non-contiguous indices indicate dropped chunks.
if (
msg._lastTokenIndex !== undefined &&
e.tokenIndex !== msg._lastTokenIndex + 1
) {
console.warn(
`[ai-chat] Token gap on ${e.messageId}: ` +
`expected ${msg._lastTokenIndex + 1}, received ${e.tokenIndex}`
);
}
msg._lastTokenIndex = e.tokenIndex;
msg.state = 'streaming';
msg.content += e.token;
},
handleComplete(e) {
const msg = this.messages.find(m => m.serverId === e.messageId);
if (!msg) return;
msg.state = 'complete';
delete this.activeStreams[e.messageId];
},
handleFailed(e) {
const msg = this.messages.find(m => m.serverId === e.messageId);
if (!msg) return;
msg.state = 'failed';
msg.content = e.error;
delete this.activeStreams[e.messageId];
},
async cancel(messageId) {
await fetch(`/api/chat/${messageId}/cancel`, {
method: 'POST',
headers: {
'X-CSRF-TOKEN': document.querySelector(
'meta[name="csrf-token"]'
).content,
},
});
const msg = this.messages.find(m => m.serverId === messageId);
if (msg) {
msg.state = 'cancelled';
delete this.activeStreams[messageId];
}
},
hasActiveStream() {
return Object.keys(this.activeStreams).length > 0;
},
destroy() {
if (this.echo) {
this.echo.disconnect();
}
},
};
}
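The snapshot-before-push ordering in send() can be isolated as a pure function, which makes the rule easy to unit test. This is a minimal sketch under stated assumptions: snapshotHistory and buildRequest are hypothetical helper names, and the 20-message limit mirrors the slice(-20) used in the component.

```javascript
// Hypothetical helper: builds the history payload from completed turns only.
function snapshotHistory(messages, limit = 20) {
  return messages
    .filter((m) => m.role !== 'assistant' || m.state === 'complete')
    .map((m) => ({ role: m.role, content: m.content }))
    .slice(-limit);
}

// Hypothetical helper: snapshot first, then push the new user turn.
function buildRequest(messages, text) {
  const history = snapshotHistory(messages); // taken BEFORE the push
  messages.push({ role: 'user', content: text, state: 'complete' });
  return { message: text, history };
}

// Usage: one completed exchange plus a failed assistant reply.
const messages = [
  { role: 'user', content: 'What is Reverb?', state: 'complete' },
  { role: 'assistant', content: 'A WebSocket server.', state: 'complete' },
  { role: 'assistant', content: 'Failed to send.', state: 'failed' },
];
const request = buildRequest(messages, 'And Echo?');
```

The new turn appears once, as message, and never inside history. The failed assistant reply is filtered out, so error text does not leak into the model context.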
[Edge Case Alert] Echo channel subscription must be confirmed before the send button accepts input, or early tokens from a fast LLM response will arrive before the listener is registered and be silently dropped. This failure mode appears specifically under fast network conditions where the WebSocket handshake and the LLM’s first token arrive within milliseconds of each other. Gate the send button on Echo’s subscribed callback rather than on component initialisation alone: the channel returned by .listen('.token.received', ...) exposes a chainable .subscribed(() => { this.connected = true; }) callback.
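A related window exists between the POST being sent and its response arriving: handleToken looks messages up by serverId, so a token that reaches the browser before the response has supplied that ID finds no match and is dropped. One possible mitigation, sketched here with hypothetical names (createTokenRouter, receive, register), is to buffer events for unknown IDs and replay them once the ID is known.

```javascript
// Hypothetical helper: routes token events to message objects and buffers
// events whose message ID has not been registered yet.
function createTokenRouter() {
  const targets = new Map(); // messageId -> message object
  const pending = new Map(); // messageId -> buffered events

  function deliver(msg, event) {
    msg.state = 'streaming';
    msg.content += event.token;
  }

  return {
    // Call from the token listener.
    receive(event) {
      const msg = targets.get(event.messageId);
      if (msg) {
        deliver(msg, event);
        return;
      }
      if (!pending.has(event.messageId)) pending.set(event.messageId, []);
      pending.get(event.messageId).push(event);
    },
    // Call once the POST response has supplied the message ID.
    register(messageId, msg) {
      targets.set(messageId, msg);
      const buffered = pending.get(messageId) || [];
      pending.delete(messageId);
      buffered
        .sort((a, b) => a.tokenIndex - b.tokenIndex)
        .forEach((event) => deliver(msg, event));
    },
    pendingCount() {
      return pending.size;
    },
  };
}

// Usage: two tokens arrive before the ID is known, one after.
const router = createTokenRouter();
router.receive({ messageId: 'm1', tokenIndex: 0, token: 'Hel' });
router.receive({ messageId: 'm1', tokenIndex: 1, token: 'lo' });
const assistant = { content: '', state: 'thinking' };
router.register('m1', assistant);
router.receive({ messageId: 'm1', tokenIndex: 2, token: ' world' });
```

In a real component the buffer would need an eviction rule for IDs that are never registered, for example when the POST itself fails. An alternative that avoids buffering altogether is to generate the message ID in the browser and send it with the request, provided the server validates it.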
[Efficiency Gain] The _lastTokenIndex gap detection logs dropped chunks before users report them. Consistent gaps at the same index across multiple sessions point to a queue worker memory boundary or timeout issue at a specific chunk position, not random network loss. Pipe these warnings to your error tracking platform alongside the message_id so you can correlate with Horizon job logs.
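To forward gaps to an error tracker rather than only to the console, the detection can be pulled out into a small tracker that calls a reporter function. This is a sketch with hypothetical names (createGapTracker, report); the reporter stands in for whatever client your error tracking platform provides. Unlike the inline check in the component, it also flags a stream whose first token is not index 0.

```javascript
// Hypothetical helper: tracks the last token index per message and calls
// `report` whenever an incoming index is not the one expected next.
function createGapTracker(report) {
  const lastIndex = new Map(); // messageId -> last tokenIndex seen

  return function track(event) {
    const previous = lastIndex.has(event.messageId)
      ? lastIndex.get(event.messageId)
      : -1;
    const expected = previous + 1;

    if (event.tokenIndex !== expected) {
      report({
        messageId: event.messageId,
        expected,
        received: event.tokenIndex,
      });
    }
    lastIndex.set(event.messageId, event.tokenIndex);
  };
}

// Usage: collect reports in an array in place of a real error tracker.
const reports = [];
const track = createGapTracker((gap) => reports.push(gap));
track({ messageId: 'a', tokenIndex: 0 });
track({ messageId: 'a', tokenIndex: 1 });
track({ messageId: 'a', tokenIndex: 3 }); // index 2 never arrived
track({ messageId: 'b', tokenIndex: 1 }); // stream b lost its first token
```

Each report carries the message ID plus the expected and received indices, which is the minimum needed to line a client-side gap up against the job log for the same message.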
The Blade Template
The x-for directive uses clientId as the stable key so Alpine does not re-create message elements when serverId is assigned after the server responds.
{{-- Alpine calls init() on the x-data object automatically; adding x-init="init()" would run it twice. --}}
<div x-data="aiChat({{ auth()->id() }})">
<div id="message-list">
<template x-for="message in messages" :key="message.clientId">
<div :class="'message message--' + message.role">
<div x-show="message.state === 'thinking'"
class="typing-indicator"
aria-label="Waiting for response">
<span></span><span></span><span></span>
</div>
<div x-show="message.state !== 'thinking'"
x-text="message.content"
:class="{
'message--streaming': message.state === 'streaming',
'message--failed': message.state === 'failed',
'message--cancelled': message.state === 'cancelled',
}">
</div>
<button x-show="message.state === 'streaming' && message.serverId"
@click="cancel(message.serverId)"
class="cancel-btn"
type="button">
Stop
</button>
</div>
</template>
</div>
<div class="input-area">
<textarea x-model="input"
@keydown.enter.prevent="send()"
:disabled="hasActiveStream()"
placeholder="Type a message..."
rows="1">
</textarea>
<button @click="send()"
:disabled="hasActiveStream() || !input.trim()"
class="send-btn"
type="button">
Send
</button>
</div>
</div>
The typing indicator CSS and the Redis-backed cancellation pattern used here are covered in depth in the streaming UX states guide, which also documents elapsed-time surface and token count display during active streams. Both compose cleanly with the state field on each message object in this component.
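The five values of the state field (thinking, streaming, complete, failed, cancelled) can be guarded with a small transition table so that a late event cannot pull a message out of a terminal state. The table below is an assumption about which moves should be legal, not something taken from the component, and TRANSITIONS and transition are hypothetical names.

```javascript
// Hypothetical transition table: terminal states allow no further moves.
const TRANSITIONS = {
  thinking: ['streaming', 'complete', 'failed', 'cancelled'],
  streaming: ['streaming', 'complete', 'failed', 'cancelled'],
  complete: [],
  failed: [],
  cancelled: [],
};

// Applies the move only if the table allows it; returns whether it happened.
function transition(msg, next) {
  const allowed = TRANSITIONS[msg.state] || [];
  if (!allowed.includes(next)) return false;
  msg.state = next;
  return true;
}

// Usage: a token that arrives after a cancel is refused.
const message = { state: 'thinking' };
const startedStreaming = transition(message, 'streaming');
const cancelledOk = transition(message, 'cancelled');
const lateToken = transition(message, 'streaming');
```

Handlers would call transition() and skip their side effects when it returns false, which keeps a cancelled message from flipping back to streaming if a queued broadcast lands after the user pressed Stop.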
Queue Configuration for Streaming Workloads
The ai-streaming queue needs a dedicated Supervisor block. Three settings have direct impact on stream reliability.
--timeout must exceed the longest expected stream duration plus a buffer. A 120-second job timeout with a 130-second Supervisor timeout is a reasonable starting point. Tighten it once you have production latency data: unnecessary headroom delays failure detection when the LLM API hangs rather than errors.
--tries on the Supervisor process is configured separately from $tries on the job class, and the job-level value takes precedence when both are present. Set both to 2. A failed streaming job retrying on a degraded LLM API connection will stack blocking Guzzle connections rather than releasing them, accumulating TCP hold time that manifests as cascading latency across the queue.
--max-jobs controls worker longevity. Streaming jobs hold open HTTP connections and accumulate buffered tokens in memory over their lifetime. In practice at sustained load, workers processing long-context streams show gradual heap growth that does not fully release between jobs. Setting --max-jobs=50 restarts the worker after a batch, preventing that growth from reaching OOM territory.
<?php
// config/horizon.php (excerpt)
'environments' => [
'production' => [
'ai-streaming-supervisor' => [
'connection' => 'redis',
'queue' => ['ai-streaming'],
'balance' => 'auto',
'processes' => 8,
'tries' => 2,
'timeout' => 130,
'maxJobs' => 50,
],
],
],
For the full Horizon configuration covering auto-scaling thresholds, memory limits, and queue priority across mixed workloads, the Laravel Horizon production guide covers the complete setup in detail.
The Cache-flag cancel mechanism in StreamAiResponseJob is the same pattern used in building a human-in-the-loop approval workflow for agentic pipelines. That article covers approval gate state management for multi-step agents, which extends naturally from the stream cancellation model shown here when you need user intervention mid-generation rather than a hard stop.
Choosing the Right Tool
At the start of this article, we drew a line: Livewire owns the server-rendered model; Echo and Alpine own the client-driven one. With the full interface built, that line should feel concrete rather than theoretical.
The implementation above manages 5 distinct message states in the browser, tracks multiple stream IDs simultaneously, and disconnects cleanly when the user leaves the page, none of which requires a server round-trip. The controller is two short methods. The queue worker handles all blocking I/O. The HTTP layer is free on every request. That is the architecture you reach for when the interface genuinely needs it.
Livewire is not the wrong choice by default. It is the right choice when the interface is form-driven, the component state is primarily server-owned, and the streaming concerns are isolated to a single response surface. The Livewire real-time chat implementation is shorter, simpler to test, and carries no WebSocket infrastructure requirement. For most AI-assisted Laravel features (autocomplete, inline suggestions, single-turn generation), it is the correct default.
The Echo and Alpine pattern earns its complexity when you need per-message stream state, cancellation, gap detection, and the flexibility to add presence or bidirectional events later without rebuilding the transport layer. If you find yourself fighting Livewire’s component lifecycle to manage those concerns, that friction is the signal. Switch there, not before.
Frequently Asked Questions
Why use WebSockets via Reverb rather than SSE for this pattern?
SSE is a one-way server-to-client channel over a persistent HTTP connection. That connection ties up a PHP-FPM worker for the entire stream duration. Reverb over WebSocket moves the blocking call into a queue worker, freeing your web tier. The tradeoff is operational complexity: you need Reverb running and a queue worker configured for the streaming workload.
What happens if the user closes the tab mid-stream?
The Echo connection disconnects, but the queue job continues running until it completes or the Cache cancel flag is set. The completed response is persisted to AgentConversation regardless. Alpine state does not survive a closed tab, so when the user returns, the finished response is only available if you reload history from the database.
How do I prevent a user from submitting while a stream is active?
The hasActiveStream() check on the send button and textarea disables both during an active stream. This also prevents the conversation history from being snapshotted mid-stream, which would include an incomplete assistant message in the next request’s context.
Can I support concurrent streams to the same user?
Yes. The activeStreams object tracks multiple message IDs simultaneously. Remove the hasActiveStream() guard from send() and from the send button and textarea bindings, then handle each stream independently. The serverId lookup in handleToken routes each broadcast event to the correct message regardless of concurrent streams.
What is the correct tries setting for the streaming job?
Two. Higher values cause duplicate token broadcasts on retry, producing garbled output. The failed() hook on the job class dispatches the error event after the final attempt is exhausted, so the client always receives a terminal state.
A software architect with 15+ years of experience in the PHP and Laravel ecosystem. Dewald created Origin Main to provide the engineering rigour required to integrate AI into professional, high-concurrency production systems. He writes for developers who care less about "getting it to work" and more about "getting it to last".

