
LLM Streaming Simulation

Simulate real-time AI response streaming and measure render performance. Adjust speed, jitter, and chunking to test how SvelteMarkdown handles token-by-token updates from LLMs like ChatGPT, Claude, and Gemini.

Live Metrics

Watch chunk throughput and render cost live while switching between the append, concat, and offset patch simulation modes.

  • Progress: 0%
  • Last Render: <1ms
  • Average Render: <1ms
  • Peak Render: <1ms
  • Dropped Frames: 0
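The append, concat, and offset patch modes mentioned above can be sketched as three ways of accumulating chunks before each re-render. This is an illustrative sketch of the general strategies, not the page's actual implementation:

```javascript
// append: keep chunks in an array, join on each render
function makeAppend() {
    const parts = [];
    return {
        push: (chunk) => parts.push(chunk),
        text: () => parts.join('')
    };
}

// concat: grow a single string (simple, but copies on every chunk)
function makeConcat() {
    let text = '';
    return {
        push: (chunk) => { text += chunk; },
        text: () => text
    };
}

// offset patch: the full source is known up front (it's a simulation),
// so just advance an offset and slice -- no per-chunk string growth
function makeOffset(fullSource) {
    let offset = 0;
    return {
        push: (chunk) => { offset = Math.min(offset + chunk.length, fullSource.length); },
        text: () => fullSource.slice(0, offset)
    };
}
```

All three produce the same text at every step; they differ in allocation behavior, which is what the render-cost metrics above make visible.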

How LLM Streaming Works

  • LLMs stream tokens via Server-Sent Events (SSE). SvelteMarkdown re-parses and re-renders on each update, keeping output in sync.
  • Render times stay under 16ms (one 60fps frame budget) at typical LLM speeds of 30-80 tokens/sec.
  • Track token costs across providers with ModelPricing.ai.
  • Building a chat UI? Pair with @humanspeak/svelte-virtual-list for smooth virtual scrolling.
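The SSE flow described above can be sketched as follows. The `data:` payload format shown is OpenAI-style and is an assumption; other providers place the token delta elsewhere. In a real UI, the accumulated string would be assigned to the markdown component's source on each token:

```javascript
// Extract a token from one SSE line, or null if there is none.
// SSE data lines look like: "data: {...json...}" or "data: [DONE]"
function parseSseLine(line) {
    if (!line.startsWith('data: ')) return null;
    const payload = line.slice('data: '.length);
    if (payload === '[DONE]') return null;
    try {
        const event = JSON.parse(payload);
        // OpenAI-style delta location (an assumption; providers differ)
        return event.choices?.[0]?.delta?.content ?? null;
    } catch {
        return null; // ignore malformed events
    }
}

// Fold a stream of SSE lines into the growing markdown source.
// Each non-null token is where a re-parse/re-render would be triggered.
function accumulate(lines) {
    let text = '';
    for (const line of lines) {
        const token = parseSseLine(line);
        if (token !== null) text += token;
    }
    return text;
}
```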

Stream Controls

Configuration

  • Speed: 30 chunks/sec
  • Jitter: 50%
  • Chunk mode
  • Stream mode
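One plausible way the Speed and Jitter controls combine into a per-chunk delay is shown below: a base interval of 1000/speed milliseconds, randomized by ±jitter. This is a sketch of the simulation math, not the page's exact code:

```javascript
// Per-chunk delay in milliseconds for a given speed (chunks/sec)
// and jitter (0..1 fraction). `rand` is injectable for testing.
function chunkDelayMs(chunksPerSec, jitterFraction, rand = Math.random()) {
    const base = 1000 / chunksPerSec;               // 30 chunks/sec -> ~33.3 ms
    const wobble = (rand * 2 - 1) * jitterFraction; // uniform in [-jitter, +jitter]
    return Math.max(0, base * (1 + wobble));
}
```

At the defaults shown (30 chunks/sec, 50% jitter) this yields delays between roughly 16.7ms and 50ms, which is why render cost varies frame to frame.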

Markdown Source

2247 chars

Edit or paste your own markdown below. This content will be streamed token-by-token when you click Start.
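Splitting the source into stream chunks could look like the sketch below. The exact strategies behind the Chunk mode control are not shown on this page, so these two (fixed-size and word-boundary chunking) are illustrative assumptions:

```javascript
// Fixed-size chunking: every chunk is `size` characters
// (except possibly the last).
function chunkByChars(source, size) {
    const chunks = [];
    for (let i = 0; i < source.length; i += size) {
        chunks.push(source.slice(i, i + size));
    }
    return chunks;
}

// Word-boundary chunking: keep trailing whitespace with each word
// so that joining the chunks restores the source exactly.
function chunkByWords(source) {
    return source.match(/\S+\s*/g) ?? [];
}
```

Either way, the invariant that matters for the simulation is lossless round-tripping: joining the chunks must reproduce the source you pasted.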

Rendered Output