
Overview

NeuralBox supports streaming for text generation via Server-Sent Events (SSE). Instead of waiting for the full response, tokens are delivered as they’re generated. Enable streaming by setting "stream": true in your request.

Basic Example

import json

import requests

def stream_text(prompt: str, model: str = "gpt-5"):
    response = requests.post(
        "https://neuralbox.top/api/v2/generate",
        headers={
            "Authorization": "Bearer nb_YOUR_API_KEY",
            "Accept": "text/event-stream"
        },
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True
        },
        stream=True  # tell requests not to buffer the whole response body
    )
    response.raise_for_status()

    full_text = ""
    for line in response.iter_lines():
        if not line:
            continue  # skip SSE keep-alive blank lines
        line = line.decode("utf-8")
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if chunk.get("type") == "content_delta":
            token = chunk["delta"]
            full_text += token
            print(token, end="", flush=True)

    print()  # newline at end
    return full_text

stream_text("Write a short story about a robot learning to paint")

SSE Event Format

Streaming responses are sent as SSE events:
data: {"type": "generation_start", "id": "gen_01j9x2abc123", "model": "gpt-5"}

data: {"type": "content_delta", "delta": "Here"}

data: {"type": "content_delta", "delta": " are"}

data: {"type": "content_delta", "delta": " the"}

data: {"type": "content_delta", "delta": " numbers"}

data: {"type": "generation_end", "id": "gen_01j9x2abc123", "tokens_used": 3, "balance_remaining": 297}

data: [DONE]
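Every event's data: payload is JSON, except the final [DONE] sentinel that signals the end of the stream. The parsing shown inline in the Basic Example can be factored into a small helper; this is a sketch, and the helper name parse_sse_line is ours, not part of the API:

```python
import json

def parse_sse_line(line: str):
    """Parse one SSE line into an event dict.

    Returns None for non-data lines (comments, keep-alives) and for the
    [DONE] sentinel, so callers can simply skip falsy results.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    return json.loads(payload)

# Feed it the sample events from above:
raw_lines = [
    'data: {"type": "generation_start", "id": "gen_01j9x2abc123", "model": "gpt-5"}',
    'data: {"type": "content_delta", "delta": "Here"}',
    'data: [DONE]',
]
parsed = [e for e in (parse_sse_line(l) for l in raw_lines) if e is not None]
```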

Event Types

Event              Description
generation_start   Generation has begun; includes id and model
content_delta      A token chunk; check the delta field
generation_end     Generation complete; includes billing info
error              Something went wrong; includes a message
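The table suggests a simple dispatch on the type field, including surfacing error events rather than silently dropping them. A minimal sketch, assuming the error event's message field holds the human-readable reason (handle_event is our own helper, not part of the API):

```python
def handle_event(chunk: dict, text_parts: list) -> str:
    """Dispatch one parsed SSE event by its "type" field (illustrative)."""
    etype = chunk.get("type")
    if etype == "content_delta":
        text_parts.append(chunk["delta"])
    elif etype == "error":
        # Surface the server-reported failure instead of stopping silently.
        raise RuntimeError(f"Generation failed: {chunk.get('message')}")
    # generation_start / generation_end carry only metadata (id, billing).
    return etype

parts = []
handle_event({"type": "content_delta", "delta": "Hello"}, parts)
```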

React Integration Example

import { useState } from "react";

function StreamingChat() {
  const [output, setOutput] = useState("");
  const [loading, setLoading] = useState(false);

  async function generate(prompt) {
    setLoading(true);
    setOutput("");

    const response = await fetch("/api/chat", {  // your proxy endpoint
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt })
    });

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";  // an SSE event can be split across network chunks

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop();  // keep the (possibly incomplete) last line

      for (const line of lines) {
        if (line.startsWith("data: ") && line !== "data: [DONE]") {
          const chunk = JSON.parse(line.slice(6));
          if (chunk.type === "content_delta") {
            setOutput(prev => prev + chunk.delta);
          }
        }
      }
    }

    setLoading(false);
  }

  return (
    <div>
      <button onClick={() => generate("Hello!")}>Generate</button>
      {loading && <span>Generating...</span>}
      <pre>{output}</pre>
    </div>
  );
}

Which Models Support Streaming?

All text models support streaming. Image, video, and audio models do not — they return a URL when complete.