POST /api/v2/stt
Speech-to-Text

Multipart Request

curl -X POST https://neuralbox.top/api/v2/stt \
  -H "Authorization: Bearer nb_YOUR_API_KEY" \
  -F "audio=@recording.mp3" \
  -F "model=whisper" \
  -F "language=en"
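
The same multipart request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names `build_multipart` and `transcribe` are hypothetical, and sending the file part as `application/octet-stream` assumes the server does not require a specific audio content type.

```python
import json
import urllib.request
import uuid
from pathlib import Path

STT_URL = "https://neuralbox.top/api/v2/stt"  # endpoint from the curl example


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # Assumption: a generic binary content type is acceptable for the audio part.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key, model="whisper", language="en"):
    """POST an audio file to the STT endpoint and return the decoded JSON."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"model": model, "language": language},
        "audio",
        audio.name,
        audio.read_bytes(),
    )
    request = urllib.request.Request(
        STT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `transcribe("recording.mp3", "nb_YOUR_API_KEY")` would send the same fields as the curl command above.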

Models

Slug               Notes                                    Cost
whisper            Fast, multilingual, 99 languages         2 tokens
gpt-4o-transcribe  Highest accuracy                         2 tokens
elevenlabs-scribe  Best for meetings, supports diarization  2 tokens
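
A client may want to check the model slug before spending tokens on a request. The sketch below copies the slugs and costs from the table; the helper `build_fields` is hypothetical, and rejecting `diarize` for models other than elevenlabs-scribe is a client-side assumption based on the table's notes, not documented server behavior.

```python
# Slugs and token costs as listed in the models table.
MODEL_COST_TOKENS = {
    "whisper": 2,
    "gpt-4o-transcribe": 2,
    "elevenlabs-scribe": 2,
}

# Assumption: only the model the docs describe as supporting diarization.
DIARIZATION_MODELS = {"elevenlabs-scribe"}


def build_fields(model="whisper", language=None, diarize=False):
    """Return the text form fields for a request, rejecting invalid combinations."""
    if model not in MODEL_COST_TOKENS:
        raise ValueError(f"unknown model slug: {model!r}")
    if diarize and model not in DIARIZATION_MODELS:
        raise ValueError(f"{model} is not listed as supporting diarization")
    fields = {"model": model}
    if language is not None:
        fields["language"] = language
    if diarize:
        fields["diarize"] = "true"
    return fields
```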

Response

{
  "id": 18510,
  "status": "completed",
  "result_text": "Hello and welcome to today's episode...",
  "tokens_spent": 0,
  "processing_ms": 3420
}
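
One way to consume this response is to decode it into a small typed record. The sketch below uses only the field names shown in the sample; the `Transcription` class and `parse_transcription` function are hypothetical, and treating any status other than "completed" as an error is an assumption, since other status values are not documented here.

```python
import json
from dataclasses import dataclass


@dataclass
class Transcription:
    id: int
    status: str
    result_text: str
    tokens_spent: int
    processing_ms: int


def parse_transcription(raw):
    """Decode a response body; raise if the job is not marked completed."""
    data = json.loads(raw)
    result = Transcription(
        id=data["id"],
        status=data["status"],
        result_text=data.get("result_text", ""),
        tokens_spent=data.get("tokens_spent", 0),
        processing_ms=data.get("processing_ms", 0),
    )
    if result.status != "completed":
        # Assumption: any other status means the text is not usable yet.
        raise ValueError(
            f"transcription {result.id} has status {result.status!r}"
        )
    return result
```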

Diarization (who said what)

Available with elevenlabs-scribe:
curl -X POST https://neuralbox.top/api/v2/stt \
  -H "Authorization: Bearer nb_YOUR_API_KEY" \
  -F "audio=@meeting.mp3" \
  -F "model=elevenlabs-scribe" \
  -F "diarize=true"
The response includes speaker labels, for example:
[Speaker 1]: Hello...
[Speaker 2]: Hi there...
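
Labelled text of this shape can be split into per-speaker turns with a regular expression. This is a sketch that assumes labels always follow the `[Speaker N]:` pattern shown above; the helper name `split_speakers` is hypothetical.

```python
import re

# Assumption: every label looks like "[Speaker 1]:", as in the example.
SPEAKER_TAG = re.compile(r"\[(Speaker \d+)\]:\s*")


def split_speakers(text):
    """Return a list of (speaker, utterance) pairs from labelled text."""
    # With one capture group, re.split yields
    # [prefix, speaker, utterance, speaker, utterance, ...].
    pieces = SPEAKER_TAG.split(text)
    turns = []
    for speaker, utterance in zip(pieces[1::2], pieces[2::2]):
        turns.append((speaker, utterance.strip()))
    return turns
```

Text without any labels produces an empty list, so callers can fall back to the raw transcript in that case.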