# Speech-to-Text

> Transcribe audio files with Whisper, GPT-4o Transcribe, or ElevenLabs Scribe.

## Multipart Request

```bash
curl -X POST https://elumenta.ru/api/v2/stt \
  -H "Authorization: Bearer nb_YOUR_API_KEY" \
  -F "audio=@recording.mp3" \
  -F "model=whisper" \
  -F "language=en"
```
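For callers who are not using curl, the same multipart request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper name `build_stt_request` is hypothetical, the field names (`audio`, `model`, `language`) are taken from the curl example, and treating `language` as optional is an assumption based on the diarization example, which omits it.

```python
import mimetypes
import urllib.request
import uuid
from typing import Optional

STT_URL = "https://elumenta.ru/api/v2/stt"  # endpoint from the curl example


def build_stt_request(audio: bytes, filename: str, api_key: str,
                      model: str = "whisper",
                      language: Optional[str] = "en") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST (hypothetical helper)."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"

    fields = {"model": model}
    if language is not None:  # assumption: the field may be omitted
        fields["language"] = language

    parts = []
    # Plain text form fields, matching -F "model=..." and -F "language=...".
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # The file part, matching -F "audio=@recording.mp3".
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'.encode()
        + audio + b"\r\n"
    )
    # Closing boundary terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        STT_URL,
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )

# Sending the request (requires network access and a real API key):
# with open("recording.mp3", "rb") as f:
#     req = build_stt_request(f.read(), "recording.mp3", "nb_YOUR_API_KEY")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Building the request separately from sending it keeps the encoding logic easy to inspect and test without touching the network.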

## Models

| Slug                | Notes                                   | Cost     |
| ------------------- | --------------------------------------- | -------- |
| `whisper`           | Fast, multilingual, 99 languages        | 2 tokens |
| `gpt-4o-transcribe` | Highest accuracy                        | 2 tokens |
| `elevenlabs-scribe` | Best for meetings, supports diarization | 2 tokens |

## Response

```json
{
  "id": 18510,
  "status": "completed",
  "result_text": "Hello and welcome to today's episode...",
  "tokens_spent": 0,
  "processing_ms": 3420
}
```
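Reading the transcript out of this response could look like the sketch below. The function name `extract_transcript` is hypothetical, and treating any `status` other than `"completed"` as an error is an assumption, since other status values are not documented here.

```python
import json

# Sample body copied from the response example above.
SAMPLE_RESPONSE = """
{
  "id": 18510,
  "status": "completed",
  "result_text": "Hello and welcome to today's episode...",
  "tokens_spent": 0,
  "processing_ms": 3420
}
"""


def extract_transcript(body: str) -> str:
    """Return result_text from a response body (hypothetical helper)."""
    data = json.loads(body)
    # Assumption: only "completed" responses carry a usable transcript.
    if data.get("status") != "completed":
        raise ValueError(f"transcription not completed: status={data.get('status')!r}")
    return data["result_text"]


if __name__ == "__main__":
    print(extract_transcript(SAMPLE_RESPONSE))
```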

## Diarization (who said what)

Diarization is available with the `elevenlabs-scribe` model. Pass `diarize=true` in the form data:

```bash
curl -X POST https://elumenta.ru/api/v2/stt \
  -H "Authorization: Bearer nb_YOUR_API_KEY" \
  -F "audio=@meeting.mp3" \
  -F "model=elevenlabs-scribe" \
  -F "diarize=true"
```

The response includes speaker labels, for example: `[Speaker 1]: Hello... [Speaker 2]: Hi there...`
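To work with individual turns, the labeled text can be split on the speaker markers. The sketch below assumes the labels always follow the `[Speaker N]:` pattern shown above and appear inline in the transcript text; the function name `split_turns` is hypothetical.

```python
import re

# Matches labels such as "[Speaker 1]: " (format assumed from the example above).
SPEAKER_LABEL = re.compile(r"\[Speaker (\d+)\]:\s*")


def split_turns(text: str) -> list[tuple[int, str]]:
    """Split a diarized transcript into (speaker_number, utterance) pairs."""
    matches = list(SPEAKER_LABEL.finditer(text))
    turns = []
    for i, match in enumerate(matches):
        # An utterance runs from the end of its label to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        turns.append((int(match.group(1)), text[match.end():end].strip()))
    return turns


if __name__ == "__main__":
    for speaker, utterance in split_turns("[Speaker 1]: Hello... [Speaker 2]: Hi there..."):
        print(speaker, utterance)
```

Text that contains no speaker labels yields an empty list, so callers can fall back to the raw transcript in that case.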
