# Text-to-Speech

<api>POST /api/v2/generate</api>

> Convert text to natural-sounding audio with ElevenLabs, OpenAI TTS, and MiniMax

## Request

<ParamField header="Authorization" type="string" required>
  Your API key, sent as a bearer token: `Authorization: Bearer nb_YOUR_API_KEY`
</ParamField>

<ParamField body="model" type="string" required>
  TTS model slug:

  * `minimax-tts` — Free, Chinese/English, very natural
  * `openai-tts` — OpenAI standard voices
  * `openai-tts-hd` — OpenAI HD quality voices
  * `gpt-4o-mini-tts` — GPT-4o Mini TTS
  * `elevenlabs-flash` — ElevenLabs fast (low latency)
  * `elevenlabs-v2` — ElevenLabs Multilingual v2 (highest quality)
</ParamField>

<ParamField body="text" type="string" required>
  The text to synthesize. The maximum length depends on the model (typically 5,000 characters).
</ParamField>

<ParamField body="voice_id" type="string">
  Voice identifier. The available voices depend on the model; see the Available Voices section below.
</ParamField>

<ParamField body="speed" type="number" default="1.0">
  Speaking speed multiplier. Range: `0.5` – `2.0`.
</ParamField>

<ParamField body="format" type="string" default="mp3">
  Output audio format: `mp3`, `wav`, or `ogg`.
</ParamField>
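
The constraints above (speed between `0.5` and `2.0`, one of three output formats, optional `voice_id`) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_tts_payload` is a hypothetical helper name, and rejecting empty text is an assumption based on `text` being marked as required.

```python
import json

# Limits copied from the parameter descriptions above.
SPEED_MIN, SPEED_MAX = 0.5, 2.0
FORMATS = {"mp3", "wav", "ogg"}


def build_tts_payload(model, text, voice_id=None, speed=1.0, fmt="mp3"):
    """Return a request body dict; raise ValueError on out-of-range values.

    Hypothetical helper for illustration, not an official client function.
    """
    if not text:
        # Assumption: an empty string is treated as a missing required field.
        raise ValueError("text must not be empty")
    if not SPEED_MIN <= speed <= SPEED_MAX:
        raise ValueError(f"speed must be between {SPEED_MIN} and {SPEED_MAX}")
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {sorted(FORMATS)}")

    payload = {"model": model, "text": text, "speed": speed, "format": fmt}
    if voice_id is not None:
        # voice_id is optional, so it is omitted rather than sent as null.
        payload["voice_id"] = voice_id
    return payload


payload = build_tts_payload(
    "openai-tts", "Hello from Elumenta.", voice_id="alloy", speed=1.25
)
print(json.dumps(payload, indent=2))
```

Validating locally like this avoids spending a round trip on a request that would be rejected; the server remains the source of truth for per-model limits such as text length.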

## Available Voices

### OpenAI TTS

`alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`

### ElevenLabs

ElevenLabs supports hundreds of voices. Pass a common voice name such as `rachel`, `adam`, `bella`, or `josh`, or pass any ElevenLabs voice ID directly.

### MiniMax TTS

`male-qn-qingse`, `male-qn-jingying`, `female-shaonv`, `female-yujie` (and more)

## Request Example

<CodeGroup>
  ```bash ElevenLabs theme={null}
  curl -X POST https://elumenta.ru/api/v2/generate \
    -H "Authorization: Bearer nb_YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "elevenlabs-v2",
      "text": "Welcome to Elumenta. Your AI-powered creative studio.",
      "voice_id": "rachel",
      "speed": 1.0,
      "format": "mp3"
    }'
  ```

  ```bash OpenAI TTS HD theme={null}
  curl -X POST https://elumenta.ru/api/v2/generate \
    -H "Authorization: Bearer nb_YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai-tts-hd",
      "text": "Generating voices with OpenAI TTS HD.",
      "voice_id": "nova",
      "format": "mp3"
    }'
  ```

  ```python Python theme={null}
  import requests

  res = requests.post(
      "https://elumenta.ru/api/v2/generate",
      headers={"Authorization": "Bearer nb_YOUR_API_KEY"},
      json={
          "model": "elevenlabs-v2",
          "text": "Hello, this is a test of the Elumenta TTS API.",
          "voice_id": "rachel",
          "format": "mp3"
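      },
      timeout=60,  # avoid hanging indefinitely on a stalled connection
  )
  res.raise_for_status()  # raise on HTTP 4xx/5xx instead of parsing an error body

  data = res.json()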
  print(f"Audio URL: {data['url']}")
  print(f"Duration: {data['duration_seconds']}s")
  ```
</CodeGroup>

## Response

```json theme={null}
{
  "id": "gen_01j9x2tts001",
  "status": "completed",
  "model": "elevenlabs-v2",
  "url": "https://storage.elumenta.ru/generations/gen_01j9x2tts001.mp3",
  "duration_seconds": 3.4,
  "characters": 54,
  "tokens_used": 5,
  "balance_remaining": 284,
  "created_at": "2026-03-08T12:10:00Z"
}
```
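
A client will usually want to save the generated file referenced by `url`. The sketch below derives a local file name from the response and downloads the audio with the standard library. It rests on two assumptions that the response above does not confirm: that the storage URL can be fetched without extra authentication, and that any `status` other than `completed` means the audio is not ready. `audio_filename` and `download_audio` are hypothetical helper names.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def audio_filename(response):
    """Return the file name taken from the response's `url` field."""
    if response.get("status") != "completed":
        # Assumption: only "completed" responses carry a usable URL.
        raise RuntimeError(f"generation not completed: {response.get('status')!r}")
    return PurePosixPath(urlparse(response["url"]).path).name


def download_audio(response, dest_dir="."):
    """Download the audio into dest_dir and return the saved path (network call)."""
    target = Path(dest_dir) / audio_filename(response)
    with urlopen(response["url"], timeout=30) as src:
        target.write_bytes(src.read())
    return target


# Fields copied from the sample response above; nothing is downloaded here.
sample = {
    "id": "gen_01j9x2tts001",
    "status": "completed",
    "url": "https://storage.elumenta.ru/generations/gen_01j9x2tts001.mp3",
    "duration_seconds": 3.4,
}
print(audio_filename(sample))  # gen_01j9x2tts001.mp3
```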

***

# Speech-to-Text

<api>POST /api/v2/generate</api>

Transcribe audio files to text using Whisper, GPT-4o Transcribe, or ElevenLabs Scribe.

## Request

<ParamField body="model" type="string" required>
  STT model:

  * `whisper` — Fast, multilingual, free
  * `gpt-4o-transcribe` — Highest accuracy
  * `elevenlabs-scribe` — Best for podcasts and meetings (diarization support)
</ParamField>

<ParamField body="audio_url" type="string" required>
  URL of the audio file (MP3, WAV, OGG, M4A, or FLAC). Maximum file size: 25 MB.
</ParamField>

<ParamField body="language" type="string">
  ISO 639-1 language code (e.g. `en`, `ru`, `de`). If omitted, the model auto-detects.
</ParamField>

<ParamField body="diarize" type="boolean" default="false">
  Enables speaker diarization (labeling who said what). Only available with `elevenlabs-scribe`.
</ParamField>
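
Two of the rules above lend themselves to a client-side check: `diarize` is limited to `elevenlabs-scribe`, and `language` is a two-letter ISO 639-1 code. The following sketch shows one way to enforce them before sending a request. `build_stt_payload` is a hypothetical helper, and the two-ASCII-letter check only approximates ISO 639-1 (it does not verify that the code is actually assigned).

```python
def build_stt_payload(model, audio_url, language=None, diarize=False):
    """Return a speech-to-text request body; raise ValueError on invalid combinations.

    Hypothetical helper for illustration, not an official client function.
    """
    if diarize and model != "elevenlabs-scribe":
        raise ValueError("diarize is only available with elevenlabs-scribe")
    if language is not None:
        # Shape check only: two ASCII letters, as in `en`, `ru`, `de`.
        if len(language) != 2 or not (language.isascii() and language.isalpha()):
            raise ValueError("language must be a two-letter ISO 639-1 code")

    payload = {"model": model, "audio_url": audio_url}
    if language is not None:
        payload["language"] = language.lower()
    if diarize:
        # Sent only when enabled, since the documented default is false.
        payload["diarize"] = True
    return payload


print(build_stt_payload(
    "elevenlabs-scribe", "https://example.com/meeting.mp3", diarize=True
))
```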

## Request Example

```bash theme={null}
curl -X POST https://elumenta.ru/api/v2/generate \
  -H "Authorization: Bearer nb_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "whisper",
    "audio_url": "https://example.com/interview.mp3",
    "language": "en"
  }'
```
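
For reference, the same call can be made from Python using only the standard library. This is a sketch rather than an official client: `make_stt_request` and `transcribe` are hypothetical names, and the 60-second timeout is an arbitrary choice. The example constructs the request without sending it, so it runs without a valid key.

```python
import json
from urllib.request import Request, urlopen

API_URL = "https://elumenta.ru/api/v2/generate"


def make_stt_request(api_key, audio_url, model="whisper", language=None):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {"model": model, "audio_url": audio_url}
    if language:
        body["language"] = language
    return Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def transcribe(api_key, audio_url, **kwargs):
    """Send the request and return the decoded JSON response (network call)."""
    request = make_stt_request(api_key, audio_url, **kwargs)
    with urlopen(request, timeout=60) as res:
        return json.load(res)


req = make_stt_request(
    "nb_YOUR_API_KEY", "https://example.com/interview.mp3", language="en"
)
print(req.get_method(), req.full_url)
```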

## Response

```json theme={null}
{
  "id": "gen_01j9x2stt001",
  "status": "completed",
  "model": "whisper",
  "text": "Hello and welcome to today's episode...",
  "language": "en",
  "duration_seconds": 142.5,
  "tokens_used": 0,
  "created_at": "2026-03-08T12:15:00Z"
}
```
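
The response fields can be wrapped in a small typed object to make downstream code easier to read. The sketch below assumes the field names shown in the sample response and, as a further assumption, treats any `status` other than `completed` as an error. `Transcript` is a hypothetical class, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    text: str
    language: str
    duration_seconds: float

    @classmethod
    def from_response(cls, response):
        """Build a Transcript from a decoded JSON response dict."""
        if response.get("status") != "completed":
            # Assumption: other statuses mean the text is not available yet.
            raise RuntimeError(f"transcription not completed: {response.get('status')!r}")
        return cls(
            text=response["text"],
            language=response["language"],
            duration_seconds=float(response["duration_seconds"]),
        )

    def duration_label(self):
        """Format the audio duration as M:SS, dropping fractional seconds."""
        minutes, seconds = divmod(int(self.duration_seconds), 60)
        return f"{minutes}:{seconds:02d}"


# Fields copied from the sample response above.
sample = {
    "id": "gen_01j9x2stt001",
    "status": "completed",
    "model": "whisper",
    "text": "Hello and welcome to today's episode...",
    "language": "en",
    "duration_seconds": 142.5,
}
transcript = Transcript.from_response(sample)
print(transcript.language, transcript.duration_label())  # en 2:22
```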
