
A guide on how to transcribe audio with ElevenLabs

Overview

With Speech to Text, you can transcribe spoken audio into text with state-of-the-art accuracy. Automatic language detection lets you transcribe audio in any of the 99 supported languages without specifying the language up front.

Creating a transcript

1. Upload audio

In the ElevenLabs dashboard, navigate to the Speech to Text page and click the “Transcribe files” button. From the modal, you can upload an audio or video file to transcribe.


2. Select options

Select the primary language of the audio and the maximum number of speakers. If you don’t know either, leave the defaults, which attempt to detect the language and the number of speakers automatically.

Finally, choose whether to tag audio events such as laughter or applause, then click the “Transcribe” button.
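If you script the same choices rather than clicking through the modal, the three options above can be modeled as a small settings object. This is a minimal sketch, not the ElevenLabs SDK: the class `TranscribeOptions` and the field names `language_code`, `num_speakers`, and `tag_audio_events` are assumptions chosen for illustration, and `None` stands for "leave the default so detection is automatic". Check the API reference for the real parameter names before sending anything.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranscribeOptions:
    """Hypothetical container mirroring the options in the upload modal."""

    language_code: Optional[str] = None  # e.g. "eng"; None means auto-detect
    num_speakers: Optional[int] = None   # maximum speakers; None means auto-detect
    tag_audio_events: bool = False       # default here is arbitrary, for this sketch only

    def to_fields(self) -> dict:
        """Return only the explicitly set options so unset ones keep their defaults."""
        if self.num_speakers is not None and self.num_speakers < 1:
            raise ValueError("num_speakers must be at least 1")
        fields = {"tag_audio_events": self.tag_audio_events}
        if self.language_code is not None:
            fields["language_code"] = self.language_code.lower()
        if self.num_speakers is not None:
            fields["num_speakers"] = self.num_speakers
        return fields
```

For example, `TranscribeOptions(language_code="eng", num_speakers=2).to_fields()` yields a dictionary with all three keys, while `TranscribeOptions().to_fields()` carries only the audio-event flag and leaves language and speaker count to automatic detection.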

3. View results

Click the name of the audio file you uploaded in the center pane to view the results. Click any word to start audio playback from that point.

Click the “Export” button in the top right to download the results in a variety of formats.
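Because each word in a transcript is tied to a position in the audio, word-level timing can also be turned into subtitle cues by hand. The sketch below assumes a hypothetical word record with `text`, `start`, and `end` keys (times in seconds) and uses SRT only as a familiar subtitle format; this guide does not list which formats the Export button offers, and the record shape is not taken from the product.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list, max_words: int = 8) -> str:
    """Group timed words into numbered SRT cues of at most max_words words each."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = format_srt_timestamp(chunk[0]["start"])
        end = format_srt_timestamp(chunk[-1]["end"])
        text = " ".join(w["text"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + ("\n" if cues else "")
```

Grouping by a fixed word count keeps the sketch short; a real subtitle export would more likely break cues on pauses, punctuation, or speaker changes.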

Transcript Editor

Once you’ve created a transcript, you can edit it in our Transcript Editor. Learn more about it in this guide.

FAQ

What languages are supported?

The Scribe v1 model supports the following 99 languages:

Afrikaans (afr), Amharic (amh), Arabic (ara), Armenian (hye), Assamese (asm), Asturian (ast), Azerbaijani (aze), Belarusian (bel), Bengali (ben), Bosnian (bos), Bulgarian (bul), Burmese (mya), Cantonese (yue), Catalan (cat), Cebuano (ceb), Chichewa (nya), Croatian (hrv), Czech (ces), Danish (dan), Dutch (nld), English (eng), Estonian (est), Filipino (fil), Finnish (fin), French (fra), Fulah (ful), Galician (glg), Ganda (lug), Georgian (kat), German (deu), Greek (ell), Gujarati (guj), Hausa (hau), Hebrew (heb), Hindi (hin), Hungarian (hun), Icelandic (isl), Igbo (ibo), Indonesian (ind), Irish (gle), Italian (ita), Japanese (jpn), Javanese (jav), Kabuverdianu (kea), Kannada (kan), Kazakh (kaz), Khmer (khm), Korean (kor), Kurdish (kur), Kyrgyz (kir), Lao (lao), Latvian (lav), Lingala (lin), Lithuanian (lit), Luo (luo), Luxembourgish (ltz), Macedonian (mkd), Malay (msa), Malayalam (mal), Maltese (mlt), Mandarin Chinese (zho), Māori (mri), Marathi (mar), Mongolian (mon), Nepali (nep), Northern Sotho (nso), Norwegian (nor), Occitan (oci), Odia (ori), Pashto (pus), Persian (fas), Polish (pol), Portuguese (por), Punjabi (pan), Romanian (ron), Russian (rus), Serbian (srp), Shona (sna), Sindhi (snd), Slovak (slk), Slovenian (slv), Somali (som), Spanish (spa), Swahili (swa), Swedish (swe), Tamil (tam), Tajik (tgk), Telugu (tel), Thai (tha), Turkish (tur), Ukrainian (ukr), Umbundu (umb), Urdu (urd), Uzbek (uzb), Vietnamese (vie), Welsh (cym), Wolof (wol), Xhosa (xho) and Zulu (zul).

Can I upload video files?

Yes, you can upload both audio and video files. The maximum file size for either is 3 GB.
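When preparing many files, it can help to check sizes locally before uploading. The following is a minimal sketch: the helper names are hypothetical, and the limit is read as 3 × 10^9 bytes, which is an assumption (the stricter of the two common readings of "3 GB").

```python
import os

# Assumption: "3 GB" interpreted as decimal gigabytes, the stricter reading.
MAX_UPLOAD_BYTES = 3 * 10**9


def is_within_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if a file of size_bytes fits under the upload limit."""
    return 0 <= size_bytes <= limit


def check_upload_size(path: str, limit: int = MAX_UPLOAD_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if not is_within_limit(size, limit):
        raise ValueError(f"{path} is {size} bytes, over the {limit}-byte limit")
    return size
```

Calling `check_upload_size("interview.mp4")` on a local file (the file name is just an example) either returns its size or fails early, before any upload time is spent.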

Can I rename speakers?

Yes, you can rename speakers by clicking the “edit” button next to the “Speakers” label.
