Reverse Engineering Google's Speech To Text API (v2)


Google Speech API v2:


Google has since launched its official Google Cloud Speech API. I strongly recommend looking over there.



output: json (xml is not supported)

lang: any valid locale (en-us, nl-be, fr-fr, etc.)

key: required. Please get one from the Google Developers Console.

app: optional

You can specify an optional query string called

, which returns some extra transcripts for some reason.

client: optional, seems to do nothing in particular
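
As a minimal sketch, the parameters above could be assembled into a request URL like this. The function name `build_recognize_url` is made up for illustration, and the endpoint is deliberately left as an argument because the base URL is not stated in this section.

```python
from urllib.parse import urlencode


def build_recognize_url(base_url, key, lang="en-us", app=None, client=None):
    """Append the documented query parameters to base_url.

    base_url is supplied by the caller; this sketch does not assume
    what the real endpoint is.
    """
    if not key:
        # The key is not optional.
        raise ValueError("an API key is required")
    # output is always json, since xml is not supported.
    params = {"output": "json", "lang": lang, "key": key}
    # app and client are optional, so only send them when given.
    if app is not None:
        params["app"] = app
    if client is not None:
        params["client"] = client
    return f"{base_url}?{urlencode(params)}"
```

`urlencode` takes care of escaping, so a key containing reserved characters still produces a valid query string.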



FLAC file: 44100 Hz, 32-bit float, exported with Audacity. Check the audio folder in this repository for some hilarious examples.

Channels       : 2
Sample Rate    : 44100
Precision      : 32-bit
Sample Encoding: 32-bit Float

16-bit PCM

The following audio options are confirmed working for 16-bit PCM sample encoding:

Channels       : 1
Sample Rate    : 16000
Precision      : 16-bit
Sample Encoding: 16-bit Signed Integer PCM
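
To check a recording against the settings above before uploading it, something like the following sketch may help. It uses Python's standard `wave` module; the helper name `check_pcm16_wav` is hypothetical.

```python
import wave


def check_pcm16_wav(path, channels=1, rate=16000):
    """Return a list of mismatches between a WAV file and the expected settings."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != channels:
            problems.append(f"channels: {wav.getnchannels()} (expected {channels})")
        if wav.getframerate() != rate:
            problems.append(f"sample rate: {wav.getframerate()} (expected {rate})")
        # getsampwidth() is in bytes, so 2 bytes means 16-bit samples.
        if wav.getsampwidth() != 2:
            problems.append(f"precision: {wav.getsampwidth() * 8}-bit (expected 16-bit)")
    return problems
```

An empty list means the file matches the confirmed-working configuration.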

One-line sox recording command:

rec --encoding signed-integer --bits 16 --channels 1 --rate 16000 test.wav



Content-Type: audio/x-flac; rate=44100;

Set the rate to match the sample rate of the FLAC file (generally 44100 Hz). Other rates are supported as well.

Content-Type: audio/l16; rate=16000;
is also supported, with a rate of 44100 Hz or 16000 Hz, for files encoded as 16-bit signed-integer LPCM.

NOTE: Make sure the rate in your header matches the sample rate you used for your audio capture.
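
One way to avoid a mismatch is to derive the header from the file itself. The sketch below reads the sample rate from a 16-bit PCM WAV file with the standard `wave` module; `content_type_for_wav` is a hypothetical helper name, and the header format simply mirrors the `audio/l16` example above.

```python
import wave


def content_type_for_wav(path):
    """Build an audio/l16 Content-Type value whose rate matches the file."""
    with wave.open(path, "rb") as wav:
        # 2 bytes per sample means 16-bit audio, which audio/l16 requires.
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples for audio/l16")
        return f"audio/l16; rate={wav.getframerate()};"
```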


A User-Agent header is not required, but for spoofing purposes you can use one of Chrome's user agent strings.


When Google is 100% confident in its transcription, it will return the following object:

               "transcript":"good morning Google how are you feeling today"

When it's doubtful, it adds a confidence parameter for you. It also seems to add multiple transcripts for some reason.

          "transcript":"this is a test",
          "transcript":"this is a test for"

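Since the full response envelope is not shown here, the sketch below does not assume one. It walks whatever JSON comes back and collects every object that has a `transcript` key. Two things are assumptions: that the confidence value lives under a key named `confidence`, and that the body may contain more than one JSON document, one per line. `collect_transcripts` is a hypothetical name.

```python
import json


def collect_transcripts(body):
    """Return (transcript, confidence or None) pairs found anywhere in body."""
    found = []

    def walk(node):
        if isinstance(node, dict):
            if "transcript" in node:
                # "confidence" is an assumed key name; it is absent when
                # the service does not report one.
                found.append((node["transcript"], node.get("confidence")))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    try:
        documents = [json.loads(body)]
    except json.JSONDecodeError:
        # Assumption: fall back to one JSON document per non-empty line.
        documents = [json.loads(line) for line in body.splitlines() if line.strip()]
    for document in documents:
        walk(document)
    return found
```

Picking the first pair, or the one with the highest confidence, is then up to the caller.
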

Install sox

On OS X with Homebrew installed:

brew install sox

Record audio

rec --encoding signed-integer --bits 16 --channels 1 --rate 16000 test.wav

Send the request

curl -X POST \
--data-binary @'audio/hello (16bit PCM).wav' \
--header 'Content-Type: audio/l16; rate=16000;' \

Or for FLAC encoded audio:

curl -X POST \
--data-binary @audio/good-morning-google.flac \
--header 'Content-Type: audio/x-flac; rate=44100;' \
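
For reference, a rough Python counterpart to the curl calls could look like the sketch below, using only the standard library. The request URL is passed in by the caller rather than hard-coded, and `build_speech_request` is a hypothetical helper name. The function only builds the request; it does not send it.

```python
import urllib.request


def build_speech_request(url, audio_bytes, content_type, user_agent=None):
    """Build (but do not send) a POST request carrying raw audio bytes."""
    headers = {"Content-Type": content_type}
    # A User-Agent is optional; set one only if the caller provides it.
    if user_agent:
        headers["User-Agent"] = user_agent
    return urllib.request.Request(
        url, data=audio_bytes, headers=headers, method="POST"
    )
```

Passing the result to `urllib.request.urlopen` would perform the upload, with the file contents read beforehand via `open(path, "rb").read()`.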


Here are a few caveats you have to know about, should you decide to use this API in a production environment. (I don't recommend it)

  • The API only accepts up to ~10-15 seconds of audio.
  • With a Speech API key you generate yourself, you can only make 50 requests per day.
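
One possible workaround for the length limit is to cut longer recordings into short pieces before uploading. The sketch below splits a PCM WAV file into chunks of at most 10 seconds using the standard `wave` module. `split_wav_frames` is a hypothetical name, the 10-second default is just the lower end of the range above, and cutting at fixed offsets can split a word in half, so treat this as a starting point only.

```python
import wave


def split_wav_frames(path, max_seconds=10):
    """Yield raw PCM byte chunks, each covering at most max_seconds of audio."""
    with wave.open(path, "rb") as wav:
        # Number of frames that fit into one chunk at this sample rate.
        frames_per_chunk = int(wav.getframerate() * max_seconds)
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk
```

Each chunk counts as a separate request, so a long recording also uses up the daily quota faster.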
