Speech-to-text¶
The cxcli tool has various commands that allow you to interact with Google Cloud's Speech-to-Text service using the Cloud Speech-to-Text API.
Is this your first time using this feature?
Before you start using this functionality, please read the authentication page.
Usage¶
You can find the speech-to-text functionality within the cxcli stt subcommand. You can read the documentation about this command here.
The cxcli stt command has a recognize subcommand. You can find the usage of this command here.
Parameters¶
These are the relevant parameters that you can use to interact with Google Cloud STT:
locale: this parameter accepts all of the locales that are available in the Google Cloud Speech-to-Text API. You can find all the locales available here.
Audio input file¶
The input audio must be in the following format:
- Sample rate: 16000 Hz
- Encoding: Linear16, a 16-bit linear pulse-code modulation (PCM) encoding
If you don't have a file in this format, you can create one yourself with the cxcli tts command. All of the relevant information is located here.
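If you already have a recording in another format, converting it is another option. The following is a minimal sketch that assumes ffmpeg is installed; ffmpeg is an external tool, not part of cxcli, and the helper name to_linear16 is hypothetical. It resamples the audio to 16000 Hz and encodes it as 16-bit linear PCM, matching the two requirements listed above.

```shell
# Hypothetical helper (not part of cxcli): convert an audio file that ffmpeg
# can read into 16000 Hz, 16-bit linear PCM (Linear16).
# Assumption: ffmpeg is installed and available on the PATH.
to_linear16() {
  # $1 = input audio file, $2 = output file (for example a .wav file)
  # -ar 16000        resample the audio to 16000 Hz
  # -c:a pcm_s16le   encode as signed 16-bit little-endian PCM
  ffmpeg -i "$1" -ar 16000 -c:a pcm_s16le "$2"
}

# Usage: to_linear16 recording.mp3 recording.wav
```

The function only wraps a single ffmpeg call, so you can also run that command directly if you prefer.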
Example¶
Here is a simple example of the cxcli stt recognize command, with output similar to what you can expect:
$ cxcli stt recognize hi.mp3 --locale en-US --verbose
INFO Duration time: 570 miliseconds
INFO Detections: 1
INFO 1. Text detected: hi
INFO 1. Confidence: 79.276474%
Are you running this command in a CI/CD pipeline?
If this is the case, we recommend that you set the --output-format parameter to json.
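As an illustration, a pipeline step could wrap the command as sketched below. The function name recognize_to_json and the report file name are hypothetical, and the sketch assumes that cxcli writes the JSON result to standard output. Because the JSON schema is not described on this page, the report is stored as-is rather than parsed.

```shell
# Hypothetical wrapper for a CI/CD job (the function name is illustrative).
# Assumption: with --output-format json, cxcli prints the result to stdout.
recognize_to_json() {
  # $1 = audio file, $2 = locale (for example en-US), $3 = report file
  cxcli stt recognize "$1" --locale "$2" --output-format json > "$3"
}

# Usage: recognize_to_json hi.mp3 en-US stt-report.json
```

The function returns the exit status of cxcli, so a pipeline step that calls it can fail when the recognition command fails.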