The cxcli tool provides commands that let you interact with Google Cloud's Speech-to-Text service through the
Cloud Speech-to-Text API.
Is this your first time using this feature?
Before you start using this functionality, please read the authentication page.
You can find the speech-to-text functionality within the
cxcli stt subcommand. You can read the documentation about this command here.
The cxcli stt command has a
recognize subcommand. You can find the usage of this command here.
These are the relevant parameters that you can use to interact with Google Cloud STT:
locale: this parameter accepts any locale supported by the Google Cloud
Speech-to-Text API. You can find the full list of available locales here.
Audio input file
The input audio must be in the following format:
- A sample rate of 16000 Hertz
- Linear16 audio encoding, that is, 16-bit linear pulse-code modulation (PCM)
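To check whether an existing file already meets these requirements, a small script can read the file header. The following is a minimal sketch, not part of cxcli: it assumes the audio is stored in a WAV container and uses Python's standard `wave` module. The function name `check_wav_format` is hypothetical and chosen only for illustration.

```python
import wave

REQUIRED_SAMPLE_RATE_HZ = 16000  # sample rate required by the list above
REQUIRED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples, i.e. Linear16


def check_wav_format(path):
    """Return a list of format problems; an empty list means the file matches.

    Hypothetical helper for illustration only. It assumes a WAV container;
    the wave module raises wave.Error for files it cannot parse as PCM WAV.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
    if rate != REQUIRED_SAMPLE_RATE_HZ:
        problems.append(
            f"sample rate is {rate} Hz, expected {REQUIRED_SAMPLE_RATE_HZ} Hz"
        )
    if width != REQUIRED_SAMPLE_WIDTH_BYTES:
        problems.append(
            f"sample width is {width * 8} bits, expected "
            f"{REQUIRED_SAMPLE_WIDTH_BYTES * 8} bits"
        )
    return problems
```

Calling `check_wav_format("audio.wav")` (the file name is a placeholder) returns an empty list when both requirements are met, and a list of human-readable messages otherwise.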
If you don't have a file in this format, you can create one yourself using the
cxcli tts command. All of the relevant information is located here.
Here is a simple example of the
cxcli stt recognize command:
The above command will give you output similar to the following:
Are you running this command in a CI/CD pipeline?
If this is the case, we recommend that you set the
--output-format parameter to