cxcli has some commands that allow you to interact with the Google Cloud Speech-to-Text service using the
Cloud Speech-to-Text API.
Is this your first time using this feature?
Before you start using this functionality, please read the authentication page.
You can find the speech-to-text commands under the
cxcli stt command. You can read the documentation about this command here.
The cxcli stt root command has the
recognize command. You can find the usage of this command here.
These are the relevant parameters that you can use to interact with Google Cloud Speech-to-Text:
locale: accepts all the locales supported by the Google
Cloud Speech-to-Text API. You can find all the available locales here
Audio input file
The input audio file must have this format:
- A sample rate of 16000 Hz
- The audio encoding has to be Linear16. Linear16 is a 16-bit linear pulse-code modulation (PCM) encoding.
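To check a file against these two requirements before sending it, a small script can read the audio header. The following is a minimal sketch, assuming the audio is stored in a WAV container (the page itself does not say which container cxcli expects); `is_linear16_16khz` is a hypothetical helper name and is not part of cxcli.

```python
import wave


def is_linear16_16khz(path: str) -> bool:
    """Return True if the WAV file holds 16-bit PCM samples at 16000 Hz.

    Assumption: the file is a WAV container. wave.open raises wave.Error
    for files it cannot read as PCM WAV (for example, an MP3 file).
    """
    with wave.open(path, "rb") as wav:
        sample_rate_ok = wav.getframerate() == 16000
        # Linear16 means 16-bit samples, i.e. 2 bytes per sample.
        sample_width_ok = wav.getsampwidth() == 2
        return sample_rate_ok and sample_width_ok
```

For example, `is_linear16_16khz("hi.wav")` returns `True` only when both the sample rate and the sample width match the format listed above.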
If you don't have a file with this format, you can create it yourself using the
cxcli tts command. All the information is located here
This is a simple example of the
cxcli stt recognize command:
cxcli stt recognize hi.mp3 --locale en-US
The command above, run with the --verbose flag, will give you an output like this one:
$ cxcli stt recognize hi.mp3 --locale en-US --verbose
INFO Duration time: 570 miliseconds
INFO Detections: 1
INFO 1. Text detected: hi
INFO 1. Confidence: 79.276474%
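For a quick local check, the verbose output can be read with a short script. This is only a sketch: it assumes each INFO entry is printed on its own line and keeps the exact wording shown in the sample output; `parse_detections` is a hypothetical helper and is not part of cxcli.

```python
import re

# Patterns based only on the sample output shown in this page (assumption:
# the real output keeps this wording and numbering).
TEXT_RE = re.compile(r"INFO (\d+)\. Text detected: (.+)")
CONFIDENCE_RE = re.compile(r"INFO (\d+)\. Confidence: ([\d.]+)%")


def parse_detections(output: str) -> list[dict]:
    """Collect the detected text and confidence for each numbered detection."""
    detections: dict[int, dict] = {}
    for line in output.splitlines():
        line = line.strip()
        match = TEXT_RE.search(line)
        if match:
            detections.setdefault(int(match.group(1)), {})["text"] = match.group(2)
            continue
        match = CONFIDENCE_RE.search(line)
        if match:
            detections.setdefault(int(match.group(1)), {})["confidence"] = float(match.group(2))
    # Return the detections ordered by their number in the output.
    return [detections[index] for index in sorted(detections)]
```

The function takes the captured text of the command's output as a string, so it can be fed from a file or from whatever mechanism you use to capture the output.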
Are you running this command in a CI/CD pipeline?
If this is the case, we recommend running it with the
--output-format parameter set to