Speech-to-text¶
cxcli has some commands that allow you to interact with the Google Cloud Speech-to-Text service using the Cloud Speech-to-Text API.
Is this your first time using this feature?
Before you start using this functionality, please read the authentication page.
Usage¶
You can find the usage of the speech-to-text commands under the cxcli stt command. You can read the documentation about this command here.
The cxcli stt root command has the recognize command. You can find the usage of this command here.
Parameters¶
These are the relevant parameters that you can use to interact with Google Cloud Speech-to-Text:
locale: accepts all the locales supported by the Google Cloud Speech-to-Text API. You can find all the available locales here.
Audio input file¶
It is important to know that the input has to have this format:
- A sample rate of 16000 Hz
- The audio encoding has to be Linear16. Linear16 is a 16-bit linear pulse-code modulation (PCM) encoding.
If you don't have a file in this format, you can create one yourself using the cxcli tts command. All the information is located here.
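To check whether an existing file already matches the format described above, a small script can inspect its header. The following is a minimal sketch, assuming the audio is stored in a WAV container (this page does not state the container, and the helper name `is_linear16_16khz` is hypothetical). It only uses Python's standard `wave` module.

```python
import wave

REQUIRED_SAMPLE_RATE_HZ = 16000   # sample rate required by this page
REQUIRED_SAMPLE_WIDTH_BYTES = 2   # Linear16 means 16-bit (2-byte) PCM samples


def is_linear16_16khz(path):
    """Return True if the WAV file at `path` is 16-bit PCM sampled at 16000 Hz.

    Assumption: the file is a WAV container. Anything the `wave` module
    cannot parse as PCM WAV is reported as not matching.
    """
    try:
        with wave.open(path, "rb") as wav:
            return (
                wav.getframerate() == REQUIRED_SAMPLE_RATE_HZ
                and wav.getsampwidth() == REQUIRED_SAMPLE_WIDTH_BYTES
            )
    except (wave.Error, EOFError):
        # Not a readable PCM WAV file (for example, a compressed format).
        return False
```

Calling `is_linear16_16khz("my-audio.wav")` before invoking cxcli is one way to catch a mismatched file early. The file name here is only an illustration.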
Example¶
This is a simple example of the cxcli stt recognize command:
cxcli stt recognize hi.mp3 --locale en-US
Running the command with the --verbose flag prints output like this:
$ cxcli stt recognize hi.mp3 --locale en-US --verbose
INFO Duration time: 570 miliseconds
INFO Detections: 1
INFO 1. Text detected: hi
INFO 1. Confidence: 79.276474%
Are you running this command in a CI/CD pipeline?
If this is the case, we recommend running it with the --output-format parameter set to json.
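For pipelines that post-process the result, the recommendation above could be wrapped in a short script. This is only a sketch under stated assumptions: cxcli is available on the PATH, and with --output-format json it writes a single JSON document to standard output. The structure of that JSON is not documented on this page, so the sketch returns the parsed value without reading any fields. The function names are hypothetical.

```python
import json
import subprocess


def build_recognize_command(audio_path, locale):
    """Build the argument list for `cxcli stt recognize` with JSON output."""
    return [
        "cxcli", "stt", "recognize", audio_path,
        "--locale", locale,
        "--output-format", "json",
    ]


def recognize(audio_path, locale="en-US"):
    """Run cxcli and return its parsed JSON output.

    Assumptions: cxcli is on the PATH and prints only the JSON document
    to stdout when --output-format is set to json.
    """
    result = subprocess.run(
        build_recognize_command(audio_path, locale),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if cxcli exits with an error
    )
    return json.loads(result.stdout)
```

Because `check=True` raises on a non-zero exit code, a failed recognition would fail the pipeline step instead of passing silently.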