Custom Speech Recognition API
Starting with v3.56, you can use Gladia's speech recognition service through this custom speech recognition channel. Please refer to this tutorial for specific usage.
If the existing speech recognition methods do not meet your needs, you can also plug in your own speech recognition API. Just fill in the relevant information in Menu - Speech Recognition Settings - Custom Speech Recognition API.
Fill in your API address, which must start with http. The system sends a POST request to that address containing WAV audio (16 kHz sampling rate, 1 channel) under the key name "audio". If your API requires key verification, fill in the key in the key box. The key is appended to the API address and sent as sk=key. The request is equivalent to:
requests.post(api_url, files={"audio": open(audio_file, 'rb')})
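To try out your own endpoint before configuring it in the software, the request described above can be approximated with a small test client. This is a minimal sketch, not the application's actual code: `make_silent_wav`, `build_url`, and `send_test_request` are hypothetical helper names, the 16-bit sample width is an assumption, the way `sk` is joined to the address is inferred from the `api_url?sk=...` form, and `requests` is a third-party package that must be installed separately.

```python
import io
import wave
from urllib.parse import urlencode


def make_silent_wav(seconds: float = 1.0) -> bytes:
    """Return WAV bytes of silence: 16 kHz, 1 channel (16-bit samples assumed)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * int(16000 * seconds))
    return buf.getvalue()


def build_url(api_url: str, sk: str = "") -> str:
    """Append the key as ?sk=... when one is set (assumes api_url has no query string)."""
    return f"{api_url}?{urlencode({'sk': sk})}" if sk else api_url


def send_test_request(api_url: str, sk: str = "") -> dict:
    """POST the test audio under the field name "audio" and return the parsed JSON."""
    import requests  # third-party; imported here so the helpers above work without it

    files = {"audio": ("test.wav", make_silent_wav(), "audio/wav")}
    response = requests.post(build_url(api_url, sk), files=files, timeout=60)
    return response.json()
```

Calling `send_test_request("http://127.0.0.1:5000/asr", sk="secret")` against a locally running endpoint (the address here is only a placeholder) should return a dictionary whose `code` field is 0 or 1.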
Your API must return data in JSON format. On success, set code to 0 and put the recognized subtitles in data. On failure, set code to 1 and msg to the reason for the recognition failure.
Return on failure:
res={
    "code":1,
    "msg":"Reason for error"
}
Return on success:
res={
    "code":0,
    "data":[
        {
            "text":"Subtitle text",
            "time":"00:00:01,000 --> 00:00:06,500"
        },
        {
            "text":"Subtitle text",
            "time":"00:00:06,900 --> 00:00:12,200"
        },
        ...multiple
    ]
}
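On the server side, the two response shapes above can be produced with a few small helpers. This is a sketch under stated assumptions: the function names are hypothetical, and it assumes your recognizer yields segments as (start seconds, end seconds, text) tuples, which the original text does not specify.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def success_response(segments) -> str:
    """Build the code=0 JSON body from (start, end, text) tuples."""
    data = [
        {"text": text, "time": f"{srt_timestamp(start)} --> {srt_timestamp(end)}"}
        for start, end, text in segments
    ]
    return json.dumps({"code": 0, "data": data}, ensure_ascii=False)


def failure_response(reason: str) -> str:
    """Build the code=1 JSON body carrying the failure reason."""
    return json.dumps({"code": 1, "msg": reason}, ensure_ascii=False)
```

For example, `success_response([(1.0, 6.5, "Subtitle text")])` yields a body whose single entry has the time string `00:00:01,000 --> 00:00:06,500`. Using `json.dumps` rather than hand-written strings guarantees double-quoted, valid JSON.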
If a key is filled in, it is appended to the API address as a query parameter and sent in the form api_url?sk=the sk value filled in.
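Because the key arrives as the sk query parameter, your API can check it before running recognition. The following is a minimal, framework-independent sketch: `key_is_valid` is a hypothetical helper, and it assumes you have access to the full request URL (most web frameworks expose the query parameters directly instead).

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def key_is_valid(request_url: str, expected_key: str) -> bool:
    """Return True when the sk query parameter matches the expected key."""
    query = parse_qs(urlsplit(request_url).query)
    received = query.get("sk", [""])[0]
    # compare_digest avoids leaking the key through timing differences
    return hmac.compare_digest(received.encode(), expected_key.encode())
```

When the check fails, the API can respond with code 1 and a msg such as "invalid key", so the failure reason is reported like any other recognition error.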