Android SDK
The Android SDK provides APIs for integrating speech recognition into Android apps. It automatically handles WebSocket access to the speech server, audio capture, encoding, transmission, and real-time retrieval of transcriptions.
It offers three standard ways for an app to interact with the ASR engine:
- Recognition Intent, the simplest option: it embeds the UI, shows partial results and sound levels, manages speech recognition resources, and returns the final results.
- Recognition service using the SpeechRecognizer class, which provides callbacks to handle partial and final results and sound levels.
- Embedded Recognition service using the SpeechRecognizer class, which provides offline keyphrase recognition, typically used to trigger one of the two voice recognition components above.
The SDK requires Android 4.0.3 (API level 15) or higher.
Get Started
- Download the latest SDK version here.
- In Android Studio, choose File → New → New Module → Import JAR/AAR Package. Click Next and select the downloaded file CedatSTT.aar.
- Open your application module's Gradle file build.gradle and add the following line to the dependencies block:

dependencies {
    implementation project(':CedatSTT')
    ...
}

- Check that the library is listed in the settings.gradle file:

include ':app', ':CedatSTT'
Recognition Intent
Call the method startActivityForResult() with an explicit intent for the class RecognitionActivity.class, and set the extra parameter Cedat85Recognizer.EXTRA_API_KEY.
In the callback onActivityResult() you can read the list of transcriptions from RecognizerIntent.EXTRA_RESULTS and the array of confidence values from RecognizerIntent.EXTRA_CONFIDENCE_SCORES.
The following code snippet shows how to use the library for online recognition via intent:
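import android.content.Intent;
import android.speech.RecognizerIntent;
import android.util.Log;
import java.util.ArrayList;
// Also import RecognitionActivity and Cedat85Recognizer from the CedatSTT library.

private static final int SPEECH_REQUEST_CODE = 0;

// Launch the SDK's built-in recognition UI.
private void startVoiceRecognitionActivity() {
    Intent intent = new Intent(this, RecognitionActivity.class);
    intent.putExtra(Cedat85Recognizer.EXTRA_API_KEY, "YOUR_API_KEY");
    startActivityForResult(intent, SPEECH_REQUEST_CODE);
}

@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    if (requestCode == SPEECH_REQUEST_CODE && resultCode == RESULT_OK && data != null) {
        ArrayList<String> result = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
        float[] confidence = data.getFloatArrayExtra(RecognizerIntent.EXTRA_CONFIDENCE_SCORES);
        // Log the top transcription and its confidence value, if both are present.
        if (result != null && !result.isEmpty() && confidence != null && confidence.length > 0) {
            Log.i("STT", result.get(0) + ":" + confidence[0]);
        }
    }
}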
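When EXTRA_MAX_RESULTS is greater than 1, the callback can receive several transcriptions together with their confidence values. The following is a minimal sketch of how an app might choose among them. It assumes the two collections are index-aligned, as the snippet above implies. BestHypothesis and pick() are hypothetical names for illustration and are not part of the SDK.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the SDK: picks the transcription
// with the highest confidence score.
final class BestHypothesis {
    private BestHypothesis() {}

    /**
     * Returns the transcription with the highest confidence, or null when
     * there are no results. If the confidence array is missing or does not
     * match the result list in length, the first result is returned.
     */
    static String pick(List<String> results, float[] confidence) {
        if (results == null || results.isEmpty()) {
            return null;
        }
        if (confidence == null || confidence.length != results.size()) {
            return results.get(0);
        }
        int best = 0;
        for (int i = 1; i < confidence.length; i++) {
            if (confidence[i] > confidence[best]) {
                best = i;
            }
        }
        return results.get(best);
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("turn on the light", "turn on the lights");
        float[] confidence = {0.62f, 0.91f};
        System.out.println(pick(results, confidence)); // prints "turn on the lights"
    }
}
```

In onActivityResult(), such a helper could be called with the values read from RecognizerIntent.EXTRA_RESULTS and RecognizerIntent.EXTRA_CONFIDENCE_SCORES instead of indexing element 0 directly.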
Recognition service
Use the SpeechRecognizer class to access the speech recognition service. This class's methods must be invoked only from the main application thread.
Call the static factory method createSpeechRecognizer(), passing the component name of OnlineRecognitionService.class, set your RecognitionListener implementation, and start the recognition service with startListening(), providing extra parameters in the recognizer Intent. The extra parameter Cedat85Recognizer.EXTRA_API_KEY is mandatory.
By overriding the callbacks onPartialResults() and onResults(), you can read the list of transcriptions from SpeechRecognizer.RESULTS_RECOGNITION and the array of confidence values from SpeechRecognizer.CONFIDENCE_SCORES.
The speech recognition process stops automatically when the end of speech is detected (endpoint). To stop or cancel recognition early, use the SpeechRecognizer methods stopListening() or cancel(). Finally, to release speech recognition resources and avoid memory leaks, call destroy() in the onResults() method of the RecognitionListener.
The following code snippet shows how to use the library for online recognition via service:
import android.content.ComponentName;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.SpeechRecognizer;
import android.util.Log;
import java.util.ArrayList;
// Also import OnlineRecognitionService and Cedat85Recognizer from the CedatSTT library.

private static final String TAG = "STT";
private SpeechRecognizer speechRecognizer;
private RecognitionListener recognitionListener;

private void startOnlineVoiceRecognitionService() {
    Intent intent = new Intent();
    intent.putExtra(Cedat85Recognizer.EXTRA_API_KEY, "YOUR_API_KEY");
    recognitionListener = new RecognitionListener() {
        ...
        @Override
        public void onResults(Bundle data) {
            ArrayList<String> result = data.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            float[] confidence = data.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES);
            Log.i(TAG, "result: " + result.get(0) + ":" + confidence[0]);
            // Recognition is complete: release the recognizer's resources.
            speechRecognizer.destroy();
            speechRecognizer = null; // keeps onPause() from touching a destroyed recognizer
        }

        @Override
        public void onPartialResults(Bundle data) {
            ArrayList<String> result = data.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            Log.i(TAG, "partial result: " + result.get(0));
        }
        ...
    };
    speechRecognizer.setRecognitionListener(recognitionListener);
    speechRecognizer.startListening(intent);
}

@Override
public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    speechRecognizer = SpeechRecognizer.createSpeechRecognizer(
            this, new ComponentName(getApplicationContext(), OnlineRecognitionService.class));
    startOnlineVoiceRecognitionService();
}

@Override
public void onPause() {
    super.onPause();
    // Stop any ongoing recognition and release resources when the activity is paused.
    if (speechRecognizer != null) {
        speechRecognizer.cancel();
        speechRecognizer.destroy();
        speechRecognizer = null;
    }
}
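The partial and final callbacks are often used together to drive a live transcript on screen. The sketch below shows one possible way to keep that state in plain Java. It assumes that each partial result replaces the previous one rather than extending it, which this document does not state. TranscriptBuffer and its methods are hypothetical names and are not part of the SDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical UI-side helper, not part of the SDK.
final class TranscriptBuffer {
    private final List<String> finalized = new ArrayList<>();
    private String pending = "";

    /** Call from onPartialResults(): replaces the in-progress hypothesis. */
    void onPartial(String hypothesis) {
        pending = (hypothesis == null) ? "" : hypothesis;
    }

    /** Call from onResults(): commits the final text and clears the partial one. */
    void onFinal(String transcription) {
        if (transcription != null && !transcription.isEmpty()) {
            finalized.add(transcription);
        }
        pending = "";
    }

    /** Text to display: committed utterances followed by the current partial. */
    String display() {
        StringBuilder sb = new StringBuilder(String.join(" ", finalized));
        if (!pending.isEmpty()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(pending);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TranscriptBuffer buffer = new TranscriptBuffer();
        buffer.onPartial("hello");
        buffer.onPartial("hello world");
        buffer.onFinal("hello world.");
        System.out.println(buffer.display()); // prints "hello world."
    }
}
```

In a listener, onPartial() and onFinal() would receive the first element of the list stored under SpeechRecognizer.RESULTS_RECOGNITION.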
Embedded Recognition Service
The Speech-i STT library includes an API for offline, continuous keyphrase recognition, typically used to trigger online voice recognition.
Use the SpeechRecognizer class to access the speech recognition service. This class's methods must be invoked only from the main application thread.
Call the static factory method createSpeechRecognizer(), passing the component name of OfflineRecognitionService.class, set your RecognitionListener implementation, and start the recognition service with startListening(), providing extra parameters in the recognizer Intent. The extra parameter Cedat85Recognizer.EXTRA_API_KEY is mandatory. You can customize the keyphrase with Cedat85Recognizer.EXTRA_KEY_PHRASE (only English words are supported) and the recognition sensitivity with Cedat85Recognizer.EXTRA_THRESHOLD. For better recognition, a key phrase of 2-3 words is recommended.
The callback onResults() is invoked when the keyphrase is matched. Partial results are not provided.
The speech recognition process stops automatically on success. To stop or cancel recognition early, use the SpeechRecognizer methods stopListening() or cancel(). Finally, to release speech recognition resources and avoid memory leaks, call the destroy() method.
The following code snippet shows how to use the library for offline recognition via service:
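import android.content.ComponentName;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.SpeechRecognizer;
import android.util.Log;
import java.util.ArrayList;
// Also import OfflineRecognitionService and Cedat85Recognizer from the CedatSTT library.

private static final String TAG = "STT";
private SpeechRecognizer offlineSpeechRecognizer;
private RecognitionListener recognitionListener;

private void startOfflineVoiceRecognitionService() {
    Intent intent = new Intent();
    intent.putExtra(Cedat85Recognizer.EXTRA_API_KEY, "YOUR_API_KEY");
    recognitionListener = new RecognitionListener() {
        ...
        @Override
        public void onResults(Bundle data) {
            // Invoked when the keyphrase is matched.
            ArrayList<String> result = data.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            float[] confidence = data.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES);
            Log.i(TAG, "result: " + result.get(0) + ":" + confidence[0]);
            // Recognition is complete: release the recognizer's resources.
            offlineSpeechRecognizer.destroy();
            offlineSpeechRecognizer = null; // keeps onPause() from touching a destroyed recognizer
        }
        ...
    };
    offlineSpeechRecognizer.setRecognitionListener(recognitionListener);
    offlineSpeechRecognizer.startListening(intent);
}

@Override
public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    offlineSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(this,
            new ComponentName(getApplicationContext(), OfflineRecognitionService.class));
    startOfflineVoiceRecognitionService();
}

@Override
public void onPause() {
    super.onPause();
    // Stop any ongoing recognition and release resources when the activity is paused.
    if (offlineSpeechRecognizer != null) {
        offlineSpeechRecognizer.cancel();
        offlineSpeechRecognizer.destroy();
        offlineSpeechRecognizer = null;
    }
}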
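Because the key phrase is limited to English words and works best with 2-3 words, an app that lets users choose their own phrase may want to check it before passing it as Cedat85Recognizer.EXTRA_KEY_PHRASE. The sketch below is one possible check. KeyPhraseCheck is a hypothetical helper, not part of the SDK, and the "ASCII letters and apostrophes only" rule is only a rough stand-in for "English words", not the SDK's actual validation.

```java
// Hypothetical helper, not part of the SDK.
final class KeyPhraseCheck {
    private KeyPhraseCheck() {}

    /** Counts whitespace-separated words; null or blank input counts as 0. */
    static int wordCount(String phrase) {
        if (phrase == null) {
            return 0;
        }
        String trimmed = phrase.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /**
     * Returns true when the phrase has 2 or 3 words and contains only
     * ASCII letters, apostrophes, and whitespace (a rough heuristic for
     * English words, assumed here for illustration).
     */
    static boolean isRecommended(String phrase) {
        int words = wordCount(phrase);
        if (words < 2 || words > 3) {
            return false;
        }
        return phrase.trim().matches("[A-Za-z'\\s]+");
    }

    public static void main(String[] args) {
        System.out.println(isRecommended("OK Cedat")); // prints "true"
        System.out.println(isRecommended("hello"));    // prints "false"
    }
}
```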
Configuration
The library can be customized by setting the extra parameters defined in the Cedat85Recognizer class with the Intent method putExtra().
The only mandatory parameter is Cedat85Recognizer.EXTRA_API_KEY, which must be set to the API key provided by Speech-i.
All other parameters are optional and fall back to a default value if not specified.
The following table lists the customizable parameters of the Cedat85Recognizer class.
PARAMETER | TYPE | DEFAULT | DESCRIPTION |
---|---|---|---|
EXTRA_API_KEY | String | None (mandatory) | API key provided by Speech-i |
EXTRA_PROMPT | String | “Speak now” message in the device language | Prompt shown to the user |
EXTRA_MAX_RESULTS | int | 1 | Maximum number of results |
EXTRA_MIN_SILENCE | float | 2.0f | Seconds of silence after which speech recognition terminates |
EXTRA_DECODER_URL | String | DEFAULT_DECODER_URL | Server URL. To enable an SSL connection, use DEFAULT_DECODER_URL_HTTPS |
EXTRA_LANGUAGE_MODEL | String | Device language, or LANGUAGE_MODEL_en_US_8k if it is not supported | Language model. Demo 8 kHz values: LANGUAGE_MODEL_it_IT_8k, LANGUAGE_MODEL_en_GB_8k, LANGUAGE_MODEL_en_US_8k. Demo 16 kHz values: LANGUAGE_MODEL_it_IT_16k, LANGUAGE_MODEL_en_GB_16k, LANGUAGE_MODEL_en_US_16k, LANGUAGE_MODEL_es_ES_16k, LANGUAGE_MODEL_pt_BR_16k, LANGUAGE_MODEL_pt_PT_16k |
EXTRA_KEY_PHRASE | String | “OK Cedat” | Keyphrase used in Embedded speech recognition |
EXTRA_THRESHOLD | float | THRESHOLD_SENSITIVITY_HIGH = 1.0e-35f | Sensitivity threshold used in Embedded speech recognition. |
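To illustrate how the mandatory key and the optional defaults combine, here is a small sketch in plain Java. It uses a Map as a stand-in for the Intent extras and the parameter names from the table as keys; the real string values of the Cedat85Recognizer constants are not given in this document, so these keys are assumptions. RecognitionOptions is a hypothetical class, not part of the SDK, and only the defaults with literal values in the table are reproduced.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not part of the SDK: a Map stands in for Intent extras.
final class RecognitionOptions {
    private RecognitionOptions() {}

    /**
     * Builds the effective parameter set: table defaults first, then the
     * caller's overrides, then the mandatory API key.
     */
    static Map<String, Object> withDefaults(String apiKey, Map<String, Object> overrides) {
        if (apiKey == null || apiKey.isEmpty()) {
            throw new IllegalArgumentException("EXTRA_API_KEY is mandatory");
        }
        Map<String, Object> extras = new LinkedHashMap<>();
        // Defaults taken from the table above.
        extras.put("EXTRA_MAX_RESULTS", 1);
        extras.put("EXTRA_MIN_SILENCE", 2.0f);
        extras.put("EXTRA_KEY_PHRASE", "OK Cedat");
        extras.put("EXTRA_THRESHOLD", 1.0e-35f);
        if (overrides != null) {
            extras.putAll(overrides);
        }
        extras.put("EXTRA_API_KEY", apiKey);
        return extras;
    }

    public static void main(String[] args) {
        Map<String, Object> overrides = new LinkedHashMap<>();
        overrides.put("EXTRA_MAX_RESULTS", 3);
        Map<String, Object> extras = withDefaults("YOUR_API_KEY", overrides);
        System.out.println(extras.get("EXTRA_MAX_RESULTS")); // prints "3"
        System.out.println(extras.get("EXTRA_MIN_SILENCE")); // prints "2.0"
    }
}
```

In an app, each entry would instead be set on the recognizer Intent with putExtra(), using the corresponding Cedat85Recognizer constant as the key.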