Speech-i

The Android SDK provides APIs for integrating speech recognition into Android apps. The API automatically handles WebSocket access to the speech server, audio capture, encoding, transmission, and real-time retrieval of transcriptions.

It gives you three standard ways to interact with the ASR engine from your app:

Recognition Intent: the simplest way. It embeds its own UI, shows partial results and sound levels, handles speech recognition resources, and returns the final results.
Recognition service: uses the SpeechRecognizer class, which provides callbacks to handle partial results, final results, and sound levels.
Embedded Recognition service: uses the SpeechRecognizer class to provide offline keyphrase recognition, usually used to trigger one of the two components above.

The SDK requires Android 4.0.3 (API level 15) or higher.

Download the latest SDK version here.

In Android Studio, choose File → New → New Module → Import JAR/AAR Package. Click Next and select the downloaded file CedatSTT.aar.

Open your application module Gradle file build.gradle and add the following line to the dependencies block:

dependencies {
  implementation project(':CedatSTT')
  ...
}

Check that the library is listed in the settings.gradle file:

include ':app', ':CedatSTT'

Create an explicit intent for RecognitionActivity.class, set the extra parameter Cedat85Recognizer.EXTRA_API_KEY, and pass the intent to startActivityForResult().

In the callback onActivityResult() you can handle the transcriptions list in RecognizerIntent.EXTRA_RESULTS and the confidence values array in RecognizerIntent.EXTRA_CONFIDENCE_SCORES.

The following code snippet shows how to use the library for online recognition via intent:

private static final int SPEECH_REQUEST_CODE = 0;

private void startVoiceRecognitionActivity() {
  Intent intent = new Intent(this, RecognitionActivity.class);
  intent.putExtra(Cedat85Recognizer.EXTRA_API_KEY, "YOUR_API_KEY");
  startActivityForResult(intent, SPEECH_REQUEST_CODE);
}

@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
  super.onActivityResult(requestCode, resultCode, data);
  if (requestCode == SPEECH_REQUEST_CODE && resultCode == RESULT_OK) {
    ArrayList<String> result = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
    float[] confidence = data.getFloatArrayExtra(RecognizerIntent.EXTRA_CONFIDENCE_SCORES);
    // log the first transcription and its confidence value
    Log.i("STT", result.get(0) + ":" + confidence[0]);
  }
}

Recognition intent requires RECORD_AUDIO and INTERNET permissions.
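
The snippet above reads only the first transcription. As a minimal sketch, assuming the results list and the confidence array are parallel (entry i of one describes entry i of the other), the helper below picks the entry with the highest score and falls back to the first transcription when scores are missing or do not line up. BestHypothesis and pick() are hypothetical names for illustration and are not part of the SDK.

```java
import java.util.List;

/** Hypothetical helper, not part of the SDK: selects the most confident transcription. */
final class BestHypothesis {
    private BestHypothesis() {}

    /**
     * Returns the transcription with the highest confidence score.
     * Falls back to the first transcription when scores are missing or do not
     * match the number of results, and returns null when there are no results.
     */
    static String pick(List<String> results, float[] confidence) {
        if (results == null || results.isEmpty()) {
            return null;
        }
        if (confidence == null || confidence.length != results.size()) {
            // Assumption: without usable scores, treat the first entry as the best one.
            return results.get(0);
        }
        int best = 0;
        for (int i = 1; i < confidence.length; i++) {
            if (confidence[i] > confidence[best]) {
                best = i;
            }
        }
        return results.get(best);
    }
}
```

In onActivityResult() this could be called as BestHypothesis.pick(result, confidence) in place of result.get(0), which also avoids a crash when either extra is absent.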

Use the SpeechRecognizer class to access the speech recognition service. The methods of this class must be invoked only from the main application thread.

Call the static factory method createSpeechRecognizer() passing the component named OnlineRecognitionService.class, set your RecognitionListener implementation and start the recognition service using startListening(), providing extra parameters in the recognizer Intent. The extra parameter Cedat85Recognizer.EXTRA_API_KEY is mandatory.

By overriding the callbacks onPartialResults() and onResults(), you can handle the transcriptions list in SpeechRecognizer.RESULTS_RECOGNITION and the confidence values array in SpeechRecognizer.CONFIDENCE_SCORES.

The speech recognition process stops automatically when the endpoint of the speech is detected. To stop or cancel the recognition process early, use the SpeechRecognizer methods stopListening() or cancel(). Finally, to release speech recognition resources and avoid memory leaks, call destroy() in the onResults() method of the RecognitionListener.

The following code snippet shows how to use the library for online recognition via service:

private SpeechRecognizer speechRecognizer;
private RecognitionListener recognitionListener;

private void startOnlineVoiceRecognitionService() {
  Intent intent = new Intent();
  intent.putExtra(Cedat85Recognizer.EXTRA_API_KEY, "YOUR_API_KEY");
  recognitionListener = new RecognitionListener() {
    ...
    @Override
    public void onResults(Bundle data) {
      ArrayList<String> result = data.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
      float[] confidence = data.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES);
      Log.i(tag, "result: " + result.get(0) + ":" + confidence[0]);
      // release recognizer resources once the final result has arrived
      speechRecognizer.destroy();
    }

    @Override
    public void onPartialResults(Bundle data) {
      ArrayList<String> result = data.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
      Log.i(tag, "partial result: " + result.get(0));
    }
    ...
  };
  speechRecognizer.setRecognitionListener(recognitionListener);
  speechRecognizer.startListening(intent);
}

@Override
public void onCreate(Bundle savedInstanceState) {
  super.onCreate(savedInstanceState);
  speechRecognizer = SpeechRecognizer.createSpeechRecognizer(
      this, new ComponentName(getApplicationContext(), OnlineRecognitionService.class));
  startOnlineVoiceRecognitionService();
}

@Override
public void onPause() {
  super.onPause();
  if (speechRecognizer != null) {
    speechRecognizer.cancel();
    speechRecognizer.destroy();
  }
}

Recognition service requires RECORD_AUDIO and INTERNET permissions.

The Speech-i STT library includes an API for offline continuous keyphrase recognition, usually used to trigger online voice recognition.

Use the SpeechRecognizer class to access the speech recognition service. The methods of this class must be invoked only from the main application thread.

Call the static factory method createSpeechRecognizer() passing the component named OfflineRecognitionService.class, set your RecognitionListener implementation and start the recognition service using startListening(), providing extra parameters in the recognizer Intent. The extra parameter Cedat85Recognizer.EXTRA_API_KEY is mandatory. You can customize the keyphrase using Cedat85Recognizer.EXTRA_KEY_PHRASE (only English words are supported) and the recognition sensitivity using Cedat85Recognizer.EXTRA_THRESHOLD. For better recognition, a keyphrase of 2-3 words is suggested.
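
The keyphrase guidance above can be checked before the value is passed to EXTRA_KEY_PHRASE. The following is a rough sketch only: KeyPhraseCheck and isRecommended() are hypothetical names, not SDK API, and "English word" is approximated as a run of ASCII letters (with optional apostrophes), which does not verify that the word actually exists in English.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the SDK: checks a keyphrase against the 2-3 word suggestion. */
final class KeyPhraseCheck {
    // Assumption: a "word" is ASCII letters, optionally joined by apostrophes (e.g. "don't").
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+(?:'[A-Za-z]+)*");

    private KeyPhraseCheck() {}

    /** Returns true when the phrase has 2 or 3 whitespace-separated words made of ASCII letters. */
    static boolean isRecommended(String keyPhrase) {
        if (keyPhrase == null) {
            return false;
        }
        String trimmed = keyPhrase.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        String[] words = trimmed.split("\\s+");
        if (words.length < 2 || words.length > 3) {
            return false;
        }
        for (String word : words) {
            if (!WORD.matcher(word).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

A caller might log a warning rather than reject the phrase when the check fails, since the 2-3 word length is a suggestion and not a hard limit.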

The callback onResults() will be invoked on match. Partial results will not be provided.

The speech recognition process stops automatically on success. To stop or cancel the recognition process early, use the SpeechRecognizer methods stopListening() or cancel(). Finally, to release speech recognition resources and avoid memory leaks, call the destroy() method.

The following code snippet shows how to use the library for offline recognition via service:

private SpeechRecognizer offlineSpeechRecognizer;
private RecognitionListener recognitionListener;

private void startOfflineVoiceRecognitionService() {
  Intent intent = new Intent();
  intent.putExtra(Cedat85Recognizer.EXTRA_API_KEY, "YOUR_API_KEY");
  recognitionListener = new RecognitionListener() {
    ...
    @Override
    public void onResults(Bundle data) {
      ArrayList<String> result = data.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
      float[] confidence = data.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES);
      Log.i(tag, "result: " + result.get(0) + ":" + confidence[0]);
      // release recognizer resources once the keyphrase has matched
      offlineSpeechRecognizer.destroy();
    }
    ...
  };
  offlineSpeechRecognizer.setRecognitionListener(recognitionListener);
  offlineSpeechRecognizer.startListening(intent);
}

@Override
public void onCreate(Bundle savedInstanceState) {
  super.onCreate(savedInstanceState);
  offlineSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(
      this, new ComponentName(getApplicationContext(), OfflineRecognitionService.class));
  startOfflineVoiceRecognitionService();
}

@Override
public void onPause() {
  super.onPause();
  if (offlineSpeechRecognizer != null) {
    offlineSpeechRecognizer.cancel();
    offlineSpeechRecognizer.destroy();
  }
}

Embedded Recognition service requires RECORD_AUDIO and WRITE_EXTERNAL_STORAGE permissions.

The library can be customized by specifying the extra parameters defined in the Cedat85Recognizer class, using the Intent class method putExtra().

The only mandatory parameter is Cedat85Recognizer.EXTRA_API_KEY, which must be set to the API key provided by Speech-i.

The other parameters are optional and fall back to a default value if not specified.

The following table shows the customizable parameters of the Cedat85Recognizer class.

PARAMETER	TYPE	DEFAULT	DESCRIPTION
EXTRA_API_KEY	String	None (mandatory)	API key provided by Speech-i
EXTRA_PROMPT	String	“Speak now” message in the device language	Prompt shown to the user
EXTRA_MAX_RESULTS	int	1	Maximum number of results
EXTRA_MIN_SILENCE	float	2.0f	Seconds of silence after which speech recognition terminates
EXTRA_DECODER_URL	String	DEFAULT_DECODER_URL	Server URL. To enable an SSL connection, use DEFAULT_DECODER_URL_HTTPS
EXTRA_LANGUAGE_MODEL	String	Device language, or LANGUAGE_MODEL_en_US_8k if it is not supported	Language model. Demo 8 kHz values: LANGUAGE_MODEL_it_IT_8k, LANGUAGE_MODEL_en_GB_8k, LANGUAGE_MODEL_en_US_8k. Demo 16 kHz values: LANGUAGE_MODEL_it_IT_16k, LANGUAGE_MODEL_en_GB_16k, LANGUAGE_MODEL_en_US_16k, LANGUAGE_MODEL_es_ES_16k, LANGUAGE_MODEL_pt_BR_16k, LANGUAGE_MODEL_pt_PT_16k
EXTRA_KEY_PHRASE	String	“OK Cedat”	Keyphrase used in Embedded speech recognition
EXTRA_THRESHOLD	float	THRESHOLD_SENSITIVITY_HIGH = 1.0e-35f	Sensitivity threshold used in Embedded speech recognition
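
The fallback behaviour described by the table (one mandatory key, defaults for everything else) can be modelled in plain Java. This is only an illustrative sketch: RecognizerDefaults and resolve() are hypothetical, the map keys are the constant names from the table rather than the SDK's actual extra strings, and only the defaults with fixed values are included (the prompt, decoder URL and language model defaults depend on the device or on SDK constants).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of the parameter table, not part of the SDK. */
final class RecognizerDefaults {
    private RecognizerDefaults() {}

    /**
     * Merges caller-supplied parameters over the fixed defaults from the table.
     * Throws IllegalArgumentException when the mandatory API key is missing or empty.
     */
    static Map<String, Object> resolve(Map<String, Object> overrides) {
        Object apiKey = overrides.get("EXTRA_API_KEY");
        if (!(apiKey instanceof String) || ((String) apiKey).isEmpty()) {
            throw new IllegalArgumentException("EXTRA_API_KEY is mandatory");
        }
        Map<String, Object> resolved = new LinkedHashMap<>();
        // Defaults copied from the table above.
        resolved.put("EXTRA_MAX_RESULTS", 1);
        resolved.put("EXTRA_MIN_SILENCE", 2.0f);
        resolved.put("EXTRA_KEY_PHRASE", "OK Cedat");
        resolved.put("EXTRA_THRESHOLD", 1.0e-35f);
        // Caller-supplied values replace the defaults.
        resolved.putAll(overrides);
        return resolved;
    }
}
```

With real SDK code the same effect comes from calling putExtra() only for the parameters that should differ from their defaults.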
