Integrate react-native-voice into expo-stt (#2)
- Currently,
[anhtuank7c/expo-stt](https://github.com/anhtuank7c/expo-stt) uses a
separate Google voice recognition modal. Instead, I migrated the
[react-native-voice](https://github.com/react-native-voice/voice) code
into this repository so that it uses the built-in microphone directly, as
[react-native-voice](https://github.com/react-native-voice/voice) does.
- You can follow the whole voice recognition flow in the
[README](https://github.com/crossplatformkorea/expo-stt/blob/52d320d9a7ec6fa3ff7bafe41f9679926e290093/README.md).

---------

Co-authored-by: hyochan <[email protected]>
daheeahn and hyochan authored Jul 22, 2024
1 parent 05a2a37 commit bd8fa61
Showing 78 changed files with 10,741 additions and 55,287 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -29,6 +29,7 @@ project.xcworkspace

# Android/IJ
#
build/
.classpath
.cxx
.gradle
74 changes: 45 additions & 29 deletions README.md
@@ -1,25 +1,51 @@
# expo-stt

Unofficial Speech To Text module for Expo which supported iOS and Android
- Unofficial Speech To Text module for Expo which supported iOS and Android
- Forked [anhtuank7c/expo-stt](https://github.com/anhtuank7c/expo-stt)
- Migrated [react-native-voice functionality](https://github.com/react-native-voice/voice) on [crossplatformkorea/expo-stt](https://github.com/crossplatformkorea/expo-stt), which is forked from [anhtuank7c/expo-stt](https://github.com/anhtuank7c/expo-stt)
- Currently, [anhtuank7c/expo-stt](https://github.com/anhtuank7c/expo-stt) has a separate Google voice recognition modal. Instead, I migrated the [react-native-voice](https://github.com/react-native-voice/voice) code onto [crossplatformkorea/expo-stt](https://github.com/crossplatformkorea/expo-stt), which was created with the [expo module](https://docs.expo.dev/modules/overview), to use the built-in microphone like [react-native-voice](https://github.com/react-native-voice/voice).

So sorry that I am unemployed and don't have much money to spend more time to make this module work also for web.
# Sequence Diagram

If you still want to support web platform, please follow this article https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API
Below is a sequence diagram explaining how each module, including SpeechRecognizer, works.

![Demo speech to text](demo.png "Demo Speech To Text")
![Sequence Diagram](sequence-diagram.png)

And below is the [mermaid](https://mermaid.js.org) code to create the above diagram.

```mermaid
sequenceDiagram
participant User
participant ExpoSttModule
participant SpeechRecognizer
participant ReactNative as React Native Module
# API documentation
User->>ExpoSttModule: startSpeech()
ExpoSttModule->>SpeechRecognizer: createSpeechRecognizer()
ExpoSttModule->>SpeechRecognizer: startListening()
SpeechRecognizer-->>ExpoSttModule: onReadyForSpeech
- [Documentation for the main branch](https://github.com/expo/expo/blob/main/docs/pages/versions/unversioned/sdk/stt.md)
- [Documentation for the latest stable release](https://docs.expo.dev/versions/latest/sdk/stt/)
User->>SpeechRecognizer: User starts speaking
SpeechRecognizer-->>ExpoSttModule: onBeginningOfSpeech
ExpoSttModule->>ReactNative: sendEvent(onSpeechStart)
# Installation in managed Expo projects
User->>SpeechRecognizer: User finishes speaking
SpeechRecognizer-->>ExpoSttModule: onEndOfSpeech
ExpoSttModule->>ReactNative: sendEvent(onSpeechEnd)
For [managed](https://docs.expo.dev/versions/latest/introduction/managed-vs-bare/) Expo projects, please follow the installation instructions in the [API documentation for the latest stable release](#api-documentation). If you follow the link and there is no documentation available then this library is not yet usable within managed projects &mdash; it is likely to be included in an upcoming Expo SDK release.
SpeechRecognizer-->>ExpoSttModule: onResults
ExpoSttModule->>ReactNative: sendEvent(onSpeechResult)
# Installation in bare React Native projects
alt SpeechRecognizer encounters an error
SpeechRecognizer-->>ExpoSttModule: onError
ExpoSttModule->>ReactNative: sendEvent(onSpeechError)
end
```
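
On the JavaScript side, the event sequence in the diagram can be modelled as a small state reducer. The following is only a sketch: the event names are taken from the diagram, the payload keys `results` and `errorMessage` are assumed from the `sendEvent` calls in `ExpoSttModule.kt`, and `SttState`, `initialState`, and `reduceSttEvent` are hypothetical names that are not part of the library.

```typescript
// Event names follow the sequence diagram; the payload keys are an
// assumption based on the Android module's sendEvent calls.
type SttEvent =
  | { type: "onSpeechStart" }
  | { type: "onSpeechEnd" }
  | { type: "onSpeechResult"; results: string[] }
  | { type: "onSpeechError"; errorMessage: string };

// Hypothetical UI state for illustration; not part of expo-stt.
interface SttState {
  recognizing: boolean;
  text: string;
  error: string | null;
}

const initialState: SttState = { recognizing: false, text: "", error: null };

// Pure reducer: computes the next state for a single native event.
function reduceSttEvent(state: SttState, event: SttEvent): SttState {
  switch (event.type) {
    case "onSpeechStart":
      // A new utterance begins: clear the previous text and error.
      return { recognizing: true, text: "", error: null };
    case "onSpeechEnd":
      return { ...state, recognizing: false };
    case "onSpeechResult":
      // Take the first candidate transcription, if there is one.
      return { ...state, recognizing: false, text: event.results[0] ?? state.text };
    case "onSpeechError":
      return { ...state, recognizing: false, error: event.errorMessage };
  }
}

// Replay the error-free path from the diagram.
const happyPath: SttEvent[] = [
  { type: "onSpeechStart" },
  { type: "onSpeechEnd" },
  { type: "onSpeechResult", results: ["hello world"] },
];
const finalState = happyPath.reduce(reduceSttEvent, initialState);
console.log(finalState);
```

In a component, each registered listener would presumably forward its payload into such a reducer. Keeping the transitions in one pure function makes the flow testable without a device.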

# Demo

For bare React Native projects, you must ensure that you have [installed and configured the `expo` package](https://docs.expo.dev/bare/installing-expo-modules/) before continuing.
![Demo speech to text](demo.png "Demo Speech To Text")

### Add the package to your npm dependencies

@@ -39,6 +65,7 @@ npx expo prebuild --clean
### Configure for iOS (Bare React Native project only)

Run `npx pod-install` after installing the npm package.

## Add missing permissions for iOS

Add following key to plugins of `app.json` in Expo project
@@ -68,6 +95,7 @@ For Bare React Native project, you need to add these key to `Info.plist` in `ios
## Usage

Register some listeners

```
import * as ExpoStt from 'expo-stt';
@@ -82,10 +110,6 @@ Register some listeners
setSpokenText(value.join());
});
const onSpeechCancelled = ExpoStt.addOnSpeechCancelledListener(() => {
setRecognizing(false);
});
const onSpeechError = ExpoStt.addOnSpeechErrorListener(({ cause }) => {
setError(cause);
setRecognizing(false);
@@ -98,7 +122,6 @@ Register some listeners
return () => {
onSpeechStart.remove();
onSpeechResult.remove();
onSpeechCancelled.remove();
onSpeechError.remove();
onSpeechEnd.remove();
};
@@ -107,21 +130,14 @@ Register some listeners

There are some functions available to call such as:

* ExpoStt.startSpeech()
* ExpoStt.stopSpeech()
* ExpoStt.cancelSpeech()
* ExpoStt.destroySpeech()
* ExpoStt.requestRecognitionPermission()
* ExpoStt.checkRecognitionPermission()
- ExpoStt.startSpeech()
- ExpoStt.stopSpeech()
- ExpoStt.destroySpeech()
- ExpoStt.requestRecognitionPermission()
- ExpoStt.checkRecognitionPermission()

Take a look into `example/App.tsx` for completed example

# Contributing

Contributions are very welcome! Please refer to guidelines described in the [contributing guide]( https://github.com/expo/expo#contributing).

## Author

I am looking for a job as a React native developer, remote work is preferred.

Check out my CV: https://anhtuank7c.github.io
Contributions are very welcome! Please refer to guidelines described in the [contributing guide](https://github.com/expo/expo#contributing).
10 changes: 8 additions & 2 deletions android/src/main/AndroidManifest.xml
@@ -1,2 +1,8 @@
<manifest package="expo.modules.stt">
</manifest>
<manifest
xmlns:android="http://schemas.android.com/apk/res/android"
package="expo.modules.stt"
xmlns:tools="http://schemas.android.com/tools">
<uses-sdk tools:overrideLibrary="com.facebook.react" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />
</manifest>
176 changes: 132 additions & 44 deletions android/src/main/java/expo/modules/stt/ExpoSttModule.kt
@@ -1,41 +1,43 @@
package expo.modules.stt

import android.util.Log
import expo.modules.kotlin.activityresult.AppContextActivityResultLauncher
import expo.modules.kotlin.modules.Module
import expo.modules.kotlin.modules.ModuleDefinition
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import java.util.*
import android.speech.SpeechRecognizer
import android.speech.RecognizerIntent
import android.content.Intent
import android.speech.RecognitionListener
import android.os.Bundle
import expo.modules.kotlin.exception.CodedException
import android.Manifest
import android.content.pm.PackageManager
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat

class ExpoSttModule : Module() {
private lateinit var voiceRecognizer: AppContextActivityResultLauncher<VoiceRecognizerContractOptions, VoiceRecognizerContractResult>
class ExpoSttModule : Module(), RecognitionListener {
private var isRecognizing: Boolean = false
private var speech: SpeechRecognizer? = null

companion object {
const val onSpeechResult = "onSpeechResult"
const val onSpeechError = "onSpeechError"
const val onSpeechCancelled = "onSpeechCancelled"
const val onSpeechStart = "onSpeechStart"
const val onSpeechResult = "onSpeechResult"
const val onPartialResults = "onPartialResults"
const val onSpeechEnd = "onSpeechEnd"
const val onSpeechError = "onSpeechError"
const val TAG = "ExpoStt"
}

override fun definition() = ModuleDefinition {
Name(TAG)

/**
* We don't need any permission for Recognizer on Android
* Just act like iOS one to unify the APIs
*/
Events(onSpeechStart, onSpeechResult, onPartialResults, onSpeechEnd, onSpeechError)

AsyncFunction("requestRecognitionPermission") {
return@AsyncFunction mapOf(
"status" to "granted",
"expires" to "never",
"granted" to true,
"canAskAgain" to true
)
requestRecognitionPermission()
}

AsyncFunction("checkRecognitionPermission") {
@@ -48,30 +50,28 @@ class ExpoSttModule : Module() {
}

Function("startSpeech") {
if (!isPermissionGranted()) {
requestRecognitionPermission()
return@Function false
}

if (isRecognizing) {
sendEvent(onSpeechError, mapOf("cause" to "Speech recognition already started!"))
return@Function false
}

isRecognizing = true
sendEvent(onSpeechStart)
val options = VoiceRecognizerContractOptions(Locale.getDefault())

if (speech != null) {
speech?.destroy()
speech = null
}

CoroutineScope(Dispatchers.Main).launch {
when (val result = voiceRecognizer.launch(options)) {
is VoiceRecognizerContractResult.Success -> {
isRecognizing = false
sendEvent(onSpeechResult, mapOf("value" to result.value))
sendEvent(onSpeechEnd)
}
is VoiceRecognizerContractResult.Cancelled -> {
isRecognizing = false
sendEvent(onSpeechCancelled)
}
is VoiceRecognizerContractResult.Error -> {
isRecognizing = false
sendEvent(onSpeechError, mapOf("cause" to result.cause))
}
}
speech = SpeechRecognizer.createSpeechRecognizer(appContext.reactContext)
speech?.setRecognitionListener(this@ExpoSttModule)
val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())

speech?.startListening(intent)
}
return@Function true
}
@@ -82,24 +82,112 @@ class ExpoSttModule : Module() {
* A bit different compare to iOS APIs
*/
Function("stopSpeech") {
if (speech != null) {
speech?.stopListening()
}
isRecognizing = false
Log.d(TAG, "Stop Voice Recognizer")
}

Function("cancelSpeech") {
Log.d(TAG, "Cancel Voice Recognizer")
return@Function null as Any
}

Function("destroySpeech") {
if (speech != null) {
speech?.destroy()
}
isRecognizing = false
Log.d(TAG, "Destroy Voice Recognizer")

return@Function null as Any
}

}

private fun requestRecognitionPermission() {
val currentActivity = appContext.currentActivity ?: throw CodedException("Activity is null")
val permission = Manifest.permission.RECORD_AUDIO
val isGranted = ContextCompat.checkSelfPermission(currentActivity, permission) == PackageManager.PERMISSION_GRANTED

if (!isGranted) {
ActivityCompat.requestPermissions(currentActivity, arrayOf(permission), 1)
}

mapOf(
"status" to if (isGranted) "granted" else "denied",
"expires" to "never",
"granted" to isGranted,
"canAskAgain" to true
)
}

private fun isPermissionGranted(): Boolean {
val permission = Manifest.permission.RECORD_AUDIO
val res = appContext.reactContext?.checkCallingOrSelfPermission(permission)
return res == PackageManager.PERMISSION_GRANTED
}

override fun onReadyForSpeech(params: Bundle?) {
Log.d(TAG, "onReadyForSpeech")
}

override fun onBeginningOfSpeech() {
isRecognizing = true
sendEvent(onSpeechStart)
Log.d(TAG, "onBeginningOfSpeech")
}

override fun onResults(results: Bundle?) {
isRecognizing = false
val matches = results?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)

if (matches != null) {
sendEvent(onSpeechResult, mapOf("results" to matches))
} else {
sendEvent(onSpeechError, mapOf("errorMessage" to "No speech results"))
}

RegisterActivityContracts {
voiceRecognizer =
registerForActivityResult(VoiceRecognizerContract()) { _, _ ->
Log.d(TAG, "handleResultUponActivityDestruction")
}
Log.d(TAG, "onResults $matches")
}

override fun onPartialResults(partialResults: Bundle?) {
val matches = partialResults?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
// Log.d(TAG, "onPartialResults $matches")
}

override fun onEndOfSpeech() {
isRecognizing = false
sendEvent(onSpeechEnd)
Log.d(TAG, "onEndOfSpeech")
}

override fun onError(error: Int) {
isRecognizing = false
val errorMessage = when (error) {
SpeechRecognizer.ERROR_AUDIO -> "Audio recording error"
SpeechRecognizer.ERROR_CLIENT -> "Client side error"
SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS -> "Insufficient permissions"
SpeechRecognizer.ERROR_NETWORK -> "Network error"
SpeechRecognizer.ERROR_NETWORK_TIMEOUT -> "Network timeout"
SpeechRecognizer.ERROR_NO_MATCH -> "No match"
SpeechRecognizer.ERROR_RECOGNIZER_BUSY -> "RecognitionService busy"
SpeechRecognizer.ERROR_SERVER -> "Error from server"
SpeechRecognizer.ERROR_SPEECH_TIMEOUT -> "No speech input"
else -> "Unknown error"
}

sendEvent(onSpeechError, mapOf("errorMessage" to errorMessage))
Log.d(TAG, "onError: $error $errorMessage")
}

override fun onBufferReceived(buffer: ByteArray?) {
Log.d(TAG, "onBufferReceived")
}

override fun onRmsChanged(rmsdB: Float) {
// Log.d(TAG, "onRmsChanged: $rmsdB")
}

Events(onSpeechResult, onSpeechError, onSpeechEnd, onSpeechStart, onSpeechCancelled)
override fun onEvent(eventType: Int, params: Bundle?) {
Log.d(TAG, "onEvent: $eventType")
}
}
14 changes: 0 additions & 14 deletions build/ExpoStt.types.d.ts

This file was deleted.

1 change: 0 additions & 1 deletion build/ExpoStt.types.d.ts.map

This file was deleted.
