Integrate react-native-voice into expo-stt (#2)

- Currently, [anhtuank7c/expo-stt](https://github.com/anhtuank7c/expo-stt) opens a separate Google voice recognition modal. Instead, I migrated the [react-native-voice](https://github.com/react-native-voice/voice) code into this repository so that it uses the built-in microphone, as [react-native-voice](https://github.com/react-native-voice/voice) does.
- You can check the whole flow of voice recognition in the [README](https://github.com/crossplatformkorea/expo-stt/blob/52d320d9a7ec6fa3ff7bafe41f9679926e290093/README.md).

---------

Co-authored-by: hyochan <[email protected]>
crossplatformkorea · Jul 22, 2024 · bd8fa61
1 parent 05a2a37
commit bd8fa61
Showing 78 changed files with 10,741 additions and 55,287 deletions.
diff --git a/.gitignore b/.gitignore
@@ -29,6 +29,7 @@ project.xcworkspace
 
 # Android/IJ
 #
+build/
 .classpath
 .cxx
 .gradle

diff --git a/README.md b/README.md
@@ -1,25 +1,51 @@
 # expo-stt
 
-Unofficial Speech To Text module for Expo which supported iOS and Android
+- Unofficial Speech To Text module for Expo which supported iOS and Android
+- Forked [anhtuank7c/expo-stt](https://github.com/anhtuank7c/expo-stt)
+- Migrated [react-native-voice functionality](https://github.com/react-native-voice/voice) on [crossplatformkorea/expo-stt](https://github.com/crossplatformkorea/expo-stt), which is forked from [anhtuank7c/expo-stt](https://github.com/anhtuank7c/expo-stt)
+- Currently, [anhtuank7c/expo-stt](https://github.com/anhtuank7c/expo-stt) has a separate Google voice recognition modal. Instead, I migrated the [react-native-voice](https://github.com/react-native-voice/voice) code onto [crossplatformkorea/expo-stt](https://github.com/crossplatformkorea/expo-stt), which was created with the [expo module](https://docs.expo.dev/modules/overview), to use the built-in microphone like [react-native-voice](https://github.com/react-native-voice/voice).
 
-So sorry that I am unemployed and don't have much money to spend more time to make this module work also for web.
+# Sequence Diagram
 
-If you still want to support web platform, please follow this article https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API
+Below is a sequence diagram explaining how each module, including SpeechRecognizer, works.
 
-![Demo speech to text](demo.png "Demo Speech To Text")
+![Sequence Diagram](sequence-diagram.png)
+
+And below is the [mermaid](https://mermaid.js.org) code to create the above diagram.
+
+```mermaid
+
+sequenceDiagram
+    participant User
+    participant ExpoSttModule
+    participant SpeechRecognizer
+    participant ReactNative as React Native Module
 
-# API documentation
+    User->>ExpoSttModule: startSpeech()
+    ExpoSttModule->>SpeechRecognizer: createSpeechRecognizer()
+    ExpoSttModule->>SpeechRecognizer: startListening()
+    SpeechRecognizer-->>ExpoSttModule: onReadyForSpeech
 
-- [Documentation for the main branch](https://github.com/expo/expo/blob/main/docs/pages/versions/unversioned/sdk/stt.md)
-- [Documentation for the latest stable release](https://docs.expo.dev/versions/latest/sdk/stt/)
+    User->>SpeechRecognizer: User starts speaking
+    SpeechRecognizer-->>ExpoSttModule: onBeginningOfSpeech
+    ExpoSttModule->>ReactNative: sendEvent(onSpeechStart)
 
-# Installation in managed Expo projects
+    User->>SpeechRecognizer: User finishes speaking
+    SpeechRecognizer-->>ExpoSttModule: onEndOfSpeech
+    ExpoSttModule->>ReactNative: sendEvent(onSpeechEnd)
 
-For [managed](https://docs.expo.dev/versions/latest/introduction/managed-vs-bare/) Expo projects, please follow the installation instructions in the [API documentation for the latest stable release](#api-documentation). If you follow the link and there is no documentation available then this library is not yet usable within managed projects &mdash; it is likely to be included in an upcoming Expo SDK release.
+    SpeechRecognizer-->>ExpoSttModule: onResults
+    ExpoSttModule->>ReactNative: sendEvent(onSpeechResult)
 
-# Installation in bare React Native projects
+    alt SpeechRecognizer encounters an error
+        SpeechRecognizer-->>ExpoSttModule: onError
+        ExpoSttModule->>ReactNative: sendEvent(onSpeechError)
+    end
+```
+
+# Demo
 
-For bare React Native projects, you must ensure that you have [installed and configured the `expo` package](https://docs.expo.dev/bare/installing-expo-modules/) before continuing.
+![Demo speech to text](demo.png "Demo Speech To Text")
 
 ### Add the package to your npm dependencies
 
@@ -39,6 +65,7 @@ npx expo prebuild --clean
 ### Configure for iOS (Bare React Native project only)
 
 Run `npx pod-install` after installing the npm package.
+
 ## Add missing permissions for iOS
 
 Add following key to plugins of `app.json` in Expo project
@@ -68,6 +95,7 @@ For Bare React Native project, you need to add these key to `Info.plist` in `ios
 ## Usage
 
 Register some listeners
+
 ```
   import * as ExpoStt from 'expo-stt';
 
@@ -82,10 +110,6 @@ Register some listeners
       setSpokenText(value.join());
     });
 
-    const onSpeechCancelled = ExpoStt.addOnSpeechCancelledListener(() => {
-      setRecognizing(false);
-    });
-
     const onSpeechError = ExpoStt.addOnSpeechErrorListener(({ cause }) => {
       setError(cause);
       setRecognizing(false);
@@ -98,7 +122,6 @@ Register some listeners
     return () => {
       onSpeechStart.remove();
       onSpeechResult.remove();
-      onSpeechCancelled.remove();
       onSpeechError.remove();
       onSpeechEnd.remove();
     };
@@ -107,21 +130,14 @@ Register some listeners
 
 There are some functions available to call such as:
 
-* ExpoStt.startSpeech()
-* ExpoStt.stopSpeech()
-* ExpoStt.cancelSpeech()
-* ExpoStt.destroySpeech()
-* ExpoStt.requestRecognitionPermission()
-* ExpoStt.checkRecognitionPermission()
+- ExpoStt.startSpeech()
+- ExpoStt.stopSpeech()
+- ExpoStt.destroySpeech()
+- ExpoStt.requestRecognitionPermission()
+- ExpoStt.checkRecognitionPermission()
 
 Take a look into `example/App.tsx` for completed example
 
 # Contributing
 
-Contributions are very welcome! Please refer to guidelines described in the [contributing guide]( https://github.com/expo/expo#contributing).
-
-## Author
-
-I am looking for a job as a React native developer, remote work is preferred.
-
-Check out my CV: https://anhtuank7c.github.io
+Contributions are very welcome! Please refer to guidelines described in the [contributing guide](https://github.com/expo/expo#contributing).
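
Reviewer sketch (not part of the commit): the sequence diagram added to the README implies an event order of `onSpeechStart`, `onSpeechEnd`, `onSpeechResult`, with `onSpeechError` as the alternative path. One way an app might track that flow is a small pure reducer, shown below. The names `SttEvent`, `SttState`, `initialState` and `reduceStt` are hypothetical and are not exported by expo-stt; the event shapes are simplified for illustration.

```typescript
// Hypothetical names: SttEvent, SttState, initialState and reduceStt are
// illustrative only and are not part of the expo-stt API.
type SttEvent =
  | { type: "onSpeechStart" }
  | { type: "onSpeechEnd" }
  | { type: "onSpeechResult"; results: string[] }
  | { type: "onSpeechError"; message: string };

interface SttState {
  recognizing: boolean;
  spokenText: string;
  error: string | null;
}

const initialState: SttState = { recognizing: false, spokenText: "", error: null };

// Pure state transition: each event from the sequence diagram maps to one update.
function reduceStt(state: SttState, event: SttEvent): SttState {
  switch (event.type) {
    case "onSpeechStart":
      // A new utterance began; clear any previous error.
      return { ...state, recognizing: true, error: null };
    case "onSpeechEnd":
      // The user stopped speaking; results may still arrive afterwards.
      return { ...state, recognizing: false };
    case "onSpeechResult":
      return { ...state, recognizing: false, spokenText: event.results.join(" ") };
    case "onSpeechError":
      return { ...state, recognizing: false, error: event.message };
  }
}
```

In a component, each listener registered in the usage snippet could dispatch the matching event into this reducer rather than calling several setters separately.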
diff --git a/android/src/main/AndroidManifest.xml b/android/src/main/AndroidManifest.xml
@@ -1,2 +1,8 @@
-<manifest package="expo.modules.stt">
-</manifest>
+<manifest 
+  xmlns:android="http://schemas.android.com/apk/res/android"
+  package="expo.modules.stt"
+  xmlns:tools="http://schemas.android.com/tools">
+    <uses-sdk tools:overrideLibrary="com.facebook.react" />
+    <uses-permission android:name="android.permission.RECORD_AUDIO" />
+    <uses-permission android:name="android.permission.INTERNET" />
+</manifest>
diff --git a/android/src/main/java/expo/modules/stt/ExpoSttModule.kt b/android/src/main/java/expo/modules/stt/ExpoSttModule.kt
@@ -1,41 +1,43 @@
 package expo.modules.stt
 
 import android.util.Log
-import expo.modules.kotlin.activityresult.AppContextActivityResultLauncher
 import expo.modules.kotlin.modules.Module
 import expo.modules.kotlin.modules.ModuleDefinition
 import kotlinx.coroutines.CoroutineScope
 import kotlinx.coroutines.Dispatchers
 import kotlinx.coroutines.launch
 import java.util.*
+import android.speech.SpeechRecognizer
+import android.speech.RecognizerIntent
+import android.content.Intent
+import android.speech.RecognitionListener
+import android.os.Bundle
+import expo.modules.kotlin.exception.CodedException
+import android.Manifest
+import android.content.pm.PackageManager
+import androidx.core.app.ActivityCompat
+import androidx.core.content.ContextCompat
 
-class ExpoSttModule : Module() {
-    private lateinit var voiceRecognizer: AppContextActivityResultLauncher<VoiceRecognizerContractOptions, VoiceRecognizerContractResult>
+class ExpoSttModule : Module(), RecognitionListener {
     private var isRecognizing: Boolean = false
+    private var speech: SpeechRecognizer? = null
 
     companion object {
-        const val onSpeechResult = "onSpeechResult"
-        const val onSpeechError = "onSpeechError"
-        const val onSpeechCancelled = "onSpeechCancelled"
         const val onSpeechStart = "onSpeechStart"
+        const val onSpeechResult = "onSpeechResult"
+        const val onPartialResults = "onPartialResults"
         const val onSpeechEnd = "onSpeechEnd"
+        const val onSpeechError = "onSpeechError"
         const val TAG = "ExpoStt"
     }
 
     override fun definition() = ModuleDefinition {
         Name(TAG)
 
-        /**
-         * We don't need any permission for Recognizer on Android
-         * Just act like iOS one to unify the APIs
-         */
+        Events(onSpeechStart, onSpeechResult, onPartialResults, onSpeechEnd, onSpeechError)
+
         AsyncFunction("requestRecognitionPermission") {
-            return@AsyncFunction mapOf(
-                "status" to "granted",
-                "expires" to "never",
-                "granted" to true,
-                "canAskAgain" to true
-            )
+            requestRecognitionPermission()
         }
 
         AsyncFunction("checkRecognitionPermission") {
@@ -48,30 +50,28 @@ class ExpoSttModule : Module() {
         }
 
         Function("startSpeech") {
+            if (!isPermissionGranted()) {
+                requestRecognitionPermission()
+                return@Function false
+            }
+
             if (isRecognizing) {
                 sendEvent(onSpeechError, mapOf("cause" to "Speech recognition already started!"))
                 return@Function false
             }
-
-            isRecognizing = true
-            sendEvent(onSpeechStart)
-            val options = VoiceRecognizerContractOptions(Locale.getDefault())
+
+            if (speech != null) {
+                speech?.destroy()
+                speech = null
+            }
+
             CoroutineScope(Dispatchers.Main).launch {
-                when (val result = voiceRecognizer.launch(options)) {
-                    is VoiceRecognizerContractResult.Success -> {
-                        isRecognizing = false
-                        sendEvent(onSpeechResult, mapOf("value" to result.value))
-                        sendEvent(onSpeechEnd)
-                    }
-                    is VoiceRecognizerContractResult.Cancelled -> {
-                        isRecognizing = false
-                        sendEvent(onSpeechCancelled)
-                    }
-                    is VoiceRecognizerContractResult.Error -> {
-                        isRecognizing = false
-                        sendEvent(onSpeechError, mapOf("cause" to result.cause))
-                    }
-                }
+                speech = SpeechRecognizer.createSpeechRecognizer(appContext.reactContext)
+                speech?.setRecognitionListener(this@ExpoSttModule)
+                val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
+                intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())
+
+                speech?.startListening(intent)
             }
             return@Function true
         }
@@ -82,24 +82,112 @@ class ExpoSttModule : Module() {
          * A bit different compare to iOS APIs
          */
         Function("stopSpeech") {
+            if (speech != null) {
+                speech?.stopListening()
+            }
+            isRecognizing = false
             Log.d(TAG, "Stop Voice Recognizer")
-        }
 
-        Function("cancelSpeech") {
-            Log.d(TAG, "Cancel Voice Recognizer")
+            return@Function null as Any?
         }
 
         Function("destroySpeech") {
+            if (speech != null) {
+                speech?.destroy()
+            }
+            isRecognizing = false
             Log.d(TAG, "Destroy Voice Recognizer")
+
+            return@Function null as Any?
+        }
+
+    }
+
+    private fun requestRecognitionPermission(): Map<String, Any> {
+        val currentActivity = appContext.currentActivity ?: throw CodedException("Activity is null")
+        val permission = Manifest.permission.RECORD_AUDIO
+        val isGranted = ContextCompat.checkSelfPermission(currentActivity, permission) == PackageManager.PERMISSION_GRANTED
+
+        if (!isGranted) {
+            ActivityCompat.requestPermissions(currentActivity, arrayOf(permission), 1)
+        }
+
+        return mapOf(
+                "status" to if (isGranted) "granted" else "denied",
+                "expires" to "never",
+                "granted" to isGranted,
+                "canAskAgain" to true
+        )
+    }
+
+    private fun isPermissionGranted(): Boolean {
+        val permission = Manifest.permission.RECORD_AUDIO
+        val res = appContext.reactContext?.checkCallingOrSelfPermission(permission)
+        return res == PackageManager.PERMISSION_GRANTED
+    }
+
+    override fun onReadyForSpeech(params: Bundle?) {
+        Log.d(TAG, "onReadyForSpeech")
+    }
+
+    override fun onBeginningOfSpeech() {
+        isRecognizing = true
+        sendEvent(onSpeechStart)
+        Log.d(TAG, "onBeginningOfSpeech")
+    }
+
+    override fun onResults(results: Bundle?) {
+        isRecognizing = false
+        val matches = results?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
+
+        if (matches != null) {
+            sendEvent(onSpeechResult, mapOf("results" to matches))
+        } else {
+            sendEvent(onSpeechError, mapOf("errorMessage" to "No speech results"))
         }
 
-        RegisterActivityContracts {
-            voiceRecognizer =
-                registerForActivityResult(VoiceRecognizerContract()) { _, _ ->
-                    Log.d(TAG, "handleResultUponActivityDestruction")
-                }
+        Log.d(TAG, "onResults $matches")
+    }
+
+    override fun onPartialResults(partialResults: Bundle?) {
+        val matches = partialResults?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
+        // Log.d(TAG, "onPartialResults $matches")
+    }
+
+    override fun onEndOfSpeech() {
+        isRecognizing = false
+        sendEvent(onSpeechEnd)
+        Log.d(TAG, "onEndOfSpeech")
+    }
+
+    override fun onError(error: Int) {
+        isRecognizing = false
+        val errorMessage = when (error) {
+            SpeechRecognizer.ERROR_AUDIO -> "Audio recording error"
+            SpeechRecognizer.ERROR_CLIENT -> "Client side error"
+            SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS -> "Insufficient permissions"
+            SpeechRecognizer.ERROR_NETWORK -> "Network error"
+            SpeechRecognizer.ERROR_NETWORK_TIMEOUT -> "Network timeout"
+            SpeechRecognizer.ERROR_NO_MATCH -> "No match"
+            SpeechRecognizer.ERROR_RECOGNIZER_BUSY -> "RecognitionService busy"
+            SpeechRecognizer.ERROR_SERVER -> "Error from server"
+            SpeechRecognizer.ERROR_SPEECH_TIMEOUT -> "No speech input"
+            else -> "Unknown error"
         }
+
+        sendEvent(onSpeechError, mapOf("errorMessage" to errorMessage))
+        Log.d(TAG, "onError: $error $errorMessage")
+    }
+
+    override fun onBufferReceived(buffer: ByteArray?) {
+        Log.d(TAG, "onBufferReceived")
+    }
+
+    override fun onRmsChanged(rmsdB: Float) {
+    //    Log.d(TAG, "onRmsChanged: $rmsdB")
+    }
 
-        Events(onSpeechResult, onSpeechError, onSpeechEnd, onSpeechStart, onSpeechCancelled)
+    override fun onEvent(eventType: Int, params: Bundle?) {
+        Log.d(TAG, "onEvent: $eventType")
     }
 }
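
Reviewer sketch (not part of the commit): in the Kotlin diff above, `onResults` sends `onSpeechResult` with the key `results`, and `onError` sends `onSpeechError` with the key `errorMessage`, while the "already started" path in `startSpeech` still uses `cause`. The README usage snippet reads `value` and `cause`. Assuming a caller wants to tolerate both shapes, a defensive reader on the JavaScript side might look like the following. `ErrorPayload`, `ResultPayload`, `errorText` and `resultText` are hypothetical helper names, not part of expo-stt.

```typescript
// Hypothetical helpers: these types and functions are illustrative only.
// They assume the payload keys seen in this diff (`results`, `errorMessage`)
// and the keys used in the README usage snippet (`value`, `cause`).
type ErrorPayload = { cause?: string; errorMessage?: string };
type ResultPayload = { value?: string[]; results?: string[] };

// Prefer the key sent by the new onError callback, then fall back to `cause`.
function errorText(payload: ErrorPayload): string {
  return payload.errorMessage ?? payload.cause ?? "Unknown error";
}

// Prefer the key sent by the new onResults callback, then fall back to `value`.
function resultText(payload: ResultPayload): string[] {
  return payload.results ?? payload.value ?? [];
}
```

A listener could then call `setError(errorText(payload))` or `setSpokenText(resultText(payload).join())` regardless of which key the native side used.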
diff --git a/build/ExpoStt.types.d.ts b/build/ExpoStt.types.d.ts
diff --git a/build/ExpoStt.types.d.ts.map b/build/ExpoStt.types.d.ts.map