Attempt to get it to work with Voco #42
Comments
What is the sample rate of Voco? This software does not support Snips anymore; which version of Snips does Voco use? I can see if I can get it to work with Voco, but I won't put that in master, because Snips is not supported. |
That would rock! I was looking into the header to see if I had to change something there. As far as I know, Voco uses the latest version of Snips. Does Rhasspy use different audio headers from that last version of Snips? The mystery is: why does it sometimes actually work? |
Where can I find these "VoiceActivity" logs and such? The audio produced is just a lot of small wave files. Rhasspy does not differ from Snips in that regard. This is the wave format: http://soundfile.sapp.org/doc/WaveFormat/ The 4 and , are not a literal 4 and , but the printed form of the 4 bytes of the ChunkSize. |
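To make the header layout from the linked WaveFormat page concrete, here is a minimal Python sketch that decodes the canonical 44-byte PCM header. It assumes the plain layout from that page (a `fmt ` chunk immediately followed by the `data` chunk); the function name `parse_wav_header` is hypothetical and not part of this project. A message with extra chunks will not match this fixed layout, which is itself a useful hint when comparing streams.

```python
import struct

# Canonical 44-byte PCM WAV header, all integers little-endian.
_HEADER_FORMAT = "<4sI4s4sIHHIIHH4sI"


def parse_wav_header(data: bytes) -> dict:
    """Decode the first 44 bytes of a plain PCM WAV message."""
    if len(data) < 44:
        raise ValueError("need at least 44 bytes for a canonical WAV header")
    (chunk_id, chunk_size, wave_format,
     fmt_id, fmt_size, audio_format, channels, sample_rate,
     byte_rate, block_align, bits_per_sample,
     data_id, data_size) = struct.unpack(_HEADER_FORMAT, data[:44])
    return {
        "chunk_id": chunk_id,          # b"RIFF"
        "chunk_size": chunk_size,      # total length minus 8
        "format": wave_format,         # b"WAVE"
        "fmt_id": fmt_id,              # b"fmt "
        "fmt_size": fmt_size,          # 16 for PCM
        "audio_format": audio_format,  # 1 means uncompressed PCM
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
        "data_id": data_id,            # b"data" in the canonical layout
        "data_size": data_size,
    }
```

If `data_id` comes back as something other than `b"data"`, the sender has put another chunk between `fmt ` and `data`, and the fixed 44-byte layout does not apply.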
I will use this branch https://github.com/Romkabouter/ESP32-Rhasspy-Satellite/tree/voco |
Ah, interesting. I figured it might matter. I also wonder if the To debug, I use these commands: Snips Watch: Looking at ALL MQTT traffic: And if you need to quickly restart the gateway for some reason: |
Yes, when I see this: In the streamer I use a fixed header, because every WAV audio message sent has the same length and format :) |
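A rough Python illustration of the fixed-header idea: because every message carries the same number of samples in the same format, the header can be built once and prepended to each frame. The 16 kHz, mono, 16-bit format and the 256 samples per frame are assumptions; they were chosen because 44 header bytes plus 512 payload bytes gives the 556 bytes per message mentioned elsewhere in this thread. This is not the streamer's actual code, and the names are hypothetical.

```python
import struct

SAMPLE_RATE = 16000       # assumption, not confirmed in this thread
CHANNELS = 1              # assumption: mono
BITS_PER_SAMPLE = 16      # assumption: 16-bit PCM
SAMPLES_PER_FRAME = 256   # assumption: 256 samples * 2 bytes = 512 bytes


def build_fixed_header(num_samples: int = SAMPLES_PER_FRAME) -> bytes:
    """Build a canonical 44-byte PCM WAV header for one fixed-size frame."""
    bytes_per_sample = CHANNELS * BITS_PER_SAMPLE // 8
    data_size = num_samples * bytes_per_sample
    byte_rate = SAMPLE_RATE * bytes_per_sample
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_size, b"WAVE",       # ChunkSize = 36 + data size
        b"fmt ", 16, 1, CHANNELS, SAMPLE_RATE,  # 16-byte PCM fmt chunk
        byte_rate, bytes_per_sample, BITS_PER_SAMPLE,
        b"data", data_size,
    )


# Built once, reused for every frame because length and format never change.
FIXED_HEADER = build_fixed_header()


def wrap_frame(pcm: bytes) -> bytes:
    """Prepend the fixed header to one frame of raw PCM samples."""
    expected = SAMPLES_PER_FRAME * CHANNELS * BITS_PER_SAMPLE // 8
    if len(pcm) != expected:
        raise ValueError("frame must be exactly %d bytes" % expected)
    return FIXED_HEADER + pcm
```

The catch with a fixed header is that it is only valid while the frame size stays constant, which is why `wrap_frame` rejects any other length.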
Yes I've only seen that
0_0 :-) |
AI, that doesn't sound good... Is that with a different wav encoding? Are you using the 6.0 version? |
No, this is Snips installed on a Pi with an attached mic. Nothing to do with this code. That is why I installed the demo assistant. As you can see, the hotword is detected ok, I also tested the mic outside snips. All works. ok edit, no ASR is running ;) |
Keeping my fingers crossed over here :-) |
I followed this step and now snips is working :) |
Hey wow!! Great work! What's still left to do? It looks to me like a 100% success? |
This was a test with a Pi with a built-in mic. The next step for me is to check and adjust the code for the streamer to get Snips going (again). |
Ah I see. Are you sure the effort is worth it? We could just wait for Voco to move to Rhasspy. That has to happen at some point anyway. |
Well, it would be nice to find what the issue is, but that is just me :D I see RIFF4 in the messages, as you also already found. The number of bytes sent per message is also 572 instead of the 556 that the streamer sends. |
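The "RIFF4" and the 572-byte length fit together, as this small Python sketch shows. It assumes each message is a complete RIFF file, so that ChunkSize equals the message length minus 8 (the standard RIFF rule); `riff_prefix` is a hypothetical helper for illustration only.

```python
import struct


def riff_prefix(message_length: int) -> bytes:
    """Return the first 8 bytes of a RIFF file of the given total length."""
    chunk_size = message_length - 8   # RIFF rule: everything after ChunkSize
    return b"RIFF" + struct.pack("<I", chunk_size)


for length in (556, 572):
    print(length, riff_prefix(length))
# 556 b'RIFF$\x02\x00\x00'
# 572 b'RIFF4\x02\x00\x00'
```

For 572 bytes the ChunkSize is 564 = 0x0234, and its first little-endian byte 0x34 is the ASCII character "4". For 556 bytes the ChunkSize is 548 = 0x0224, whose first byte 0x24 prints as "$". So the character shown after RIFF is simply a readout of the message length.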
That's what I figured as well. |
Sorry about my lack of commitment, my sd card died so I had to start over which I did not have time for. |
No worries. Pretty busy over here as well :-) |
I have the code working with snips, however there are some header changes which I can't get right atm. |
That's great news! I'm going to check this out asap! |
Where can I find the code? I had a look at the voco branch, but that doesn't seem to be it? |
Just pushed it. I have also included a record.py which records a stream for a couple of seconds. The branch I just pushed works with the Atom Echo, but the record level seems to be very low. This will also be the case with the master branch, I assume. |
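One way to put a number on "very low" is to measure the peak and RMS level of the recorded samples in dBFS. The sketch below is not the record.py from the branch; it is a hypothetical helper that assumes the payload is 16-bit signed little-endian PCM with the WAV header already stripped.

```python
import math
import struct

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit signed sample


def to_dbfs(value: float) -> float:
    """Convert a linear sample magnitude to decibels relative to full scale."""
    if value <= 0:
        return float("-inf")
    return 20.0 * math.log10(value / FULL_SCALE)


def pcm_levels(pcm: bytes) -> tuple:
    """Return (peak_dbfs, rms_dbfs) for 16-bit little-endian PCM bytes."""
    count = len(pcm) // 2
    if count == 0:
        raise ValueError("no complete samples in input")
    samples = struct.unpack("<%dh" % count, pcm[:count * 2])
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return to_dbfs(peak), to_dbfs(rms)
```

Comparing these two numbers for the same spoken phrase on the Atom Echo and on a USB microphone would show how large the level difference really is.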
Cool, I will try it now! |
It took a bit of work to be able to upload it via the Arduino IDE again. I had to strip out the LED parts. Now that it uploads, I get this error. Nothing to worry about, I just have to look into it.
|
Got a bit further. Should I change some of the settings to get it to continuously stream audio? I'm assuming I should not use hotword detection. The
|
Gain is actually only used in the Matrix Voice, I think. Expect unexpected results! |
Good news: I managed to get it to detect a hotword by shouting very loudly. I'm looking closer at how the back-and-forth with Snips is going. After it detects the hotword, the ASR doesn't receive audio (timeout).
|
There is a doubling going on again it seems.
|
These are some messages going to the ASR:
The StartSignalMS seems to be a strange value: -20. Maybe that's because the time data isn't in the audio stream? |
As I do not know what your code looks like, I do not know where the doubling occurs. Is your ASR listening? That depends on the snips.toml file, I believe. |
The ASR does work for other satellites in the house, which are based on Voco/Snips. Perhaps they are sending an extra message. The latest Arduino code can be found here: https://github.com/flatsiedatsie/voco_mini_sat |
Ok, checking your code.
|
Sharp eyes -) I was trying to stop and then restart the ASR, hoping that would fix the issue. But then I tried skipping the HotwordDetected state altogether. So currently that code is never called. All the HotwordDetected state did was stop the stream and restart it, which I suspected wasn't needed if there wasn't on-board hotword detection being done. I've only removed the wifi password :-) Just in case you'd like to try uploading via the Arduino IDE yourself:
(maybe restart the IDE) Then under
I believe I've managed to remove the double call of the MMQTTDisconnected state. The The strange thing is that the ASR stops responding for the entire system if I use the AtomEcho. The ASR also stops responding to the main microphone, although it still does hotword detection fine. A session is also created just fine. Sometimes the ASR stops working altogether, and sometimes it will work 50% of the time, intermittently: after a successful run it will not respond the next time, until it times out, and then it starts responding again after that, and so forth. This only seems to happen if the AtomEcho is on the network. The AtomEcho also seems to go into reboot loops. I'm not sure how that's even possible. It's as if it remembers that the previous time it booted up, it failed, and will continue to do so until I unplug it, and then plug it in again. |
Just saw another strange situation where I disconnected the AtomEcho, and then the ASR started only listening for 1 second on the main microphone.
After that it reverted to the intermittent "ASR listens, ASR is deaf" situation. |
I've tried to manually run the ASR and check its output. Here's what happens with a "normal" call from Voco:
And this is all that happens with the AtomEcho:
|
It also initializes the wave header and updates the led status. I recommend not to fiddle with the status too much.
What is this azrxidia I see in all your messages? Can you try to stop that stream? |
I'd be happy to. Here's the I've also stripped out the LED parts (there was an error I couldn't fix, so I just stripped it out completely). I've also removed the OTA updates, since that won't be needed either and I figured it might leave more memory. I've re-enabled the HotwordDetected state, but the result is the same. I'll update the code on github. |
If you remove the methods updateColors(int colors) and updateBrightness(int brightness) in your device code, then nothing will be done :) I think you need to set this for the AudioServer:
That is so that the audioserver actually listens to all audio streams. |
I'll give it a go. I could also add it to I've also added a feature to Voco so that it can provide the current time through an MQTT request. I wanted to experiment with sending the timestamp in the wav header. |
Something else I'm curious about: would it be possible to have the AtomEcho connect based on hostname instead of IP address? I seem to see some hints in the settings this might be possible? if so, then the main controller could infuse that hostname into the AtomEcho at the moment of uploading the code. |
Might be a good idea, then you should have it set for all sections |
It already does if you put a hostname instead of an IP |
Hi @flatsiedatsie, We have come a long way since any activity here. If you require some help from me, please give me a shout. Otherwise I will close this issue at some point in the future. |
@flatsiedatsie it seems Voco is not available anymore as Addon, is that correct? I just cannot find it in the Addon in Webthings. |
Voco is only available on the Raspberry Pi. I spent considerable time on it last time, but unfortunately couldn't get the audio to be coherent enough. Unfortunately in the end I couldn't spend that much time on a 'nice to have' anymore :-( |
Ah ok, that is probably the issue then. I have a Raspberry Pi available now, do you still want me to put some effort in it? |
I still find this interesting, so I have installed WebThings and could now indeed install voco. |
Sure, that would be wonderful! If you live in Amsterdam I can supply you with a good USB mic if you want :-) |
I've uploaded the latest version of the code I was working on here: It would be great if you could try this Arduino workflow (Arduino IDE), because if that works, then it will be possible to flash the code to user devices via the Candle Manager addon for the Webthings Gateway. |
hehe, nope. Some good 200km drive north. But I got one :) |
I have installed WebThing and VoCo on a Pi. When I type "tell me the time", I expected to have audio output. The correct text appears. Is my expectation incorrect? I have set the output to headphone. speaker-test works |
OK, apparently my expectation was incorrect. I got Voco running on a Pi now and it is working :) |
I thought the issue might be caused by the low energy from the M5 so I tried my matrixvoice. I get this error:
So I think it boils down to the audio again. Snips has some extra headers, it might be that this is causing that. I'll see if I can fix it |
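To see which extra headers a message carries, one can walk the RIFF chunk list instead of assuming the fixed 44-byte layout. The Python sketch below lists every chunk ID and size in a message; `list_riff_chunks` is a hypothetical helper, and it assumes each message is a well-formed RIFF/WAVE file.

```python
import struct


def list_riff_chunks(message: bytes) -> list:
    """Return [(chunk_id, size), ...] for every chunk after the WAVE tag."""
    if len(message) < 12 or message[:4] != b"RIFF" or message[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE message")
    chunks = []
    offset = 12  # skip "RIFF", ChunkSize and "WAVE"
    while offset + 8 <= len(message):
        chunk_id, size = struct.unpack_from("<4sI", message, offset)
        chunks.append((chunk_id, size))
        # Each chunk is an 8-byte header plus payload, padded to an even size.
        offset += 8 + size + (size & 1)
    return chunks
```

Running this on one message from each source and comparing the lists should show exactly where the two streams differ. As a sanity check on the arithmetic only: a 16-byte difference between 572 and 556 bytes would be consistent with one additional chunk made of an 8-byte chunk header and an 8-byte payload, but the thread does not confirm what that chunk is.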
Yeah those headers, those indeed seem to be the issue. Glad Voco is working :-) text commands only give text output (designed for quiet operation when kids are sleeping). Voice commands give voice output. |
This is a continuation of discussion here.
I've managed to get Snips to recognise the wake-word, but only right after booting the Atom Echo.
Snips does recognise that there is audio input.
While in idle mode, Snips Watch indicates that audio is being heard.
If I press the button to start a session, a session is created, and the dialogue manager listens to the stream from the Atom Echo. But the voice input is not recognised as a voice command:
If I don't speak into the Raspberry Pi version, then things look a bit different.
So that would support the idea that some MQTT message is missing.
As an aside, I also noticed that the wave header is slightly different:
ESP32
USB microphone on Raspberry Pi:
I also checked the output of the various commands in Mosquitto to find out which exact message was missing.
----Atom Echo button----
----Atom Echo hotword detected----