Two-way audio discussion. #738

Open
hjdhjd opened this issue Aug 18, 2020 · 50 comments

Comments

@hjdhjd

hjdhjd commented Aug 18, 2020

@Sunoo to continue the discussion on two-way audio from elsewhere:

With respect to the topic of two-way audio we started...I think the way you get it back to the camera is ultimately going to be camera-specific. For example, I have two plugins I actively work on - homebridge-unifi-protect2 and homebridge-doorbird. Both have two different ways of taking audio back.

The way I’d suggest solving this in a general way is to not worry about it and make it the user’s problem to a degree. Use a callback or function pointer to call a custom function that has the job of sending that audio out to its final destination.

Perhaps create an extension system where people can contribute camera/manufacturer-specific two-way “send” functions...and then you can enable support in a consistent, generic way.

Those of us that look to your codebase as a starting point can leverage that as well to complete our own implementations. Thoughts?
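A minimal sketch of the callback/extension idea described above, written as an illustration only. Every name here (`ReturnAudioSender`, `ReturnAudioRegistry`, the `"doorbird"` key, the `192.168.1.50` address) is hypothetical and does not exist in homebridge-camera-ffmpeg; it only shows how camera-specific "send" functions could be registered and invoked in a generic way.

```typescript
// Hypothetical names throughout: none of these exist in homebridge-camera-ffmpeg.

/** A function that delivers one chunk of return audio to a specific kind of device. */
type ReturnAudioSender = (chunk: Uint8Array, target: string) => void;

/** Registry mapping a manufacturer key (e.g. "doorbird") to its contributed sender. */
class ReturnAudioRegistry {
  private readonly senders = new Map<string, ReturnAudioSender>();

  register(kind: string, sender: ReturnAudioSender): void {
    this.senders.set(kind.toLowerCase(), sender);
  }

  /** Returns true if a sender handled the chunk, false if none is registered. */
  send(kind: string, chunk: Uint8Array, target: string): boolean {
    const sender = this.senders.get(kind.toLowerCase());
    if (!sender) {
      return false;
    }
    sender(chunk, target);
    return true;
  }
}

// Usage: a contributed "send" function that only records what it was given.
const registry = new ReturnAudioRegistry();
const delivered: string[] = [];
registry.register("doorbird", (chunk, target) => {
  delivered.push(`${chunk.length} bytes -> ${target}`);
});
registry.send("DoorBird", new Uint8Array([1, 2, 3]), "192.168.1.50");
```

A real sender would replace the recording callback with whatever transport the device needs; the generic plugin would only ever call `send`.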

@Sunoo
Collaborator

Sunoo commented Aug 18, 2020

I’d just have to think of how that would actually work. I have done some things with helper plugins to add support for FTP and SMTP motion alerts, perhaps I could come up with some way of doing a similar thing for audio return.

I’m a little reluctant to add just dead code that’s only used by people who fork the plugin, but that would obviously be the easiest path forward. I’m just having a little trouble envisioning a reasonable way for two plugins to talk to each other to accomplish this.

@Sunoo
Collaborator

Sunoo commented Aug 18, 2020

@hjdhjd Oh, and semi-related, but me and a bunch of other camera plugin developers are on the Homebridge Discord, if you ever wanted to talk there instead.

@hjdhjd
Author

hjdhjd commented Aug 18, 2020

Happy to. Point me to it? We can pick this up there.

@Sunoo
Collaborator

Sunoo commented Aug 18, 2020

@llemtt

llemtt commented Aug 19, 2020

HomeKit manages two-way audio like an RTSP "backchannel". ffmpeg won't support it easily because "out of the box" it handles many input streams but just one output! Changing this behavior requires a deep rewrite of the ffmpeg run loop and overall architecture, so in the end it won't be ffmpeg anymore but a new program.

I successfully run two-way audio with a UDP proxy that "splits" the RTCP and backchannel traffic, plus another ffmpeg instance to handle the latter, like homebridge-doorbird does.
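To illustrate the "splitting" step, here is a hedged sketch of how a proxy might tell RTCP packets apart from RTP audio packets arriving on the same UDP port. It is not the proxy described above (that code is not shown in this thread); it applies the RFC 5761 convention that a second header byte of 192 to 223 marks RTCP. The function name `classifyPacket` is made up for this example.

```typescript
type PacketKind = "rtp" | "rtcp" | "unknown";

/**
 * Classify a UDP payload received on a port that carries both RTP and RTCP.
 * Assumption: the sender follows RFC 5761, so RTP payload types avoid 64-95
 * and the second byte of an RTCP packet falls in 192..223.
 */
function classifyPacket(packet: Uint8Array): PacketKind {
  const first = packet[0];
  const second = packet[1];
  // Need at least two bytes, and the version field (top two bits) must be 2.
  if (first === undefined || second === undefined || first >> 6 !== 2) {
    return "unknown";
  }
  return second >= 192 && second <= 223 ? "rtcp" : "rtp";
}

// Usage: an RTCP sender report (packet type 200) versus RTP with payload type 96.
const senderReport = classifyPacket(new Uint8Array([0x80, 200, 0, 6]));
const audioPacket = classifyPacket(new Uint8Array([0x80, 96, 0, 1]));
```

A proxy would forward packets classified as `"rtp"` to the second ffmpeg instance and handle or relay the `"rtcp"` ones separately.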

Unfortunately I hate TypeScript while I love JavaScript being an "untyped" true scripting language (do you know Lisp?) so I won't contribute to TypeScript plugins, sorry.

I also started to think that, given the increased complexity of the task, it would be better to build camera bridges using Apple's open source HK framework.

cheers

@Sunoo
Collaborator

Sunoo commented Aug 19, 2020

The easy way to handle that is just to run a second instance of FFmpeg to handle the return audio.

The more complex issue I’m thinking through is how to handle actually sending audio back at the camera. I’ve basically come to the conclusion that I’ll have to pass the audio off to another Homebridge plugin that can handle actually sending it on to the camera.

@llemtt

llemtt commented Aug 19, 2020

I’ve basically come to the conclusion that I’ll have to pass the audio off to another Homebridge plugin that can handle actually sending it on to the camera.

Why do you think that? I actually use the output of the return "ffmpeg" instance.

BTW you can't use the RTSP backchannel even if your camera supports it (ONVIF requires that for two-way audio); you have to use some other separate API. (The RTSP backchannel requires negotiation/setup/control to be done in the context of the RTSP session that is actually managed by the "main" ffmpeg instance.)
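For reference, a small sketch of why the backchannel is tied to the RTSP session: per the ONVIF streaming specification, the client asks for it with a `Require: www.onvif.org/ver20/backchannel` header on the DESCRIBE request of that same session. The helper name `buildBackchannelDescribe` is hypothetical, and the URL is the sample source that appears in a config later in this thread; this only builds the request text and does not talk to a camera.

```typescript
/**
 * Build the text of an RTSP DESCRIBE request that asks for an ONVIF audio
 * backchannel. Illustration only: no authentication, no network I/O.
 */
function buildBackchannelDescribe(url: string, cseq: number): string {
  const lines = [
    `DESCRIBE ${url} RTSP/1.0`,
    `CSeq: ${cseq}`,
    "Accept: application/sdp",
    // Feature tag defined by ONVIF for the bidirectional audio connection.
    "Require: www.onvif.org/ver20/backchannel",
  ];
  // Like HTTP, RTSP ends each header line with CRLF and the header block with an empty line.
  return lines.join("\r\n") + "\r\n\r\n";
}

// Usage with the sample RTSP source shown elsewhere in this thread.
const describeRequest = buildBackchannelDescribe("rtsp://192.168.1.3/h264_stream", 1);
```

Because this header has to be sent by whoever owns the RTSP session, a second, independent process cannot simply attach to the backchannel afterwards.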

@Sunoo
Collaborator

Sunoo commented Aug 19, 2020

Because as far as I can tell, there is no standard for sending audio back to cameras. Every camera seems to do its own thing. Handing that implementation off to a camera-specific plugin seems like the best approach to me?

Or am I mistaken and there is a standard, or small set of standards, that I could have a hope of implementing? I don’t actually own any cameras that have two way audio support at the moment, so I can’t say for sure.

@hjdhjd
Author

hjdhjd commented Aug 19, 2020

There are lots of different (often proprietary) ways to get audio back. Just look at Ring, Nest, Doorbird, and UniFi Protect Doorbell (I develop for the last two)...they all have different mechanisms for getting audio back.

@hjdhjd
Author

hjdhjd commented Aug 19, 2020

@llemtt I saw your note above...do you have an example some of us (i.e. me 😄) can use as at least a reference point as we tackle two-way audio in our respective plugins? I get TS isn't your jam. And yeah...I know Lisp...and elisp...and that's taking me back a ways. 😄

Specifically...what're the ffmpeg (or other tools) and the respective command lines you used to execute a UDP proxy?

Right now, I'm playing with just plain trying to get audio streamed via UDP to a device. It doesn't seem to accept RTSP, it looks more like just pure AAC over UDP. Thoughts?

@Sunoo
Collaborator

Sunoo commented Aug 19, 2020

@hjdhjd You can see a good example of that in the Ring plugin with the RTPSplitter dgreif wrote for that.

@hjdhjd
Author

hjdhjd commented Aug 19, 2020

@Sunoo I've looked at it...I'm concerned about timing issues with that particular approach, but it's a start for sure.

@Sunoo
Collaborator

Sunoo commented Aug 19, 2020

I don’t know how you’d do it otherwise, honestly.

@hjdhjd
Author

hjdhjd commented Aug 19, 2020

That's what makes these things so fun... 😄

@llemtt

llemtt commented Aug 19, 2020

ffmpeg.js.txt

attached is my version of ffmpeg.js I currently use inside the homebridge-videodoorbell plugin

The Ring plugin does almost exactly what I did (I even copied their fdk-aac codec configuration because it works better than mine).

There was also a proxy rtp implementation inside hap-nodejs, but I never understood how to get it working.

@llemtt

llemtt commented Aug 19, 2020

Right now, I'm playing with just plain trying to get audio streamed via UDP to a device. It doesn't seem to accept RTSP, it looks more like just pure AAC over UDP. Thoughts?

Can you already stream an audio (file) with ffmpeg to that device?

@hjdhjd
Author

hjdhjd commented Aug 19, 2020

@llemtt Thanks I'll take a look in a bit. As to streaming via ffmpeg from the command line...that's the step I'm currently battling with at the moment. Step 1 is to be able to get anything to output out of the damn thing period. As I said...it looks like it takes AAC over UDP...just trying to figure out how to send it without encapsulating it in a transport protocol like mpegts or others...thoughts?

@Sunoo
Collaborator

Sunoo commented Aug 20, 2020

I spent a chunk of today doing some digging into this topic, and it seems like there are two main methods of sending audio back to a camera: VAPIX's HTTP POST-based method, and ONVIF's RTSP audio backchannel.

Both methods look possible to support using FFmpeg, so I plan to drop the extension plugin idea (at least for now) and target both of those to begin with. Just looking at standards, VAPIX looks easier, however based on what I've read today, RTSP backchannel looks to be possible in most if not all cases without technically doing the ONVIF negotiation. I believe I should be able to implement actual ONVIF support if needed though, but that wouldn't be part of the initial two-way audio version.

I'm trying to track down a cheap camera that supports one or both of these methods to use to develop against. It looks like the cheapest reasonable camera will likely be a second-hand AXIS camera which supports VAPIX; I just need to track one down on eBay or similar. Unfortunately, ONVIF Profile T cameras (the profile that supports two-way audio) seem to be much more expensive and harder to identify, so I doubt I'll end up getting my hands on one.

@hjdhjd
Author

hjdhjd commented Aug 20, 2020

It's a simpler, if less flexible, approach. I'll be watching eagerly.

@Sunoo
Collaborator

Sunoo commented Aug 20, 2020

Yea, but it should be a decent starting place at least. I may revisit the idea of kicking the audio over to other extensions in the future, but I'm wondering if those scenarios make more sense for someone to just fork this plugin at that point.

@hjdhjd
Author

hjdhjd commented Aug 20, 2020

Totally agree. This is going to be an iterative process, no doubt. Let's start somewhere. Eager to see the next step once you find a camera or two...

@llemtt

llemtt commented Aug 20, 2020

I agree too. One-way audio cameras, two-way audio (surveillance) cameras and video doorbells are different products meant to support very different use cases, although they share 99% of the technology, so one plugin to control them all may not be a good idea.

Cameras with two-way audio are few and usually expensive, and some of them don't even incorporate a speaker or an amplifier (just line-level out), which means you must buy and install additional hardware. It makes a lot of sense to me to assess what's actually on the market before going ahead and deciding what to eventually support, although the solution I implemented is "configurable" and works with whatever camera or device you can send audio to using ffmpeg (e.g. a raspi DIY camera).

VAPIX and other HTTP POST based devices are the easiest to work with indeed!

@longzheng
Contributor

longzheng commented Aug 25, 2020

I use this plugin for my 2N doorbell (among other IP cameras). Two-way audio support interests me greatly because currently I have to use/host a 3CX SIP server and run the 3CX app on my phone for doorbell functionality to work.

I'm not sure if it covers all use cases but I believe most IP-based doorbells (e.g. Ring, DoorBird) support the SIP protocol for two-way audio communication. I wonder/suggest if supporting SIP natively would cover most of the use cases?

I had a quick look into how the homebridge-ring plugin works, and it's my understanding that it basically initiates a SIP call with the doorbell (as I understand it, all SIP-compatible devices can act as both SIP servers and SIP clients) with the audio encoded by ffmpeg and packaged as SRTP. If that could be standardized that would be amazing.

@llemtt

llemtt commented Aug 25, 2020

@longzheng If you want your doorbell to "ring" your phone, you have to use SIP or something similar (FaceTime? WZP?) because HomeKit can only trigger a notification that barely emits a single "ping" I can never hear.

I do that in my plugin using linphone, just to ring the phone, then I go into the HomeKit camera to talk. If HomeKit's video doorbell support doesn't improve, I'll move back to a SIP-like solution.

I also considered buying the 2N, but it looked too expensive, and DIY is more fun!

@longzheng
Contributor

longzheng commented Aug 25, 2020

@longzheng If you want your doorbell "ring" your phone you have to use SIP or something similar (Facetime? WZP?) because HomeKit can only trigger a notification that barely emits a single "ping" I can never hear.

I do that in my plugin using linphone, just to ring the phone then I get into the homekit camera to talk. If homekit support of videodoorbell doesn't improve I'll move back to a SIP-like solution.

I considered also buying the 2N, but it looked to much expensive and diy it's more funny!

Right, I've already set up this plugin's doorbell feature using a HTTP trigger (the 2N doorbell has UI to configure HTTP triggers and events).

I do see the doorbell notifications but I admit because I've got the 3CX SIP call at the same time, I don't know if I would miss HomeKit-only doorbell notifications.

What do you use to "ring" the phone? VOIP or something like Twilio?

@llemtt

llemtt commented Aug 25, 2020

What do you use to "ring" the phone? VOIP or something like Twilio?

I have a SIP account on the linphone (free) registrar and I use the linphonec CLI client on the raspi to issue the call.

@Sunoo
Collaborator

Sunoo commented Aug 25, 2020

@longzheng I’m gonna be honest, I don’t expect that I’ll be adding SIP support to this plugin, that sounds like the kind of thing best suited for a fork specific to that device. I’m just not sure that SIP is common enough of a return audio method, and it would add a decent amount of complexity.

If you have any documentation on how your doorbell works, I’ll take a look though.

@longzheng
Contributor

@Sunoo No worries, appreciate the heads up.

The actual standard/protocol is called "SIP Direct Call", which allows you to make a connection to the device without a SIP server/proxy (as SIP is normally set up). Basically I can use a standard SIP client/app and point it to the IP of the doorbell, and initiating a SIP call will just work. Some info about it here https://stackoverflow.com/questions/8516133/how-can-i-make-call-between-direct-ip-to-ip-without-sip-server

There's not much documentation about it on my doorbell's manufacturer's website except to say it works https://wiki.2n.cz/hip/inte/latest/en/1-pbx/direct-call

homebridge-ring seems to use this SIP Direct Call behaviour in their plugin for the two-way audio functionality dgreif/ring@0bdb154

@Sunoo
Collaborator

Sunoo commented Aug 26, 2020

Yea, I’ve talked to dgreif quite a bit about two-way in general (and also use the Ring plugin myself), and have been glad that I’ve managed to avoid dealing with SIP myself. :P

Maybe once I get easier methods done I’ll jump into SIP, but without a device to test against, it’ll be very hard to be sure I’ve got anything right.

@longzheng
Contributor

Yea, I’ve talked to dgreif quite a bit about two-way in general (and also use the Ring plugin myself), and have been glad that I’ve managed to avoid dealing with SIP myself. :P

Yeah I appreciate what you mean, I took a look at the SIP code as well and it is pretty complex to understand.

Maybe once I get easier methods done I’ll jump into SIP, but without a device to test against, it’ll be very hard to be sure I’ve got anything right.

So it is my understanding a lot of SIP clients/apps also support SIP direct calling behaviour, so in theory you should be able to install a SIP app on a Mac/Windows/iOS/Android device and then "call" that device using the IP.

For example this is a list of "softphones" that my doorbell claims to work via Direct Call as well (in this scenario the intercom would be direct calling the softphone via IP) https://wiki.2n.cz/hip/inte/latest/en/3-softphones I'd imagine for testing you could use any of these softphones as well to simulate as an intercom.

@Sunoo
Collaborator

Sunoo commented Aug 26, 2020

I was assuming that just because a client could make a SIP direct call didn’t mean it could receive a SIP direct call. Maybe that’s not valid. I honestly have not done much with SIP just overall.

@longzheng
Contributor

longzheng commented Aug 26, 2020

I was assuming that just because a client could make a SIP direct call didn’t mean it could receive a SIP direct call. Maybe that’s not valid. I honestly have not done much with SIP just overall.

It is my (rough) understanding from the SIP standard that any client can both send and receive calls directly, hence why "direct" just works across all these SIP devices and SIP apps.

Some reference https://www.3cx.com/docs/direct-sip/

@Sunoo
Collaborator

Sunoo commented Aug 26, 2020

Good to know, I’ll keep that in the back of my mind to possibly work on at some point in the future.

@llemtt

llemtt commented Aug 26, 2020

I was assuming that just because a client could make a SIP direct call didn’t mean it could receive a SIP direct call. Maybe that’s not valid. I honestly have not done much with SIP just overall.

Receiving whatever audio call on the iPhone requires registration on a "notification" server, otherwise you are going to receive a call, also a SIP direct call, only while your app is running.

@longzheng
Contributor

I was assuming that just because a client could make a SIP direct call didn’t mean it could receive a SIP direct call. Maybe that’s not valid. I honestly have not done much with SIP just overall.

Receiving whatever audio call on the iPhone requires registration on a "notification" server, otherwise you are going to receive a call, also a SIP direct call, only while your app is running.

In the context of testing I think that's fine. In the end it's the intention that homebridge would be direct calling the intercom.

@llemtt

llemtt commented Aug 26, 2020

In the context of testing I think that's fine. In the end it's the intention that homebridge would be direct calling the intercom.

Sorry but I can't follow you, what is the "intercom"? Can you describe your use case?

@longzheng
Contributor

In the context of testing I think that's fine. In the end it's the intention that homebridge would be direct calling the intercom.

Sorry but I can't follow you, what is the "intercom"? Can you describe your use case?

I think we got a few things confused.

In the context of what I hoped this plugin with SIP integrated will help me achieve: I will configure the plugin's two-way audio to point to the SIP URL of my intercom directly (e.g. sip:192.168.1.100) so that when I open/view my intercom's camera in HomeKit, Homebridge will direct SIP call the intercom (and the intercom would automatically accept) so that the intercom speakers will play my phone's microphone.

In the context of helping developers test any SIP integration: I was suggesting people without an intercom handy can use any compatible SIP client/app installed on an iOS, Android, Windows or Mac device. During development, they could set up the SIP URL to the SIP client's IP, emulating what would happen if that was an intercom device.
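As a rough illustration of the configuration step described above, the sketch below parses a direct-call SIP URL such as `sip:192.168.1.100` into the host and port a SIP client would contact, falling back to the standard SIP port 5060 when none is given. The names `SipTarget` and `parseSipUri` are invented for this example, the second URI in the usage is made up, and the parser is deliberately simplified (no IPv6 literals, no URI parameters); nothing like it exists in the plugin today.

```typescript
interface SipTarget {
  user: string | undefined;
  host: string;
  port: number;
}

/** Parse a simple "sip:[user@]host[:port]" URI. Simplified; not a full RFC 3261 parser. */
function parseSipUri(uri: string): SipTarget {
  const match = /^sip:(?:([^@]+)@)?([^:;?]+)(?::(\d+))?/i.exec(uri);
  const host = match ? match[2] : undefined;
  if (!match || !host) {
    throw new Error(`Not a sip: URI: ${uri}`);
  }
  const user = match[1];
  const port = match[3];
  // 5060 is the default port for plain (non-TLS) SIP.
  return { user, host, port: port ? Number(port) : 5060 };
}

// Usage: the intercom address from the comment above, and a made-up URI with user and port.
const intercom = parseSipUri("sip:192.168.1.100");
const softphone = parseSipUri("sip:door@10.0.0.5:5062");
```

For testing without an intercom, the same parsed target could just as well point at a softphone's IP address.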

@arcidodo

arcidodo commented Sep 6, 2020

Hi,

I have a Dahua VTO2111D-WP intercom.

I have the stream working with this plugin, and I use Home Assistant / Node-RED to trigger the doorbell with this add-on:
https://github.com/elad-bar/Hassio-addons

EDIT: I see that this repository no longer exists, but the HA add-on is based on this:
https://github.com/elad-bar/DahuaVTO2MQTT

Is it possible to get this device working with the two-way audio feature?

device information:
https://dahuawiki.com/Video_Intercom/Products/DHI_VTO2111D_WP

@Sunoo
Collaborator

Sunoo commented Sep 6, 2020

@arcidodo If it supports Dahua’s HTTP API then yes. I’ve ordered a device with a similar API, so once it arrives, I should be able to provide a config to get you started.

@arcidodo

arcidodo commented Sep 6, 2020

oh that sounds really nice:-)

I found a PDF that explains Dahua's HTTP API:

https://www.planetseguridad.com/descargas/DAHUA_HTTP_API_FOR_DVR_V1.29.pdf

Edit: I read the document and I found that on page 76 they describe how to post audio. Hope this helps!

@arcidodo

arcidodo commented Sep 7, 2020

[9/7/2020, 4:09:11 PM] Homebridge is running on port 51826.
[9/7/2020, 4:09:28 PM] [Camera-ffmpeg] [Deurbel Camera] Starting video stream: 1280 x 720, 30 fps, 299 kbps
[9/7/2020, 4:09:28 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] Stream command: /usr/local/lib/node_modules/homebridge-camera-ffmpeg/node_modules/ffmpeg-for-homebridge/ffmpeg -hide_banner -protocol_whitelist pipe,udp,rtp,file,crypto -f sdp -c:a libfdk_aac -i pipe: -probesize 32 -analyzeduration 32 -c:a pcm_mulaw -ab 128k -ac 1 -ar 16000 -f wav -chunked_post 0 -content_type Audio/MPEG2 http://----snip---:[email protected]/cgi-bin/audio.cgi?action=postAudio&channel=1&httptype=singlepart -loglevel level+verbose
[9/7/2020, 4:09:28 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] [sdp @ 0x555835bdbbc0] [verbose] setting jitter buffer size to 500
[9/7/2020, 4:09:28 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way]
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] [info] Input #0, sdp, from 'pipe:':
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] [info] Metadata:
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] [info] title : Talk
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] [info] Duration: N/A, start:
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] 0.000000, bitrate: N/A
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] [info] Stream #0:0: Audio: aac, 16000 Hz, mono, s16
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] [tcp @ 0x555835c6a380] [verbose] Starting connection attempt to 10.0.102.222 port 80
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way]
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] [tcp @ 0x555835c6a380] [verbose] Successfully connected to 10.0.102.222 port 80
[9/7/2020, 4:09:34 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way]
[9/7/2020, 4:09:40 PM] [Camera-ffmpeg] [Deurbel Camera] Stopped video stream.
[9/7/2020, 4:09:40 PM] [Camera-ffmpeg] [Deurbel Camera] [Two-way] FFmpeg exited with code: null and signal: SIGKILL (Expected)

config.json
"returnAudioTarget": "-probesize 32 -analyzeduration 32 -c:a pcm_mulaw -ab 128k -ac 1 -ar 16000 -f wav -chunked_post 0 -content_type Audio/MPEG2 http://----snip---:[email protected]/cgi-bin/audio.cgi?action=postAudio&channel=1&httptype=singlepart",

So far so good: the camera doesn't crash if I open the stream and enable the mic, but I don't hear anything at the doorbell ;-) So I have something wrong in my config. Can someone help me with this? I played with some MIME types, but still nothing.
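To make the relationship between the config value and the logged "Stream command" easier to see, here is a hedged reconstruction of how the argument list appears to be assembled: a fixed SDP input section, then the `returnAudioTarget` string split on whitespace, then the log-level flags. The function name `buildReturnAudioArgs` is hypothetical, and tying `-loglevel level+verbose` to a debug switch is an assumption based only on the log output in this thread, not on the plugin source.

```typescript
/**
 * Rebuild the two-way ffmpeg argument list seen in the log above.
 * Assumption: the plugin appends returnAudioTarget verbatim after the input options.
 */
function buildReturnAudioArgs(returnAudioTarget: string, debug: boolean): string[] {
  // Input side: read an SDP description from stdin and decode AAC with libfdk_aac.
  const input = [
    "-hide_banner",
    "-protocol_whitelist", "pipe,udp,rtp,file,crypto",
    "-f", "sdp",
    "-c:a", "libfdk_aac",
    "-i", "pipe:",
  ];
  // Output side: whatever the user configured, split into separate arguments.
  const target = returnAudioTarget.split(/\s+/).filter((part) => part.length > 0);
  const logging = debug ? ["-loglevel", "level+verbose"] : [];
  return input.concat(target, logging);
}

// Usage with a simple file target for testing.
const fileArgs = buildReturnAudioArgs(" testoutput.mp3", true);
```

Seen this way, everything in `returnAudioTarget` is ordinary ffmpeg output options, so a target can first be tested by running ffmpeg by hand against a local audio file.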

If I use the app "Talend API Tester" and post an audio file to it, the doorbell played some weird sound.

[Image: Screenshot 2020-09-07 at 16 12 49]

@NaterDawg

I spend a chunk of today doing some digging into this topic, and it seems like there are two main methods of sending audio back to a camera: VAPIX's HTTP POST-based method, and ONVIF's RTSP audio backchannel.

Both methods look possible to support using FFmpeg, so I plan to drop the extension plugin idea (at least for now) and target both of those to begin with. Just looking at standards, VAPIX looks easier, however based on what I've read today, RTSP backchannel looks to be possible in most if not all cases without technically doing the ONVIF negotiation. I believe I should be able to implement actual ONVIF support if needed though, but that wouldn't be part of the initial two-way audio version.

I'm trying to track down a cheap camera that supports one or both of these methods to use to develop against. It looks like the cheapest reasonable camera will likely be a second-hand AXIS camera which supports VAPIX, I just need to track one down on eBay or similar. Unfortunately, ONVIF profile T cameras (the profile that supports two-way audio) seem to be much more expensive and harder to identify, so I doubt I'll end up getting my hands on one,

I'm not sure if you have found a cheap camera, but I picked one of these up and have been using it with this plug-in.
https://www.amazon.com/Security-Surveillance-Waterproof-Detection-Deterrent/dp/B07Z3BZF35/ref=asc_df_B07Z3BZF35/?tag=bingshoppinga-20&linkCode=df0&hvadid=&hvpos=&hvnetw=o&hvrand=&hvpone=&hvptwo=&hvqmt=e&hvdev=c&hvdvcmdl=&hvlocint=&hvlocphy=&hvtargid=pla-4583520389418278&psc=1

It supports ONVIF and RTSP, and has two-way audio through a third-party app.

@SEVSM

SEVSM commented Apr 23, 2021

Hello. Could somebody please tell me what I should write in the config.json "returnAudioTarget" field, given that audio works in Talend API Tester with this config?
[Image: Screenshot 23-04-2021 104140]

@longzheng
Contributor

I just want to sanity check if anyone has return audio working with iOS 16?

I've been testing trying to output the return audio just to a test file but I'm seeing the two-way FFmpeg process output nothing because the input stream seems to be empty. I can't quite figure out if I'm doing something wrong or maybe iOS 16 broke/changed something.

                {
                    "name": "Intercom test 2",
                    "doorbell": true,
                    "unbridge": true,
                    "videoConfig": {
                        "source": "-i rtsp://192.168.1.3/h264_stream",
                        "audio": true,
                        "debugReturn": true,
                        "returnAudioTarget": " testoutput.mp3"
                    }
                }
[20/09/2022, 9:29:02 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [sdp @ 0000023463276a00] [verbose] setting jitter buffer size to 500
[20/09/2022, 9:29:13 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [warning] Guessed Channel Layout for Input Stream #0.0 : mono
[20/09/2022, 9:29:13 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] Input #0, sdp, from 'pipe:':
[20/09/2022, 9:29:13 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]   Metadata:
[20/09/2022, 9:29:13 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     title           : Talk
[20/09/2022, 9:29:13 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]   Duration: N/A, bitrate: N/A
[20/09/2022, 9:29:13 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     Stream #0:0: Audio: aac, 16000 Hz, mono, s16
[20/09/2022, 9:29:13 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] Stream mapping:
[20/09/2022, 9:29:13 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]   Stream #0:0 -> #0:0 (aac (libfdk_aac) -> mp3 (libmp3lame))
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [error] pipe:: Unknown error
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [graph_0_in_0_0 @ 0000023463848480] [verbose] tb:1/16000 samplefmt:s16 samplerate:16000 chlayout:0x4
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [format_out_0_0 @ 000002346384aa80] [verbose] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [auto_resampler_0 @ 000002346384d140] [verbose] ch:1 chl:mono fmt:s16 r:16000Hz -> ch:1 chl:mono fmt:s16p r:16000Hz
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] Output #0, mp3, to 'testoutput.mp3':
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]   Metadata:
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     TIT2            : Talk
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     TSSE            : Lavf58.38.101
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     Stream #0:0: Audio: mp3 (libmp3lame), 16000 Hz, mono, s16p, delay 1105
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     Metadata:
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]       encoder         : Lavc58.70.100 libmp3lame
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x    
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [verbose] No more output streams to write to, finishing.
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x    
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [verbose] Input file #0 (pipe:):
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [verbose]   Input stream #0:0 (audio): 0 packets read (0 bytes); 0 frames decoded (0 samples); 
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [verbose]   Total: 0 packets (0 bytes) demuxed
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [verbose] Output file #0 (testoutput.mp3):
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [verbose]   Output stream #0:0 (audio): 0 frames encoded (0 samples); 0 packets muxed (0 bytes); 
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [verbose]   Total: 0 packets (0 bytes) muxed
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [warning] Output file is empty, nothing was encoded (check -ss / -t / -frames parameters if used)
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [AVIOContext @ 000002346327b280] [verbose] Statistics: 0 seeks, 1 writeouts
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [AVIOContext @ 0000023463277500] [verbose] Statistics: 354 bytes read, 0 seeks
[20/09/2022, 9:29:23 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] FFmpeg exited with code: 0 and signal: null (Error)

@longzheng
Contributor

longzheng commented Sep 20, 2022

Ah I figured it out, it's to do with the Windows firewall. (My Homebridge is running on Windows 10)

If I disable the private network firewall, then I see audio packets.

[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] Input #0, sdp, from 'pipe:':
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]   Metadata:
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     title           : Talk
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]   Duration: N/A, start: 0.000000, bitrate: N/A
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     Stream #0:0: Audio: aac, 16000 Hz, mono, s16
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] Stream mapping:
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]   Stream #0:0 -> #0:0 (aac (libfdk_aac) -> mp3 (libmp3lame))
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [graph_0_in_0_0 @ 000002d22d65b780] [verbose] tb:1/16000 samplefmt:s16 samplerate:16000 chlayout:0x4
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [format_out_0_0 @ 000002d22d65ecc0] [verbose] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [auto_resampler_0 @ 000002d22d65eec0] [verbose] ch:1 chl:mono fmt:s16 r:16000Hz -> ch:1 chl:mono fmt:s16p r:16000Hz
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] Output #0, mp3, to 'testoutput.mp3':
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]   Metadata:
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     TIT2            : Talk
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     TSSE            : Lavf58.38.101
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     Stream #0:0: Audio: mp3 (libmp3lame), 16000 Hz, mono, s16p, delay 1105
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]     Metadata:
[20/09/2022, 9:34:35 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info]       encoder         : Lavc58.70.100 libmp3lame
[20/09/2022, 9:34:36 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=       2kB time=00:00:00.37 bitrate=  35.1kbits/s speed=0.721x    
[20/09/2022, 9:34:36 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=       3kB time=00:00:00.91 bitrate=  29.5kbits/s speed=0.896x    
[20/09/2022, 9:34:37 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=       5kB time=00:00:01.42 bitrate=  28.1kbits/s speed=0.922x    
[20/09/2022, 9:34:37 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=       6kB time=00:00:01.92 bitrate=  27.5kbits/s speed=0.928x    
[20/09/2022, 9:34:38 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=       8kB time=00:00:02.43 bitrate=  27.0kbits/s speed=0.944x    
[20/09/2022, 9:34:38 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=      10kB time=00:00:02.94 bitrate=  26.8kbits/s speed=0.949x    
[20/09/2022, 9:34:39 pm] [Camera FFmpeg] [Intercom test 2] [Two-way] [info] size=      11kB time=00:00:03.44 bitrate=  26.6kbits/s speed=0.958x   

I'm trying to figure out why the Windows Firewall is not allowing the audio return port automatically.

I noticed that the videoReturnPort gets used to bind to a socket https://github.com/Sunoo/homebridge-camera-ffmpeg/blob/master/src/streamingDelegate.ts#L430

Do we need to do something similar for the audioReturnPort, @Sunoo?
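To make the question concrete, here is a minimal sketch of what "binding a socket" on a return port could look like in Node. It is an illustration only, not code from homebridge-camera-ffmpeg: `startUdpRelay` and both port parameters are hypothetical names, and the sketch assumes FFmpeg would be told to listen on a separate loopback port. Whether receiving in the Node process changes how Windows Firewall treats the traffic is unverified.

```typescript
import { createSocket, Socket } from "node:dgram";

// Hypothetical helper, not part of homebridge-camera-ffmpeg.
// Binds a UDP socket on `listenPort` and forwards every datagram it
// receives to 127.0.0.1:`forwardPort` (assumed to be where FFmpeg listens).
// Resolves with the socket once it is listening.
export function startUdpRelay(listenPort: number, forwardPort: number): Promise<Socket> {
  return new Promise((resolve, reject) => {
    const socket = createSocket("udp4");

    // A bind failure (for example, port already in use) rejects the promise.
    // Later errors are only logged, because the promise is already settled.
    socket.on("error", (err) => {
      console.error("UDP relay error:", err.message);
      reject(err);
    });

    // Forward each incoming datagram unchanged to the loopback port.
    socket.on("message", (msg) => {
      socket.send(msg, forwardPort, "127.0.0.1");
    });

    socket.bind(listenPort, () => resolve(socket));
  });
}
```

The caller is responsible for calling `close()` on the returned socket when the stream ends.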

@MrSco

MrSco commented Apr 13, 2023

Did you ever get playable audio in your testoutput.mp3? I tried this test, but the resulting ~/.homebridge/test.mp3 is 0 KB. I think it may be a permissions issue?

[13/04/2023, 10:27:40] [Camera ffmpeg] [FrontDoor] Getting the first frames took 6.239 seconds.
[13/04/2023, 10:27:43] [Camera ffmpeg] [FrontDoor] [Two-way] [warning] Guessed Channel Layout for Input Stream #0.0 : mono
[13/04/2023, 10:27:43] [Camera ffmpeg] [FrontDoor] [Two-way] [info] Input #0, sdp, from 'pipe:':
[13/04/2023, 10:27:43] [Camera ffmpeg] [FrontDoor] [Two-way] [info]   Metadata:
[13/04/2023, 10:27:43] [Camera ffmpeg] [FrontDoor] [Two-way] [info]     title           : Talk
[13/04/2023, 10:27:43] [Camera ffmpeg] [FrontDoor] [Two-way] [info]   Duration: N/A, bitrate: N/A
[13/04/2023, 10:27:43] [Camera ffmpeg] [FrontDoor] [Two-way] [info]     Stream #0:0: Audio: aac, 16000 Hz, mono, s16
[13/04/2023, 10:27:43] [Camera ffmpeg] [FrontDoor] [Two-way] [info] Stream mapping:
[13/04/2023, 10:27:43] [Camera ffmpeg] [FrontDoor] [Two-way] [info]   Stream #0:0 -> #0:0 (aac (libfdk_aac) -> mp3 (libmp3lame))
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [graph_0_in_0_0 @ 0x2745fb0] [verbose] tb:1/16000 samplefmt:s16 samplerate:16000 chlayout:0x4
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [format_out_0_0 @ 0x2746720] [verbose] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [auto_resampler_0 @ 0x27479d0] [verbose] ch:1 chl:mono fmt:s16 r:16000Hz -> ch:1 chl:mono fmt:s16p r:16000Hz
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [info] Output #0, mp3, to 'test.mp3':
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [info]   Metadata:
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [info]     TIT2            : Talk
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [info]     TSSE            : Lavf58.45.100
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [info]     Stream #0:0: Audio: mp3 (libmp3lame), 16000 Hz, mono, s16p, delay 1105
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [info]     Metadata:
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [info]       encoder         : Lavc58.91.100 libmp3lame
[13/04/2023, 10:27:53] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x    
[13/04/2023, 10:27:54] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=       2kB time=00:00:00.37 bitrate=  35.1kbits/s speed=0.0364x    
[13/04/2023, 10:27:54] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=       3kB time=00:00:00.88 bitrate=  29.7kbits/s speed=0.0815x    
[13/04/2023, 10:27:55] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=       5kB time=00:00:01.42 bitrate=  28.1kbits/s speed=0.125x    
[13/04/2023, 10:27:55] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=       7kB time=00:00:01.96 bitrate=  27.4kbits/s speed=0.165x    
[13/04/2023, 10:27:56] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=       8kB time=00:00:02.46 bitrate=  27.0kbits/s speed=0.199x    
[13/04/2023, 10:27:56] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      10kB time=00:00:02.97 bitrate=  26.8kbits/s speed=0.231x    
[13/04/2023, 10:27:57] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      11kB time=00:00:03.48 bitrate=  26.6kbits/s speed=0.26x    
[13/04/2023, 10:27:57] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      13kB time=00:00:03.98 bitrate=  26.5kbits/s speed=0.286x    
[13/04/2023, 10:27:58] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      15kB time=00:00:04.52 bitrate=  26.4kbits/s speed=0.313x    
[13/04/2023, 10:27:58] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      16kB time=00:00:05.03 bitrate=  26.3kbits/s speed=0.337x    
[13/04/2023, 10:27:59] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      18kB time=00:00:05.53 bitrate=  26.2kbits/s speed=0.358x    
[13/04/2023, 10:27:59] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      19kB time=00:00:06.07 bitrate=  26.2kbits/s speed=0.38x    
[13/04/2023, 10:28:07] [CMD Accessory] PS5 is off.
[13/04/2023, 10:28:25] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      21kB time=00:00:06.61 bitrate=  26.1kbits/s speed=0.158x    
[13/04/2023, 10:28:26] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      23kB time=00:00:32.53 bitrate=   5.7kbits/s speed=0.766x    
[13/04/2023, 10:28:26] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      24kB time=00:00:33.04 bitrate=   6.0kbits/s speed=0.769x    
[13/04/2023, 10:28:27] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      26kB time=00:00:33.58 bitrate=   6.3kbits/s speed=0.772x    
[13/04/2023, 10:28:27] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      28kB time=00:00:34.08 bitrate=   6.6kbits/s speed=0.775x    
[13/04/2023, 10:28:28] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      29kB time=00:00:34.59 bitrate=   6.9kbits/s speed=0.777x    
[13/04/2023, 10:28:28] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      31kB time=00:00:35.10 bitrate=   7.2kbits/s speed=0.78x    
[13/04/2023, 10:28:29] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      32kB time=00:00:35.60 bitrate=   7.4kbits/s speed=0.782x    
[13/04/2023, 10:28:29] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      34kB time=00:00:36.11 bitrate=   7.7kbits/s speed=0.784x    
[13/04/2023, 10:28:30] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      35kB time=00:00:36.61 bitrate=   7.9kbits/s speed=0.787x    
[13/04/2023, 10:28:31] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      37kB time=00:00:37.15 bitrate=   8.2kbits/s speed=0.789x    
[13/04/2023, 10:28:31] [Camera ffmpeg] [FrontDoor] [Two-way] [info] size=      39kB time=00:00:37.69 bitrate=   8.4kbits/s speed=0.791x    
[13/04/2023, 10:28:44] [Camera ffmpeg] [FrontDoor] Stopped video stream.
[13/04/2023, 10:28:46] [Camera ffmpeg] [FrontDoor] [Two-way] FFmpeg exited with code: null and signal: SIGKILL (Forced)
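For anyone reading a log like the one above, a small parser can make the progress lines easier to compare. The sketch below is a hypothetical helper (`parseProgress` and `findTimeJumps` are made-up names, and the 2-second threshold is an arbitrary assumption). It assumes the `size=… time=… bitrate=… speed=…` layout shown in this log. It extracts the reported output size and media time, and flags places where the media time jumps between consecutive lines, as it does here between `00:00:06.61` and `00:00:32.53`.

```typescript
// Hypothetical helpers for reading FFmpeg progress lines such as
//   size=       2kB time=00:00:00.37 bitrate=  35.1kbits/s speed=0.721x
export interface Progress {
  sizeKB: number;  // output size reported by FFmpeg, in kB
  seconds: number; // media time converted to seconds
  speed: number;   // processing speed relative to real time
}

// Returns the parsed values, or null if the line is not a progress line.
export function parseProgress(line: string): Progress | null {
  const m =
    /size=\s*(\d+)kB\s+time=(\d+):(\d+):(\d+(?:\.\d+)?)\s+bitrate=\s*\S+\s+speed=\s*([\d.]+)x/.exec(line);
  if (!m) {
    return null;
  }
  const seconds = Number(m[2]) * 3600 + Number(m[3]) * 60 + Number(m[4]);
  return { sizeKB: Number(m[1]), seconds, speed: Number(m[5]) };
}

// Lists every pair of consecutive progress lines whose media time
// differs by more than `maxStepSeconds` (assumed threshold).
export function findTimeJumps(
  lines: string[],
  maxStepSeconds = 2,
): Array<{ from: number; to: number }> {
  const jumps: Array<{ from: number; to: number }> = [];
  let previous: number | null = null;
  for (const line of lines) {
    const progress = parseProgress(line);
    if (!progress) {
      continue; // skip non-progress lines
    }
    if (previous !== null && progress.seconds - previous > maxStepSeconds) {
      jumps.push({ from: previous, to: progress.seconds });
    }
    previous = progress.seconds;
  }
  return jumps;
}
```

Note that the last progress line in this log reports `size=      39kB`, so comparing that figure with the size of the file on disk may help narrow down where the output actually went.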

@longzheng
Contributor

Sorry, it's been a while since I tested this. I don't quite remember whether I listened to the MP3 itself, but eventually the audio stream was heard on the intercom.

I've since switched to Scrypted for my HomeKit cameras; it has a very nice native ONVIF two-way audio integration that works perfectly with my intercom.

@MrSco

MrSco commented Apr 14, 2023

Ah, is ONVIF compatibility required? I'm trying this with an old Foscam 8910 IP camera. Currently the only way to send audio to the camera's speaker is the ancient IE ActiveX plugin on the camera's web page, or a couple of iOS/Android apps (Owlr and Foscam Pro). I was hoping to break that functionality out of those apps (they are sending audio to the camera somehow, but how?!) so I could pipe whatever audio I'd like from any Homebridge or Shortcuts automation I may have and give this old camera/intercom a new lease on life, but I'm hitting a dead end.

@longzheng
Contributor

ONVIF is not required, but it's the easiest configuration since it is widely supported.

You can definitely write a custom plugin for two-way audio compatibility, but I'm not sure what protocol the camera uses. I'm guessing you'll have to reverse engineer how those apps do it.

@0x5e

0x5e commented Aug 1, 2023

If I press the "talk" button in the Home app very late, no stream is forwarded to "audioReturnPort", and the ffmpeg process exits after ~20s. I couldn't find any setting to extend the timeout while waiting for the RTP packets.
As far as I know, we are not able to get a "talk button pressed" event. Does anyone have ideas about how to resolve this issue?
