Skip to content
This repository has been archived by the owner on Sep 16, 2023. It is now read-only.

Closing and opening files leads to Iina to crash and not respond #17

Open
1 task done
ViRo3 opened this issue Jan 8, 2022 · 61 comments
Open
1 task done

Closing and opening files leads to Iina to crash and not respond #17

ViRo3 opened this issue Jan 8, 2022 · 61 comments
Labels
bug Something isn't working upstream MPV, ffmpeg, etc.

Comments

@ViRo3
Copy link

ViRo3 commented Jan 8, 2022

When files are opened and closed successively, Iina freezes on opening a file and then doesn't respond.

System and IINA version:

  • macOS 12.1
  • IINA v1.2.0-21

Expected behavior:
Doesn't crash/freeze

Actual behavior:

Iina crashes and freezes on quickly opening and closing files.

Crash report:

Sample of IINA.txt
Spindump.txt
Spindump2.txt

Steps to reproduce:

Keep opening and closing a file (doesn't require much but 3 cycles for me)

  • MPV does not have this problem.

How often does this happen?
Always

PS : File responsible hxxps://nyaa<.>si/view/1390968

@low-batt
Copy link
Collaborator

low-batt commented Jan 8, 2022

Investigating.

@low-batt
Copy link
Collaborator

IINA v1.2.0-21 is from 6 days ago and is missing the fix for my fix from 4 days ago. This report got me to look at the fix again and I've just pushed another commit that tweaks it some more to hopefully make really sure the hang is fixed.

@CarterLi Please create a new build when you get a chance.

@CarterLi
Copy link
Owner

CarterLi commented Jan 10, 2022

IINA v1.2.0-21 is from 6 days ago and is missing the fix for my fix from 4 days ago. This report got me to look at the fix again and I've just pushed another commit that tweaks it some more to hopefully make really sure the hang is fixed.

@CarterLi Please create a new build when you get a chance.

Done

I'd like to use Github Actions building M1 versions when supported. Ref: actions/runner-images#2187

@ViRo3
Copy link
Author

ViRo3 commented Jan 10, 2022

Sample of IINA.txt

Spindump.txt
@low-batt still crashes with the file linked above (would rec downloading it if possible)

@low-batt
Copy link
Collaborator

@ViRo3, @CarterLi New plan. Stablize IINA-Plus:

I've just pushed a commit that backs out the changes made to fix iina#3601.
Need a new build.

The issue iina#3601 is about wasting a minor amount of energy. Desirable to eventually fix, but not critical to fix it right now with the "fix" causing so much trouble. Unfortunately the timing on my Mac is such that so far I've not been able to reproduce these hangs. So I've been taking a "stab in the dark" as to what is going wrong and how to fix it. Clearly that is not working. More important to focus on other issues. I alway hate to give up on fixing an issue but that seems like the right decision on this one for now.

I did leave in one change made as a part of this fix which was to add guarding that videoPosition is available in PlaybackInfo.constrainVideoPosition:

guard let duration = videoDuration, let position = videoPosition else { return }

That seems like a safe useful change. The change that was triggering the problem is the attempt to stop and start the timer that synchronizes the OSC based on whether the OSC is visible or not. Backing out the change puts IINA back to updating the OSC even though it is not visible to the user. This wastes some energy, but the vast majority of CPU time consumed is in the code that displays the video.

@low-batt
Copy link
Collaborator

By the way, there is an IINA hang out there lurking, the issue "Main thread hang in mpv_render_context_create during startup". So if IINA hangs during startup don't assume it was one of these recent issues. Always need a sample of IINA to have a chance of knowing what the issue is. I hit that hang once with the released version of IINA. I also hit it once with a development version of IINA using mpv 0.34.0. I made a concerted effort to reproduce it, running a test designed to trigger it thousands of times. No luck. Whatever that problem is, it is rare.

@ViRo3
Copy link
Author

ViRo3 commented Jan 10, 2022

Yes, its a very specific file that completely freezes Iina. (Have link it above) so as such it isn't really a major issue

@CarterLi
Copy link
Owner

New binary released. Please test it

@ViRo3
Copy link
Author

ViRo3 commented Jan 12, 2022

CarterLi added a commit that referenced this issue Jan 12, 2022
@CarterLi
Copy link
Owner

So It's not caused by fixing iina#3601.

Please test this version: https://github.com/iina-plus/iina/releases/tag/v1.2.0-25

@low-batt
Copy link
Collaborator

I'm currently tired and may not be thinking straight...

From a quick look I am stunned and confused. The process sample shows the same main thread lockup with mpv_get_property waiting for a lock when called by PlayerCore.syncUI. I was expecting that was somehow due to my "fix" that changed how that method was used. I quickly checked MainWindowController on GitHub and it looked like I correctly backed out that fix.

This movie plays fine with the released verion of IINA? The hang only occurs with IINA Plus?

There is one other change in IINA Plus that has to do with PlayerCore.syncUI. The timer that runs this method to keep the UI in sync is now paused and restarted as you pause and restart the video. IINA Plus also has the changes related to the OpenGL context. I was not thinking this was related to that change as it is obvious in the spin dump when a thread is stuck waiting to lock the context. Of course the big change is that IINA Plus is using the latest mpv release, 0.34.1.

When exactly is this hang triggered? When it hangs what is the video doing? Just freezes?

I will be taking a closer at the other threads in the spin dump to see if I can spot something that looks wrong. Might have to pull some of these other fixes and see what effect that has on the problem. I guess we first see if the new build makes any sort of difference.

@CarterLi
Copy link
Owner

I encountered freezing after using DispatchQueue.main.async once and I did add a DispatchQueue.main.async block recently.

I just removed it. Let's see what's happening next

@low-batt
Copy link
Collaborator

@ViRo3 Are you able to play the video using mpv without problems?

I'm thinking we should take a look at the IINA and mpv log files. Usually log files don't help that much with hangs. But if there is trouble with playback we might find some clues in the mpv log.

The sample shows the main thread completely locked up:

    +                             2257 __NSFireTimer  (in Foundation) + 104  [0x18a23e29c]
    +                               2257 @objc PlayerCore.syncUITime()  (in IINA) + 28  [0x1004595d4]
    +                                 2257 PlayerCore.syncUITime()  (in IINA) + 56  [0x1004594e4]
    +                                   2257 PlayerCore.syncUI(_:)  (in IINA) + 584  [0x10045982c]
    +                                     2257 mpv_get_property  (in libmpv.1.dylib) + 92  [0x103664bec]
    +                                       2257 mp_dispatch_lock  (in libmpv.1.dylib) + 188  [0x1036433cc]
    +                                         2257 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1228  [0x18925d808]
    +                                           2257 __psynch_cvwait  (in libsystem_kernel.dylib) + 8  [0x1892250c0]

As I had just added a change involving PlayerCore.syncUI I assumed that was triggering the problem. Since backing that out didn't work I've been digging into mpv code.

This is the mpv method that is being called by IINA code:

int mpv_get_property(mpv_handle *ctx, const char *name, mpv_format format,
                     void *data)
{
    if (!ctx->mpctx->initialized)
        return MPV_ERROR_UNINITIALIZED;
    if (!data)
        return MPV_ERROR_INVALID_PARAMETER;
    if (!get_mp_type_get(format))
        return MPV_ERROR_PROPERTY_FORMAT;

    struct getproperty_request req = {
        .mpctx = ctx->mpctx,
        .name = name,
        .format = format,
        .data = data,
    };
    run_locked(ctx, getproperty_fn, &req);
    return req.status;
}

That method is stopped calling run_locked:

// Run a command in the playback thread.
static void run_locked(mpv_handle *ctx, void (*fn)(void *fn_data), void *fn_data)
{
    mp_dispatch_lock(ctx->mpctx->dispatch);
    fn(fn_data);
    mp_dispatch_unlock(ctx->mpctx->dispatch);
}

That method is stopped calling mp_dispatch_lock:

// Grant exclusive access to the target thread's state. While this is active,
// no other thread can return from mp_dispatch_lock() (i.e. it behaves like
// a pthread mutex), and no other thread can get dispatch items completed.
// Other threads can still queue asynchronous dispatch items without waiting,
// and the mutex behavior applies to this function and dispatch callbacks only.
// The lock is non-recursive, and dispatch callback functions can be thought of
// already holding the dispatch lock.
void mp_dispatch_lock(struct mp_dispatch_queue *queue)
{
    pthread_mutex_lock(&queue->lock);

So the main thread is blocked trying to run a get property request on the playback thread. Maybe this particular video is causing trouble for the playback thread.

@low-batt
Copy link
Collaborator

@CarterLi I submitted a PR for IINA that has a simple workaround for iina#3211. Was not sure if we are currently holding off adding changes to IINA+ until we sort this issue? Or do you want me to add that fix?

By the way Daniel on Telegram said nice things about IINA+ HDR support.

@ViRo3 I've been hoping you would have time to see if you can reproduce the same problem in the mpv player. If mpv plays it without issues we focus more on how we are configuring mpv and interacting with it. If mpv has trouble that would explain our trouble trying to track down the problem.

@CarterLi
Copy link
Owner

CarterLi commented Jan 17, 2022

@CarterLi I submitted a PR for IINA that has a simple workaround for iina#3211. Was not sure if we are currently holding off adding changes to IINA+ until we sort this issue? Or do you want me to add that fix?

Like I said before, this repo is built for experimenting latest features / bug fixes that may not be very stable. You may just push your changes. We can revert the change if it causes serious issues.

@low-batt
Copy link
Collaborator

I was nervous due to the issues with my previous fix. This fix is small and should be safe. Just finished pushing it.

@CarterLi
Copy link
Owner

I was nervous due to the issues with my previous fix. This fix is small and should be safe. Just finished pushing it.

I'm very sorry that I removed your commit with a force-push by accident. Can you push it again?

@low-batt
Copy link
Collaborator

Not a problem at all. I've pushed the fix to VideoView again.

@CarterLi
Copy link
Owner

Can you build iina-plus yourself now? You may make a new release if necessary @low-batt

@low-batt
Copy link
Collaborator

It is painful for me to build the dependencies part as I operate under a limited account that does not have the access needed for brew, so I have to build that part under an admin account. I just tried and the build worked up to the very end which failed because it could not find create-dmg. I see some shell scripts with that name on GitHub. Where did you get it?

@CarterLi
Copy link
Owner

CarterLi added a commit that referenced this issue Jan 19, 2022
CarterLi added a commit that referenced this issue Jan 21, 2022
low-batt pushed a commit that referenced this issue Jan 21, 2022
@John-He-928
Copy link

The problem is still reproducible right now.

Tested on version 1.2.0-22, 1.2.0-28 and 1.2.0-29. It's okay on IINA 1.2.0.

@low-batt
Copy link
Collaborator

Unfortunately freezes can be very senstive to the specific timing of hardware and configuration and the like. So far I've not been able to reproduce this hang which makes it much harder to debug.

There is a ton of differences between IINA 1.2.0 and IINA+ builds. One major difference is that IINA+ uses the latest mpv. There are some known hangs in mpv. We are hoping someone can test and see if mpv is working fine on the same files or not just to confirm that it is something IINA is doing rather than something in the new mpv.

Hangs are hard to debug. With a hang we always need a process sample. To take a process sample bring up the Activity Monitor while IINA is frozen. Select IINA in the Activity Monitor. Then under the View menu select Sample Process. That should create a report like you see above. From that we should be able to tell if you are hitting the same freeze or not. It might or might not provide an additional clue to the problem.

@John-He-928
Copy link

John-He-928 commented Jan 23, 2022

I made more tests. It seems to be 100% reproducible for my environment, which is MBP M1 Pro.

Testing clip is LG Earth Dolby Vision Demo, but it actually freezes on many of my clips, if not all.

I tested mpv 0.34.1 prebuilt from here. It successfully quits without hangs or freezes. It's an x86_64 build. Sorry but I'm not able to find an arm64 binary distribution.

I made a process sample and pasted the text to this gist.

iina.log
mpv.log

Config file used:

profile=gpu-hq
icc-profile-auto=yes
gamma-auto=yes
sub-auto=fuzzy
sub-codepage=utf8:gb18030
osd-font='PingFangSC-Light'
sub-font='PingFangSC-Light'
autofit-larger=90%x80%
hwdec=auto

Disable loading mpv config file doesn't help.

@low-batt
Copy link
Collaborator

Understood. As you can see from that mpv manual clip the mpv project has encountered a lot of problems with hardware drivers and takes a dim view of them. We seemed to have stumbled across one of the many issues.

@ViRo3
Copy link
Author

ViRo3 commented Jan 30, 2022

@John-He-928 @ottob @ViRo3 The following is my attempt to give you some insight into the big picture of what you have been helping with without too much programmer speak. Nothing here you need to read. This is post is just in case you are curious about software development. Here is an analogy of what has been going on with IINA+:

Imagine a tiny town with no stop signs or traffic lights. For years drivers have driven right through intersections without looking for other cars. There have never been any accidents. Then one day there is a horrible collision in an intersection. The mayor asks how this could have happened. The police chief reports that it used to be only two residents owned cars, but recently 8 more residents bought cars. The mayor recognizes they had better add traffic controls, so the order goes out for stop signs to be installed.

The next day there is a collision. The mayor is shocked and asks the police chief what happened. The police chief reports a car stopped for one of the new stop signs and the car behind drove right into the stopped car. Not all drivers had gotten the message about stop signs being installed. The mayor orders public service announcements so all drivers know about the new stop signs.

The next day there is a collision. The mayor is shocked and confused and asks the police chief how that could possibly have happened now that proper traffic controls are in place and all drivers know about them. The police chief reports that again a car rear ended another car that had stopped at one of the new stop signs. The driver that failed to stop had been driving around for years without any problems even though the brakes on the car were failing. The addition of the stop signs turned the poorly functioning brakes into a serious problem. The mayor now has to order that all cars undergo inspections.

How this analogy relates to computing has to do with how computer applications perform multiple tasks at once. To do this they assign an individual task to a Thread. Usually threads need to communicate with each other. For example one thread might write a list that the other thread needs to read. Tying this back to the analogy, threads are the cars and the data they share are the intersections. A collision occurs when one thread tries to read that list at the same moment another thread is writing to that list. In programming there are various kinds of "traffic controls" that must be used to prevent such thread conflicts. As in the analogy an application can have such programming errors and work fine, with only an occasional "glitch" when the timing is just right and the threads collide. What happens when threads collide is usually considered "undefined". Maybe works. Maybe malfunctions. Maybe corrupts memory. Maybe crashes. It is the responsibility of the programmer to properly coordinate thread interaction and insure this can't possibly happen.

To understand how threads run we need to tweak the analogy a little. Imagine the town always had a lot of cars, but none of the cars had engines. One person owned all the engines and would rent them to a driver for a short time. That person used to only own two engines, so only two cars were ever actually moving at once. That person has now has 10 engines, 8 of them high performance V8s and two fuel saving inline 4s. More cars moving at once has increased the chances of collisions if proper traffic controls are not in place.

The "engines" in a computer are the CPU cores and it is macOS that "rents" a core to a thread to allow it to run. How many threads are actually running at the same time depends up the capabilities of the hardware and how macOS decides to make use the hardware. It is my own contention that with the advent of the M1 and M1 Pro / M1 Max processors more users will have Macs that can run many threads at once, potentially making festering defects related to thread coordination more likely to trigger problems. Software engineers should always have a zero tolerance for such problems. With the rollout of these processors it has become critical to fix such defects.

Since such thread related defects are dependent upon specific hardware characteristics and timing of the running of threads, they can manifest as rare "glitches". Users may not even report them if they rarely experience problems and can't reproduce the problem. This behavior also causes trouble for software engineers. Programmers would prefer to be able to deterministically reproduce a problem so that they can confirm a proposed fix works.

Reviewing code and issues shows that IINA has some problems with thread coordination that need to be fixed. The issue behind the problem reported in IINA+ issue 11 is the root cause of 7 IINA issues. I was never able to reproduce those crashes. I had to figure out the problem from the crash reports and reading code. There are other threading issues that I have been able to reproduce and fix.

So now you know who the mayor is in the analogy. I am the one that has been trying to fix these issues and triggering additional trouble with my fixes. Sorry about that! I am still digging through the code for such problems. There may be some more that need to be fixed.

There is something else you should have noticed from this post. YOU have been doing software development. The things we have been doing together to gather evidence, test out configuration changes, etc. is how developers investigate and get to the root of a non-trivial problem. You have been contributing to IINA. Thank you yet again for your efforts.

Fingers crossed further fixes won't cause such problems. But this is software, so no promises. There are other hangs lurking. I definitely suspect a hang in the mpv logger. I will be trying to reproduce that and get to the bottom of it.

I could say a lot more, but this post is already too long too read.

I hate parallelism.
Great explanation! Really makes it simple to understand, appreciate it!
The more I read it, the more am amazed at how simple you made it! Kudos!

@ViRo3
Copy link
Author

ViRo3 commented Jan 31, 2022

@low-batt Latest builds seem to Intel and not universal ?

@low-batt
Copy link
Collaborator

low-batt commented Feb 1, 2022

Warning. Dealing with a big winter storm. I'm totally exhausted from snow shoveling. This post might be incoherent...

Thanks for telling me. I missed that. @CarterLi did not educate me on how to post a release. I was able to figure out some of it from GitHub documentation. Looking at the README I don't think he has been building a universal binary? At the end are links to GitHub actions, the CI build for the Intel binary and to the releases section for the Apple Silicon build. I missed that. The build I posted was from the GitHub CI build, seems to be Intel only. Not sure how CarterLi is creating the Apple Silicon build. Maybe building by hand on his machine? I could build one that way. It would only be good for macOS 12.1+ due to the build issue #31.

I need to update my PRs in the main IINA repository with the fixes made to IINA+. Then I need to rebase IINA+ to remove the commits that were adding debug messages to avoid future merge conflicts when the fork is updated from the main. I also need to restore a bug fix that was pulled as we initially thought it might be causing this hang. Once I get that done I'll need to create a new IINA+ release. I'll be sure to post both versions when making that release.

@ViRo3
Copy link
Author

ViRo3 commented Feb 1, 2022

wow. Take it easy and slow, batt.No problem at all with you taking it slow. Really appreciate your efforts into the project and its sustenance along with the amazing responses as you work on them.

@CarterLi
Copy link
Owner

CarterLi commented Feb 2, 2022

@CarterLi did not educate me on how to post a release. I was able to figure out some of it from GitHub documentation. Looking at the README I don't think he has been building a universal binary? At the end are links to GitHub actions, the CI build for the Intel binary and to the releases section for the Apple Silicon build.

Indeed.

I'm investigating how to build universal binaries ( for 3rd party dylibs ). However before that, we must release them independently. For M1 version, we have to build them on our local machine.

@low-batt
Copy link
Collaborator

low-batt commented Feb 2, 2022

I'm currently trying to fix build issue #31. If that goes well I will release a new build and correctly post a M1 version.

@low-batt
Copy link
Collaborator

low-batt commented Feb 5, 2022

A new release has been published containing IINA+ built for Apple Silicon.

@CarterLi CarterLi added the bug Something isn't working label Feb 9, 2022
@CarterLi
Copy link
Owner

CarterLi commented Feb 9, 2022

@ViRo3 Can you please verify if the latest fix work for you?

@ViRo3
Copy link
Author

ViRo3 commented Feb 9, 2022

There are two bugs referenced here . The one related to iina has been; the other is mpv.
thus, issue should be closed imo

@low-batt low-batt added the upstream MPV, ffmpeg, etc. label Feb 10, 2022
@low-batt
Copy link
Collaborator

The hang this issue was opened for has now been reproduced in mpv when hardware decoding is enabled. Using software decoding works. So this is an "upstream" issue for IINA. It might be an "upstream" issue for mpv as well as it appears to be a problem in the hardware drivers. I've added the upstream label.

The second hang was a regression due to my changes to correct shutdown issues. That turned out to be two problems. I did not know how to trigger one of the ways you can quit IINA. Turns out you type "q" in the IINA window. The hang seems to be a problem in AppKit. A fix that corrects these problems has been merged into IINA+.

@CarterLi, do you want leave "upstream" issues open? If not then this one can be closed.

@CarterLi
Copy link
Owner

Lets leave them open for now

CarterLi added a commit that referenced this issue Feb 18, 2022
CarterLi added a commit that referenced this issue Feb 21, 2022
CarterLi added a commit that referenced this issue Feb 28, 2022
CarterLi added a commit that referenced this issue Apr 23, 2022
low-batt pushed a commit that referenced this issue Apr 23, 2022
low-batt pushed a commit that referenced this issue Apr 24, 2022
low-batt pushed a commit that referenced this issue Apr 27, 2022
CarterLi added a commit that referenced this issue Apr 28, 2022
low-batt pushed a commit that referenced this issue May 1, 2022
CarterLi added a commit that referenced this issue May 5, 2022
CarterLi added a commit that referenced this issue May 9, 2022
CarterLi added a commit that referenced this issue May 17, 2022
low-batt pushed a commit that referenced this issue May 28, 2022
CarterLi added a commit that referenced this issue Jun 8, 2022
CarterLi added a commit that referenced this issue Jun 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working upstream MPV, ffmpeg, etc.
Projects
None yet
Development

No branches or pull requests

5 participants