occ preview:pre-generate doesn't stop after hours of execution #62
Depending on the number of pictures and the server hardware, this can easily take that long. Scanning my whole picture gallery also took several days on a Raspberry Pi 2, but it did finish on its own eventually. Could you verify that it scans the same picture/folder several times in one execution, by writing the verbose output to a log file and checking it after a while? If you use "pre-generate" instead of "generate-all" you could also just watch the last scanned folders/pictures, manually stop and restart the process, and see at which folder/file it continues. It should pick up at the same point instead of starting from the beginning.
Yesterday I used "pre-generate" and it scanned the same pictures more than once.
I'm having the same problem (Nextcloud 12.0.0) with a big image gallery (30,000 files), and I also observed the same files being processed over and over again. With some database research and a lot of debugging, here is what I found:

The database table oc_preview_generation (the one listing the files/directories to be processed during "occ preview:precreate") contains over 8000 entries. Each time I started "occ preview:precreate", the number of entries decreased only by 1 or 2 (even after hours/days!). Looking into this table, I found over 1000 entries pointing to my main user's root folder (containing the files mentioned above), plus some other duplicate file/directory references.

Looking at the code, "pre-generate" seems to just take the list from the database and process it sequentially, so I'm facing about 1000 iterations over all of my files. No wonder it takes that long and progress is hardly noticeable. I see two possible fixes for this (no idea which one is better):
Update: After removing the duplicate entries from the database, a final run took about 1 hour to generate the missing previews, and the table oc_preview_generation is now empty (as it should be). In addition to the "possible fixes" above, I'd like to add some more optimization options:
And (the other way around):
(For performance reasons it may be preferable to do these optimizations during "precreate", not during normal operation.)
@linuxrrze I see that the table oc_preview_generation has 3 fields: id, uid, file_id. So I guess you removed all entries that have identical uid and file_id? Could you give me the statement you used to do this? To the author: why not make (uid, file_id) unique? Something like "ALTER IGNORE TABLE …"?
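For illustration, a uniqueness constraint on (uid, file_id) might look like the sketch below. This is only a guess at what such a statement could be, assuming MySQL/MariaDB and the default oc_ table prefix; note that the ALTER IGNORE TABLE syntax was removed in MySQL 5.7, so on newer versions the existing duplicates would have to be deleted first.

```sql
-- Sketch only: enforce one queue entry per (uid, file_id).
-- Existing duplicate rows must be removed before this succeeds.
ALTER TABLE oc_preview_generation
  ADD UNIQUE INDEX uniq_uid_file_id (uid, file_id);
```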
@bavay: Yes, that's what I did. However, I'm not a database expert, so I had no idea how to achieve this in SQL only. I wrote a few lines of Python code that read the complete oc_preview_generation table from the Postgres database, sorted the returned entries and then removed all duplicate entries one by one. I also had only one relevant user, so there was no need to check the uid. I can provide the Python code if that helps, but as I mentioned, it's Postgres-specific.
@linuxrrze: I ended up doing the cleanup manually (there were many duplicates of the same ~10 files, so it was not too painful). However, the whole "re-generate" process is turning into a nightmare: it takes forever, and there always seems to be something happening before this table gets cleaned up. For example, I get "1205 Lock wait timeout exceeded" on MySQL after a few hours, so I have to restart the whole process, and I still don't know which files I could remove from MySQL and which ones have not yet been processed; it moves a little bit further before stopping again for the same reason or another one. I'm just dreaming of a shell script that would generate the proper previews with the proper naming, but much faster...
I just searched a little more, and I guess the following SQL should do the trick. List duplicates:
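A sketch of what such a query could look like, assuming the id, uid and file_id columns described above:

```sql
-- Sketch: show (uid, file_id) pairs that appear more than once in the queue.
SELECT uid, file_id, COUNT(*) AS occurrences
FROM oc_preview_generation
GROUP BY uid, file_id
HAVING COUNT(*) > 1;
```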
Delete duplicates:
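And a sketch of a matching delete, keeping the row with the lowest id per (uid, file_id). This uses MySQL's multi-table DELETE syntax, so it would need to be adapted for Postgres:

```sql
-- Sketch: remove every queue entry that has an older twin with the same
-- uid and file_id, keeping only the row with the smallest id.
DELETE p_new
FROM oc_preview_generation AS p_new
JOIN oc_preview_generation AS p_old
  ON  p_new.uid     = p_old.uid
  AND p_new.file_id = p_old.file_id
  AND p_new.id      > p_old.id;
```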
I'm currently "clean", so I could not verify if this works as expected ... |
By "precreate" or "re-generate" you guys actually mean "occ preview:pre-generate", right? Or did I miss some hidden additional command? 😄 Did you run the command in verbose mode Would be nice by the way, if someone could verify that the timestamps are always in UTC and hopefully someone can even answer why this is the case, as the used |
Yes, the command that took virtually forever was occ preview:pre-generate. I did log the output of the command. From analyzing these logs, I found the same files being processed over and over again. There was no specific file taking long; it was the total amount of files (multiplied by the number of iterations). As mentioned above, removing the duplicate entries from the database table oc_preview_generation resolved it.
Hmm, I am just wondering, because if previews were already generated for a file and it is processed again, generation just skips over that file quite fast (compared to when the previews really need to be generated/are not already there). But yeah, if it's about … Development and handling of pull requests paused for a while; I hope rullzer finds time to go through this topic soon 😃.
After running for a few days, the pre-generation of previews crashed with the message below (I had applied the patches of @MichaIng to my version of the code). I guess this happened when I went into Nextcloud and opened the gallery for a folder with not-yet-generated previews (so preview generation got called on top of the background preview generation). After a few minutes, I noticed that the hard drive was not spinning anymore, so I guess that is when the background job crashed.

Exception trace: [Doctrine\DBAL\Driver\PDOException (HY000)] …
@bavay the solution for you is just to re-run the pre-generate command. Nothing should be broken, and it can simply continue where it stopped. But I never observed errors when opening pictures and folders in the web UI while running pre-generate/generate-all, even with missing previews or while previews for exactly that folder were being pre-generated; I went through this several times ;). It would also be bad if this generally produced errors, because pre-generate is meant to run as a cron job quite often.
I am also not sure whether there is some sort of queue for the generation command itself, or whether several generations (by command + by web UI access) can run in parallel, in which case they might disturb each other in terms of system resources. Maybe @rullzer can have a look here?
I'll have a look soonish (tm). But as you all probably guessed, I'm busy. There are some useful suggestions in this thread. The best explanation I have for most of the shortcomings is that this app was done rather quickly, and it runs perfectly on my own hardware, so I probably don't run into most of your issues.

I think it is a bug to add the root folder to the list, as that will always trigger a complete rescan. I'll look into that.

As for some of the crashes: yes, that can happen, also due to faulty images etc. That is why this app can just be started again and it will continue. And if the preview generation has not caught up yet, we will just do on-demand preview generation. As pointed out, if a preview has already been generated and we request it again, this should be very quick.

So, long story short: I'll give this some more love once I find some time.
Another piece of the puzzle: I realized that my oc_file_locks table was huge (thousands of entries). After deleting its contents (in maintenance mode, see https://help.nextcloud.com/t/file-is-locked-how-to-unlock/1883), the whole thing is much faster. This has nothing to do with the preview generator (well, except if it left locks behind), but it was part of the solution anyway...
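For reference, the cleanup described in the linked thread is roughly the sketch below (assuming direct database access and the default oc_ table prefix); it should only be run while the instance is in maintenance mode.

```sql
-- Sketch: clear all stale file locks while Nextcloud is in maintenance mode
-- (occ maintenance:mode --on beforehand, --off afterwards).
DELETE FROM oc_file_locks;
```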
Hi,
after two days of the process running for hours, I started the command with the verbose flag:
occ preview:pre-generate --force -vvv
It doesn't stop and seems to analyze all the images continuously, in a loop.